logical replication restrictions
One thing that is needed and not yet solved is delayed replication for logical
replication. Would it be worth documenting this on the Restrictions page?
regards,
Marcos
What do you mean by delayed replication? Is it that by default we send
the transactions at commit?
--
With Regards,
Amit Kapila.
No, I'm talking about the setting you can configure on standby servers:
recovery_min_apply_delay = '8h'
Regards,
Oh okay, I think this can be useful in some cases where we want to avoid
data loss, similar to its use for a physical standby. For example, if the user
has by mistake truncated a table (or deleted some required data) on the
publisher, we can always recover it from the subscriber if we have such a feature.
Having said that, I am not sure we can call it a restriction. It is more
of a TODO kind of thing. It doesn't sound advisable to me to keep growing
the current Restrictions page [1].
[1]: https://wiki.postgresql.org/wiki/Todo
[2]: https://www.postgresql.org/docs/devel/logical-replication-restrictions.html
--
With Regards,
Amit Kapila.
OK, so could you point me to where to start on this feature?
regards,
Marcos
It is a new feature. pglogical supports it, and it is useful for a delayed
secondary server and when, for some business reason, you have to delay the
point at which data becomes available. There might be other use cases, but
these are the ones I regularly hear from customers.
BTW, I have a WIP patch for this feature. I haven't had enough time to post it
because it lacks documentation and tests. I'm planning to do it as soon as this
CF ends.
--
Euler Taveira
EDB https://www.enterprisedb.com/
Fine, let me know if you need any help, testing, for example.
What kind of reasons do you see where users prefer to delay except to
avoid data loss in the case where users unintentionally removed some
data from the primary?
--
With Regards,
Amit Kapila.
Debugging. Suppose I have a problem that occurs only once a week or once a
month. When it occurs again, a monitoring system sends me a message: hey,
that problem occurred again. Because I configured my replica with
Delay = '30 min', I have time to connect to it, watch the changes arrive
record by record, and see exactly what caused the problem.
One could argue that not having delayed apply *is* a restriction
compared to both physical replication and "the original upstream"
pg_logical.
I think therefore it should be mentioned in "Restrictions" so people
considering moving from physical streaming to pg_logical or just
trying to decide whether to use pg_logical are warned.
Also, the Restrictions page starts with "These might be addressed in
future releases.", so being a restriction and being a TODO item are not
mutually exclusive.
-----
Hannu Krosing
Google Cloud - We have a long list of planned contributions and we are hiring.
Contact me if interested.
Long time, no patch. Here it is. I will provide documentation in the next
version. I would appreciate some feedback.
--
Euler Taveira
EDB https://www.enterprisedb.com/
Attachments:
v1-0001-Time-delayed-logical-replication-subscriber.patch (text/x-patch)
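Editorial sketch (not part of the attachment): the timing rule the patch describes, "delay the transaction commit for apply_delay milliseconds", can be illustrated in standalone C. This is only a sketch under stated assumptions. The names timestamp_us, interval_sketch, interval_sketch_to_ms, apply_not_before and remaining_wait_ms are hypothetical and do not appear in the patch. Timestamps are assumed to be microsecond counts, and the interval conversion assumes the 30-day-month, 24-hour-day convention that PostgreSQL uses when comparing intervals. Rounding the remaining wait up to a whole millisecond is a choice made for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a microsecond timestamp; not a PostgreSQL type. */
typedef int64_t timestamp_us;

#define USECS_PER_MSEC INT64_C(1000)
#define USECS_PER_DAY  INT64_C(86400000000)

/* Hypothetical, simplified interval: months, days, and a microsecond part. */
typedef struct
{
    int32_t months;
    int32_t days;
    int64_t usecs;
} interval_sketch;

/*
 * Convert an interval to milliseconds, assuming 30 days per month and
 * 24 hours per day. The result is truncated toward zero.
 */
int64_t
interval_sketch_to_ms(interval_sketch iv)
{
    int64_t days = (int64_t) iv.months * 30 + iv.days;

    return (days * USECS_PER_DAY + iv.usecs) / USECS_PER_MSEC;
}

/* Earliest time at which a transaction committed at commit_time may be applied. */
timestamp_us
apply_not_before(timestamp_us commit_time, int64_t delay_ms)
{
    assert(delay_ms >= 0);      /* a negative delay is rejected up front */
    return commit_time + delay_ms * USECS_PER_MSEC;
}

/* Milliseconds still to wait at time "now"; 0 if the transaction is already due. */
int64_t
remaining_wait_ms(timestamp_us now, timestamp_us commit_time, int64_t delay_ms)
{
    timestamp_us until = apply_not_before(commit_time, delay_ms);

    if (now >= until)
        return 0;

    /* Round up so the caller never wakes before the deadline. */
    return (until - now + USECS_PER_MSEC - 1) / USECS_PER_MSEC;
}
```

A caller would convert the configured interval once, compute the deadline when a transaction's commit time becomes known, and then sleep for remaining_wait_ms() in a loop until it returns 0, re-checking after every wakeup.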
From 9635fec1a031b82ec5d67cdfe16aa1f553ffa936 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Sat, 6 Nov 2021 11:31:10 -0300
Subject: [PATCH v1] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data is
useful for some scenarios (specially to fix errors that might cause data
loss).
If the subscriber sets apply_delay parameter, the logical replication
worker will delay the transaction commit for apply_delay milliseconds.
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 3 +-
src/backend/commands/subscriptioncmds.c | 44 +++++++++-
src/backend/replication/logical/worker.c | 48 +++++++++++
src/backend/utils/adt/timestamp.c | 8 ++
src/bin/pg_dump/pg_dump.c | 16 +++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 8 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/utils/timestamp.h | 2 +
src/test/regress/expected/subscription.out | 96 +++++++++++-----------
src/test/subscription/t/029_apply_delay.pl | 71 ++++++++++++++++
12 files changed, 248 insertions(+), 53 deletions(-)
create mode 100644 src/test/subscription/t/029_apply_delay.pl
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index ca65a8bd20..0788384579 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -69,6 +69,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->binary = subform->subbinary;
sub->stream = subform->substream;
sub->twophasestate = subform->subtwophasestate;
+ sub->applydelay = subform->subapplydelay;
/* Get conninfo */
datum = SysCacheGetAttr(SUBSCRIPTIONOID,
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 3cb69b1f87..1cc0d86f2e 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1261,7 +1261,8 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
- substream, subtwophasestate, subslotname, subsynccommit, subpublications)
+ substream, subtwophasestate, subslotname, subsynccommit,
+ subapplydelay, subpublications)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_workers AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 3ef6607d24..19916f04a8 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -46,6 +46,7 @@
#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/syscache.h"
+#include "utils/timestamp.h"
/*
* Options that can be specified by the user in CREATE/ALTER SUBSCRIPTION
@@ -61,6 +62,7 @@
#define SUBOPT_BINARY 0x00000080
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
+#define SUBOPT_APPLY_DELAY 0x00000400
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -82,6 +84,7 @@ typedef struct SubOpts
bool binary;
bool streaming;
bool twophase;
+ int64 apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -249,12 +252,34 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_TWOPHASE_COMMIT;
opts->twophase = defGetBoolean(defel);
}
+ else if (strcmp(defel->defname, "apply_delay") == 0)
+ {
+ char *val;
+ Interval *interval;
+
+ if (IsSet(opts->specified_opts, SUBOPT_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_APPLY_DELAY;
+ val = defGetString(defel);
+
+ interval = DatumGetIntervalP(DirectFunctionCall3(interval_in,
+ CStringGetDatum(val),
+ ObjectIdGetDatum(InvalidOid),
+ Int32GetDatum(-1)));
+ opts->apply_delay = interval_to_ms(interval);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("unrecognized subscription parameter: \"%s\"", defel->defname)));
}
+ if (opts->apply_delay < 0)
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("option \"%s\" must not be negative", "apply_delay"));
+
/*
* We've been explicitly asked to not connect, that requires some
* additional processing.
@@ -390,7 +415,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
supported_opts = (SUBOPT_CONNECT | SUBOPT_ENABLED | SUBOPT_CREATE_SLOT |
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
- SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT);
+ SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
+ SUBOPT_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -464,6 +490,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
CharGetDatum(opts.twophase ?
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ values[Anum_pg_subscription_subapplydelay - 1] = Int64GetDatum(opts.apply_delay);
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -913,6 +940,13 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_substream - 1] = true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_APPLY_DELAY))
+ {
+ values[Anum_pg_subscription_subapplydelay - 1] =
+ Int64GetDatum(opts.apply_delay);
+ replaces[Anum_pg_subscription_subapplydelay - 1] = true;
+ }
+
update_tuple = true;
break;
}
@@ -935,6 +969,14 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (opts.enabled)
ApplyLauncherWakeupAtCommit();
+ /*
+ * If this subscription has been disabled and it has an apply
+ * delay set, wake up the logical replication worker to finish
+ * it as soon as possible.
+ */
+ if (!opts.enabled && sub->applydelay > 0)
+ logicalrep_worker_wakeup(sub->oid, InvalidOid);
+
update_tuple = true;
break;
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 5d9acc6173..39231d464e 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -248,6 +248,7 @@ WalReceiverConn *LogRepWorkerWalRcvConn = NULL;
Subscription *MySubscription = NULL;
bool MySubscriptionValid = false;
+TimestampTz MySubscriptionApplyDelayUntil = 0;
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
@@ -303,6 +304,8 @@ static void store_flush_position(XLogRecPtr remote_lsn);
static void maybe_reread_subscription(void);
+static void apply_delay(void);
+
/* prototype needed because of stream_commit */
static void apply_dispatch(StringInfo s);
@@ -373,6 +376,9 @@ begin_replication_step(void)
{
SetCurrentStatementStartTimestamp();
+ /* delay the current transaction? */
+ apply_delay();
+
if (!IsTransactionState())
{
StartTransactionCommand();
@@ -778,6 +784,43 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+static void
+apply_delay(void)
+{
+ /* nothing to do if no delay set */
+ if (MySubscription->applydelay <= 0)
+ return;
+
+ while (true)
+ {
+ int diffms;
+
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), MySubscriptionApplyDelayUntil);
+
+ elog(DEBUG2, "logical replication apply delay: %u ms", diffms);
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+ }
+
+ /*
+ * Delay applied. Reset state.
+ */
+ MySubscriptionApplyDelayUntil = 0;
+}
+
/*
* Handle BEGIN message.
*/
@@ -789,6 +832,11 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.committime);
+ /* set apply delay */
+ if (MySubscription->applydelay > 0)
+ MySubscriptionApplyDelayUntil = TimestampTzPlusMilliseconds(TimestampTzGetDatum(begin_data.committime),
+ MySubscription->applydelay);
+
remote_final_lsn = begin_data.final_lsn;
in_remote_transaction = true;
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index ae36ff3328..1eda3ed57c 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -2379,6 +2379,14 @@ interval_cmp_internal(Interval *interval1, Interval *interval2)
return int128_compare(span1, span2);
}
+int64
+interval_to_ms(const Interval *interval)
+{
+ INT128 span = interval_cmp_value(interval) / 1000;
+
+ return span;
+}
+
Datum
interval_eq(PG_FUNCTION_ARGS)
{
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index e69dcf8a48..08fc4068cb 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4298,6 +4298,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subapplydelay;
int i,
ntups;
@@ -4340,12 +4341,17 @@ getSubscriptions(Archive *fout)
appendPQExpBufferStr(query, " false AS substream,\n");
if (fout->remoteVersion >= 150000)
- appendPQExpBufferStr(query, " s.subtwophasestate\n");
+ appendPQExpBufferStr(query, " s.subtwophasestate,\n");
else
appendPQExpBuffer(query,
- " '%c' AS subtwophasestate\n",
+ " '%c' AS subtwophasestate,\n",
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ if (fout->remoteVersion >= 150000)
+ appendPQExpBufferStr(query, " s.subapplydelay\n");
+ else
+ appendPQExpBufferStr(query, " 0 AS subapplydelay\n");
+
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
"WHERE s.subdbid = (SELECT oid FROM pg_database\n"
@@ -4366,6 +4372,7 @@ getSubscriptions(Archive *fout)
i_subbinary = PQfnumber(res, "subbinary");
i_substream = PQfnumber(res, "substream");
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
+ i_subapplydelay = PQfnumber(res, "subapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4393,6 +4400,8 @@ getSubscriptions(Archive *fout)
pg_strdup(PQgetvalue(res, i, i_substream));
subinfo[i].subtwophasestate =
pg_strdup(PQgetvalue(res, i, i_subtwophasestate));
+ subinfo[i].subapplydelay =
+ strtoi64(PQgetvalue(res, i, i_subapplydelay), NULL, 10);
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4466,6 +4475,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subapplydelay > 0)
+ appendPQExpBuffer(query, ", apply_delay = '" INT64_FORMAT " ms'", subinfo->subapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 997a3b6071..d3d7ae4587 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -658,6 +658,7 @@ typedef struct _SubscriptionInfo
char *substream;
char *subtwophasestate;
char *subsynccommit;
+ int64 subapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index e3382933d9..99ec85db2d 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6084,7 +6084,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false};
+ false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6124,6 +6124,12 @@ describeSubscriptions(const char *pattern, bool verbose)
", subtwophasestate AS \"%s\"\n",
gettext_noop("Two phase commit"));
+ /* apply_delay is only supported in v15 and higher */
+ if (pset.sversion >= 150000)
+ appendPQExpBuffer(&buf,
+ ", subapplydelay AS \"%s\"\n",
+ gettext_noop("Apply delay"));
+
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
", subconninfo AS \"%s\"\n",
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 18c291289f..57d8472b6e 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -67,6 +67,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
char subtwophasestate; /* Stream two-phase transactions */
+ int64 subapplydelay; /* Replication apply delay */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -103,6 +105,7 @@ typedef struct Subscription
* binary format */
bool stream; /* Allow streaming in-progress transactions. */
char twophasestate; /* Allow streaming two-phase transactions */
+ int64 applydelay; /* Replication apply delay */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index c1a74f8e2b..58a6e6b6da 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -78,6 +78,8 @@ extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern int64 interval_to_ms(const Interval *interval);
+
extern TimestampTz time_t_to_timestamptz(pg_time_t tm);
extern pg_time_t timestamptz_to_time_t(TimestampTz t);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 80aae83562..bb6866d160 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -76,10 +76,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Apply delay | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+-------------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -94,10 +94,10 @@ ERROR: subscription "regress_doesnotexist" does not exist
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
ERROR: unrecognized subscription parameter: "create_slot"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+--------------------+------------------------------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | off | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Apply delay | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+-------------+--------------------+------------------------------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | 0 | off | dbname=regress_doesnotexist2
(1 row)
BEGIN;
@@ -129,10 +129,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+--------------------+------------------------------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | local | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Apply delay | Synchronous commit | Conninfo
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+-------------+--------------------+------------------------------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | 0 | local | dbname=regress_doesnotexist2
(1 row)
-- rename back to keep the rest simple
@@ -165,19 +165,19 @@ ERROR: binary requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, binary = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Apply delay | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+-------------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Apply delay | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+-------------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -188,19 +188,19 @@ ERROR: streaming requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Apply delay | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+-------------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Apply delay | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+-------------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
-- fail - publication already exists
@@ -215,10 +215,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Apply delay | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+-------------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
-- fail - publication used more then once
@@ -233,10 +233,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Apply delay | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+-------------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | 0 | off | dbname=regress_doesnotexist
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -270,10 +270,10 @@ ERROR: two_phase requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Apply delay | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+-------------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | 0 | off | dbname=regress_doesnotexist
(1 row)
--fail - alter of two_phase option not supported.
@@ -282,10 +282,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Apply delay | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+-------------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -294,10 +294,10 @@ DROP SUBSCRIPTION regress_testsub;
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Apply delay | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+-------------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/subscription/t/029_apply_delay.pl b/src/test/subscription/t/029_apply_delay.pl
new file mode 100644
index 0000000000..bf9bf2b22d
--- /dev/null
+++ b/src/test/subscription/t/029_apply_delay.pl
@@ -0,0 +1,71 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Test replication apply delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 3;
+
+# Create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->append_conf('postgresql.conf',
+ "log_min_messages = debug2");
+$node_subscriber->start;
+
+# Create some preexisting content on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b varchar)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (1, 'foo'), (2, 'bar')");
+
+# Setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b text, c timestamptz DEFAULT now(), d bigint DEFAULT 999)"
+);
+
+# Setup logical replication
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE test_tab");
+
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (apply_delay = '10s')"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# Also wait for initial table sync to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(2|1|2), 'check initial data was copied to subscriber');
+
+# new row to trigger apply delay
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (3, 'baz')");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(3|1|3), 'check the new row was applied to subscriber');
+
+my $logfile = slurp_file($node_subscriber->logfile());
+ok( $logfile =~
+ qr/logical replication apply delay/,
+ 'check if replication apply delay is triggered');
--
2.30.2
On Tuesday, March 1, 2022 9:19 AM Euler Taveira <euler@eulerto.com> wrote:
Long time, no patch. Here it is. I will provide documentation in the next
version. I would appreciate some feedback.
Hi, thank you for posting the patch!
$ git am v1-0001-Time-delayed-logical-replication-subscriber.patch
Applying: Time-delayed logical replication subscriber
error: patch failed: src/backend/catalog/system_views.sql:1261
error: src/backend/catalog/system_views.sql: patch does not apply
FYI, a recent commit (7a85073) redesigned pg_stat_subscription_workers on HEAD,
so the change below can't be applied. Could you please rebase v1?
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 3cb69b1f87..1cc0d86f2e 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1261,7 +1261,8 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
- substream, subtwophasestate, subslotname, subsynccommit, subpublications)
+ substream, subtwophasestate, subslotname, subsynccommit,
+ subapplydelay, subpublications)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_workers AS
Best Regards,
Takamichi Osumi
On Tue, Mar 1, 2022, at 3:27 AM, osumi.takamichi@fujitsu.com wrote:
$ git am v1-0001-Time-delayed-logical-replication-subscriber.patch
I generally use -3 to fall back on 3-way merge. Doesn't it work for you?
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Wednesday, March 2, 2022 8:54 AM Euler Taveira <euler@eulerto.com> wrote:
On Tue, Mar 1, 2022, at 3:27 AM, osumi.takamichi@fujitsu.com wrote:
$ git am v1-0001-Time-delayed-logical-replication-subscriber.patch
I generally use -3 to fall back on 3-way merge. Doesn't it work for you?
It did. Sorry for the noise.
Best Regards,
Takamichi Osumi
On Mon, Feb 28, 2022, at 9:18 PM, Euler Taveira wrote:
Long time, no patch. Here it is. I will provide documentation in the next
version. I would appreciate some feedback.
This patch has been broken since commit 705e20f8550c0e8e47c0b6b20b5f5ffd6ffd9e33. I
rebased it.
I added documentation that explains how this parameter works. I decided to
rename the parameter from apply_delay to min_apply_delay to use the same
terminology as physical replication. IMO the new name makes it clear that
there isn't a guarantee that we are always x ms behind the publisher. Indeed,
due to processing/transferring the delay might be higher than the specified
interval.
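To make the "minimum" semantics concrete, here is a rough sketch (not code from the patch). The helper name remaining_wait_ms and the use of plain millisecond counters are assumptions made for illustration only; the patch itself works with TimestampTz values, and the sketch assumes both clocks are in sync.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: how long a subscriber would
 * still have to wait before applying a transaction.
 *
 * All values are milliseconds since an arbitrary epoch. The publisher and
 * subscriber clocks are assumed to be synchronized.
 */
static int64_t
remaining_wait_ms(int64_t commit_ms, int64_t now_ms, int64_t min_apply_delay_ms)
{
    int64_t apply_at;

    /* The patch rejects negative delays at option-parsing time. */
    assert(min_apply_delay_ms >= 0);

    /* Earliest moment at which the transaction may be applied. */
    apply_at = commit_ms + min_apply_delay_ms;

    /*
     * If decoding/transfer already took longer than the configured delay,
     * nothing is added: the lag is then larger than min_apply_delay.
     */
    return (now_ms < apply_at) ? apply_at - now_ms : 0;
}
```

With a 10 s delay, a transaction committed at t=1000 ms and received at t=1200 ms waits another 9800 ms; the same transaction received at t=13000 ms is applied immediately, already 12 s behind the publisher.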
I refactored the way the delay is applied. The previous patch only covered
regular transactions. This new one also covers prepared transactions. The
current design intercepts the transaction during the first change (at the time
it will start the transaction to apply the changes) and applies the delay
before effectively starting the transaction. The previous patch used
begin_replication_step() as this point. However, to support prepared
transactions I changed the apply_delay signature to accept a timestamp
parameter (because we use another variable to calculate the delay for prepared
transactions -- prepare_time). Hence, the apply_delay() call moved to other
places -- apply_handle_begin and apply_handle_begin_prepare().
The new code does not apply the delay in 2 situations:
* STREAM START: streamed transactions might not have commit_time or
prepare_time set. I'm afraid it is not possible to use these variables
because at STREAM START time we don't have a transaction commit time. The
protocol could provide a timestamp that indicates when it starts streaming
the transaction, and then we could use it to apply the delay. Unfortunately,
we don't have it. Having said that, this new patch does not apply a delay to
streamed transactions.
* non-transaction messages: the delay could be applied to non-transaction
messages too. They are sent independently of the transaction that contains
them. Since logical replication does not send messages to the subscriber,
this is not an issue. However, consumers that use pgoutput and want to
implement a delay will require it.
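The cases above can be summarized in a small sketch. This is only an illustration under stated assumptions: the MsgKind enum, the Msg struct and delay_reference_time() are hypothetical names that do not exist in the patch or in PostgreSQL; they just mirror which timestamp, if any, is available for each kind of message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message kinds; the names are illustrative only. */
typedef enum
{
    MSG_BEGIN,          /* regular transaction */
    MSG_BEGIN_PREPARE,  /* prepared (two-phase) transaction */
    MSG_STREAM_START,   /* streamed, still in-progress transaction */
    MSG_MESSAGE         /* non-transaction logical message */
} MsgKind;

typedef struct
{
    MsgKind kind;
    int64_t commit_time;   /* meaningful for MSG_BEGIN only */
    int64_t prepare_time;  /* meaningful for MSG_BEGIN_PREPARE only */
} Msg;

/*
 * Return true and store the timestamp the delay is measured from when the
 * message can be delayed; return false when no usable timestamp exists.
 */
static bool
delay_reference_time(const Msg *msg, int64_t *ts)
{
    assert(msg != NULL && ts != NULL);

    switch (msg->kind)
    {
        case MSG_BEGIN:
            *ts = msg->commit_time;
            return true;
        case MSG_BEGIN_PREPARE:
            /* commit time is unknown at prepare time */
            *ts = msg->prepare_time;
            return true;
        case MSG_STREAM_START:
        case MSG_MESSAGE:
            /* no timestamp available: applied without delay */
            return false;
    }
    return false;
}
```

A caller would compute the wait from the returned timestamp for BEGIN and BEGIN PREPARE, and skip the wait entirely for the other two kinds.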
I'm still looking for a way to support streamed transactions without much
surgery into the logical replication protocol.
--
Euler Taveira
EDB https://www.enterprisedb.com/
Attachments:
v2-0001-Time-delayed-logical-replication-subscriber.patch (text/x-patch)
From 9ad783c1259aed9bef81877265c648aae84ed5d8 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Sat, 6 Nov 2021 11:31:10 -0300
Subject: [PATCH v2] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (especially to fix
errors that might cause data loss).
If the subscriber sets min_apply_delay parameter, the logical
replication worker will delay the transaction commit for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs only on WAL records for transaction begins. Regular and
prepared transactions are covered. Streamed transactions are not
delayed. It should be implemented in future releases.
Author: Euler Taveira
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 33 +++++
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 4 +-
src/backend/commands/subscriptioncmds.c | 46 ++++++-
src/backend/replication/logical/worker.c | 61 +++++++++
src/backend/utils/adt/timestamp.c | 8 ++
src/bin/pg_dump/pg_dump.c | 13 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 11 +-
src/include/catalog/pg_subscription.h | 2 +
src/include/utils/timestamp.h | 2 +
src/test/regress/expected/subscription.out | 141 +++++++++++++--------
src/test/regress/sql/subscription.sql | 20 +++
src/test/subscription/t/030_apply_delay.pl | 76 +++++++++++
15 files changed, 358 insertions(+), 66 deletions(-)
create mode 100644 src/test/subscription/t/030_apply_delay.pl
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 58b78a94ea..b764f8eba8 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -204,8 +204,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
information. The parameters that can be altered
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
- <literal>binary</literal>, <literal>streaming</literal>, and
- <literal>disable_on_error</literal>.
+ <literal>binary</literal>, <literal>streaming</literal>,
+ <literal>disable_on_error</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index b701752fc9..2bc8deb132 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -302,6 +302,39 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, subscriber applies changes as soon as possible. Similar
+ to the physical replication feature
+ (<xref linkend="guc-recovery-min-apply-delay"/>), it may be useful to
+ have a time-delayed copy of data for logical replication. This
+ parameter allows you to delay the application of changes by a
+ specified amount of time. If this value is specified without units,
+ it is taken as milliseconds. The default is zero, adding no delay.
+ </para>
+ <para>
+ The delay occurs only after the initial table synchronization. It is
+ possible that the replication delay between publisher and subscriber
+ exceeds the value of this parameter, in which case no delay is added.
+ Note that the delay is calculated between the WAL time stamp as
+ written on publisher and the current time on the subscriber. Delays
+ in logical decoding and in transferring the transaction may reduce the
+ actual wait time. If the system clocks on publisher and subscriber
+ are not synchronized, this may lead to applying changes earlier than
+ expected. This is not a major issue because typical settings of this
+ parameter are much larger than typical time deviations between
+ servers.
+ </para>
+ <para>
+ The delay occurs only on WAL records for transaction begins. Streamed
+ transactions do not impose a delay. It should be implemented in future
+ releases.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</para>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a6304f5f81..48e6d0af70 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -70,6 +70,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->stream = subform->substream;
sub->twophasestate = subform->subtwophasestate;
sub->disableonerr = subform->subdisableonerr;
+ sub->applydelay = subform->subapplydelay;
/* Get conninfo */
datum = SysCacheGetAttr(SUBSCRIPTIONOID,
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index bb1ac30cd1..d7c018b81f 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1261,8 +1261,8 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
- substream, subtwophasestate, subdisableonerr, subslotname,
- subsynccommit, subpublications)
+ substream, subtwophasestate, subdisableonerr, subapplydelay,
+ subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 3922658bbc..4b9890e6e6 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -46,6 +46,7 @@
#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/syscache.h"
+#include "utils/timestamp.h"
/*
* Options that can be specified by the user in CREATE/ALTER SUBSCRIPTION
@@ -62,6 +63,7 @@
#define SUBOPT_STREAMING 0x00000100
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
#define SUBOPT_DISABLE_ON_ERR 0x00000400
+#define SUBOPT_MIN_APPLY_DELAY 0x00000800
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -84,6 +86,7 @@ typedef struct SubOpts
bool streaming;
bool twophase;
bool disableonerr;
+ int64 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -262,12 +265,35 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_DISABLE_ON_ERR;
opts->disableonerr = defGetBoolean(defel);
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ char *val;
+ Interval *interval;
+
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ val = defGetString(defel);
+
+ interval = DatumGetIntervalP(DirectFunctionCall3(interval_in,
+ CStringGetDatum(val),
+ ObjectIdGetDatum(InvalidOid),
+ Int32GetDatum(-1)));
+ opts->min_apply_delay = interval_to_ms(interval);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("unrecognized subscription parameter: \"%s\"", defel->defname)));
}
+ if (opts->min_apply_delay < 0)
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("option \"%s\" must not be negative", "min_apply_delay"));
+
/*
* We've been explicitly asked to not connect, that requires some
* additional processing.
@@ -404,7 +430,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -479,6 +505,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
LOGICALREP_TWOPHASE_STATE_PENDING :
LOGICALREP_TWOPHASE_STATE_DISABLED);
values[Anum_pg_subscription_subdisableonerr - 1] = BoolGetDatum(opts.disableonerr);
+ values[Anum_pg_subscription_subapplydelay - 1] = Int64GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -935,6 +962,12 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_subdisableonerr - 1]
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ values[Anum_pg_subscription_subapplydelay - 1] =
+ Int64GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subapplydelay - 1] = true;
+ }
update_tuple = true;
break;
@@ -958,6 +991,17 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (opts.enabled)
ApplyLauncherWakeupAtCommit();
+ /*
+ * If this subscription has been disabled and it has an apply
+ * delay set, wake up the logical replication worker to finish
+ * it as soon as possible.
+ */
+ if (!opts.enabled && sub->applydelay > 0)
+ {
+ elog(DEBUG1, "subscription has been disabled");
+ logicalrep_worker_wakeup(sub->oid, InvalidOid);
+ }
+
update_tuple = true;
break;
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 03e069c7cd..241810c62d 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -250,6 +250,7 @@ WalReceiverConn *LogRepWorkerWalRcvConn = NULL;
Subscription *MySubscription = NULL;
bool MySubscriptionValid = false;
+TimestampTz MySubscriptionMinApplyDelayUntil = 0;
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
@@ -307,6 +308,8 @@ static void maybe_reread_subscription(void);
static void DisableSubscriptionAndExit(void);
+static void apply_delay(TimestampTz ts);
+
/* prototype needed because of stream_commit */
static void apply_dispatch(StringInfo s);
@@ -782,6 +785,58 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * Apply the informed delay for the transaction.
+ *
+ * A regular transaction uses the commit time to calculate the delay. A
+ * prepared transaction uses the prepare time to calculate the delay (the
+ * commit time is unknown at prepare time).
+ *
+ * FIXME A streamed transaction could be delayed too but there is no timestamp
+ * in the STREAM START protocol message. Hence, if it is a streamed (regular or
+ * prepared) transaction, no delay is applied.
+ */
+static void
+apply_delay(TimestampTz ts)
+{
+ /* nothing to do if no delay set */
+ if (MySubscription->applydelay <= 0)
+ return;
+
+ /* set apply delay */
+ MySubscriptionMinApplyDelayUntil = TimestampTzPlusMilliseconds(TimestampTzGetDatum(ts),
+ MySubscription->applydelay);
+
+ while (true)
+ {
+ int diffms;
+
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), MySubscriptionMinApplyDelayUntil);
+
+ elog(DEBUG2, "logical replication apply delay: %u ms", diffms);
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+ }
+
+ /*
+ * Delay applied. Reset state.
+ */
+ MySubscriptionMinApplyDelayUntil = 0;
+}
+
/*
* Handle BEGIN message.
*/
@@ -793,6 +848,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ apply_delay(begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
in_remote_transaction = true;
@@ -845,6 +903,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ apply_delay(begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
in_remote_transaction = true;
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index ae36ff3328..1eda3ed57c 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -2379,6 +2379,14 @@ interval_cmp_internal(Interval *interval1, Interval *interval2)
return int128_compare(span1, span2);
}
+int64
+interval_to_ms(const Interval *interval)
+{
+ INT128 span = interval_cmp_value(interval) / 1000;
+
+ return span;
+}
+
Datum
interval_eq(PG_FUNCTION_ARGS)
{
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 725cd2e4eb..c1cc5476cc 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4325,6 +4325,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subapplydelay;
int i,
ntups;
@@ -4369,11 +4370,13 @@ getSubscriptions(Archive *fout)
if (fout->remoteVersion >= 150000)
appendPQExpBufferStr(query,
" s.subtwophasestate,\n"
- " s.subdisableonerr\n");
+ " s.subdisableonerr,\n"
+ " s.subapplydelay\n");
else
appendPQExpBuffer(query,
" '%c' AS subtwophasestate,\n"
- " false AS subdisableonerr\n",
+ " false AS subdisableonerr,\n"
+ " 0 AS subapplydelay\n",
LOGICALREP_TWOPHASE_STATE_DISABLED);
appendPQExpBufferStr(query,
@@ -4397,6 +4400,7 @@ getSubscriptions(Archive *fout)
i_substream = PQfnumber(res, "substream");
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
+ i_subapplydelay = PQfnumber(res, "subapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4426,6 +4430,8 @@ getSubscriptions(Archive *fout)
pg_strdup(PQgetvalue(res, i, i_subtwophasestate));
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
+ subinfo[i].subapplydelay =
+ strtoi64(PQgetvalue(res, i, i_subapplydelay), NULL, 10);
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4502,6 +4508,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '" INT64_FORMAT " ms'", subinfo->subapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 772dc0cf7a..15a05227f9 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -659,6 +659,7 @@ typedef struct _SubscriptionInfo
char *subtwophasestate;
char *subdisableonerr;
char *subsynccommit;
+ int64 subapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 991bfc1546..cd387ec62c 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6139,13 +6139,18 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Binary"),
gettext_noop("Streaming"));
- /* Two_phase and disable_on_error are only supported in v15 and higher */
+ /*
+ * two_phase, disable_on_error and min_apply_delay are only supported
+ * in v15 and higher.
+ */
if (pset.sversion >= 150000)
appendPQExpBuffer(&buf,
", subtwophasestate AS \"%s\"\n"
- ", subdisableonerr AS \"%s\"\n",
+ ", subdisableonerr AS \"%s\"\n"
+ ", subapplydelay AS \"%s\"\n",
gettext_noop("Two phase commit"),
- gettext_noop("Disable on error"));
+ gettext_noop("Disable on error"),
+ gettext_noop("Apply delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index e2befaf351..e8840a7d14 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -69,6 +69,7 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
bool subdisableonerr; /* True if a worker error should cause the
* subscription to be disabled */
+ int64 subapplydelay; /* Replication apply delay */
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
@@ -109,6 +110,7 @@ typedef struct Subscription
bool disableonerr; /* Indicates if the subscription should be
* automatically disabled if a worker error
* occurs */
+ int64 applydelay; /* Replication apply delay */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index c1a74f8e2b..58a6e6b6da 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -78,6 +78,8 @@ extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern int64 interval_to_ms(const Interval *interval);
+
extern TimestampTz time_t_to_timestamptz(pg_time_t tm);
extern pg_time_t timestamptz_to_time_t(TimestampTz t);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index ad8003fae1..aa1fdc8a11 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -76,10 +76,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -94,10 +94,10 @@ ERROR: subscription "regress_doesnotexist" does not exist
ALTER SUBSCRIPTION regress_testsub SET (create_slot = false);
ERROR: unrecognized subscription parameter: "create_slot"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | off | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+-------------+--------------------+------------------------------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | 0 | off | dbname=regress_doesnotexist2
(1 row)
BEGIN;
@@ -129,10 +129,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | local | dbname=regress_doesnotexist2
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+-------------+--------------------+------------------------------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | 0 | local | dbname=regress_doesnotexist2
(1 row)
-- rename back to keep the rest simple
@@ -165,19 +165,19 @@ ERROR: binary requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, binary = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | 0 | off | dbname=regress_doesnotexist
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -188,19 +188,19 @@ ERROR: streaming requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | 0 | off | dbname=regress_doesnotexist
(1 row)
-- fail - publication already exists
@@ -215,10 +215,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | 0 | off | dbname=regress_doesnotexist
(1 row)
-- fail - publication used more then once
@@ -233,10 +233,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | 0 | off | dbname=regress_doesnotexist
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -270,10 +270,10 @@ ERROR: two_phase requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | 0 | off | dbname=regress_doesnotexist
(1 row)
--fail - alter of two_phase option not supported.
@@ -282,10 +282,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -294,10 +294,10 @@ DROP SUBSCRIPTION regress_testsub;
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -309,18 +309,47 @@ ERROR: disable_on_error requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, disable_on_error = false);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | 0 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | off | dbname=regress_doesnotexist
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | 0 | off | dbname=regress_doesnotexist
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid input syntax for type interval: "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: option "min_apply_delay" must not be negative
+-- success -- 123 ms
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | 123000 | off | dbname=regress_doesnotexist
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- success -- interval is converted into ms and stored as integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = '4h 27min 35s');
+WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | 16055000 | off | dbname=regress_doesnotexist
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index a7c15b1daf..9b282b8263 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -243,6 +243,26 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- success -- 123 ms
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
+-- success -- interval is converted into ms and stored as integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = '4h 27min 35s');
+
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/t/030_apply_delay.pl b/src/test/subscription/t/030_apply_delay.pl
new file mode 100644
index 0000000000..25433c45e9
--- /dev/null
+++ b/src/test/subscription/t/030_apply_delay.pl
@@ -0,0 +1,76 @@
+
+# Copyright (c) 2022, PostgreSQL Global Development Group
+
+# Test replication apply delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf('postgresql.conf',
+ "log_min_messages = debug2");
+$node_subscriber->start;
+
+# Create some preexisting content on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b varchar)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (1, 'foo'), (2, 'bar')");
+
+# Setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b text, c timestamptz DEFAULT now(), d bigint DEFAULT 999)"
+);
+
+# Setup logical replication
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE test_tab");
+
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (min_apply_delay = '2s')"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# Also wait for initial table sync to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(2|1|2), 'check initial data was copied to subscriber');
+
+# new row to trigger apply delay
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (3, 'baz')");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(3|1|3), 'check the new row was applied to subscriber');
+
+my $logfile = slurp_file($node_subscriber->logfile());
+ok( $logfile =~
+ qr/logical replication apply delay/,
+ 'check if replication apply delay is triggered');
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
--
2.30.2
On 2022-03-20 21:40:40 -0300, Euler Taveira wrote:
On Mon, Feb 28, 2022, at 9:18 PM, Euler Taveira wrote:
Long time, no patch. Here it is. I will provide documentation in the next
version. I would appreciate some feedback.
This patch is broken since commit 705e20f8550c0e8e47c0b6b20b5f5ffd6ffd9e33. I
rebased it.
This fails tests, specifically it seems psql crashes:
https://cirrus-ci.com/task/6592281292570624?logs=cores#L46
Marked as waiting-on-author.
Greetings,
Andres Freund
On Mon, Mar 21, 2022, at 10:04 PM, Andres Freund wrote:
On 2022-03-20 21:40:40 -0300, Euler Taveira wrote:
On Mon, Feb 28, 2022, at 9:18 PM, Euler Taveira wrote:
Long time, no patch. Here it is. I will provide documentation in the next
version. I would appreciate some feedback.
This patch is broken since commit 705e20f8550c0e8e47c0b6b20b5f5ffd6ffd9e33. I
rebased it.
This fails tests, specifically it seems psql crashes:
https://cirrus-ci.com/task/6592281292570624?logs=cores#L46
Yeah. I forgot to test this patch with cassert before sending it. :( I didn't
send a new patch because there is another issue (with int128) that I'm
currently reworking. I'll send another patch soon.
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Mon, Mar 21, 2022, at 10:09 PM, Euler Taveira wrote:
On Mon, Mar 21, 2022, at 10:04 PM, Andres Freund wrote:
On 2022-03-20 21:40:40 -0300, Euler Taveira wrote:
On Mon, Feb 28, 2022, at 9:18 PM, Euler Taveira wrote:
Long time, no patch. Here it is. I will provide documentation in the next
version. I would appreciate some feedback.
This patch is broken since commit 705e20f8550c0e8e47c0b6b20b5f5ffd6ffd9e33. I
rebased it.
This fails tests, specifically it seems psql crashes:
https://cirrus-ci.com/task/6592281292570624?logs=cores#L46
Yeah. I forgot to test this patch with cassert before sending it. :( I didn't
send a new patch because there is another issue (with int128) that I'm
currently reworking. I'll send another patch soon.
Here is another version after rebasing it. In this version I fixed the psql
issue and rewrote the interval_to_ms function.
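For readers following along, the conversion that interval_to_ms performs can be sketched roughly as below. This is only an illustration, not the code from the attached patch: SketchInterval and sketch_interval_to_ms are hypothetical names, the struct merely mimics the layout of PostgreSQL's Interval (a microsecond time part plus whole days and months), and counting a month as 30 days is an assumption made here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for PostgreSQL's Interval type: a time part in
 * microseconds plus separate day and month counts.
 */
typedef struct SketchInterval
{
	int64_t		time;	/* microseconds */
	int32_t		day;	/* whole days */
	int32_t		month;	/* whole months */
} SketchInterval;

#define SKETCH_USECS_PER_MSEC	1000
#define SKETCH_MSECS_PER_DAY	INT64_C(86400000)
#define SKETCH_DAYS_PER_MONTH	30	/* assumption: a month counts as 30 days */

/*
 * Convert an interval to milliseconds.
 *
 * Sub-millisecond precision is truncated toward zero. With 32-bit day and
 * month fields the sum stays well inside the int64 range, so no overflow
 * check is done in this sketch.
 */
int64_t
sketch_interval_to_ms(const SketchInterval *iv)
{
	int64_t		days;

	assert(iv != NULL);

	days = (int64_t) iv->month * SKETCH_DAYS_PER_MONTH + iv->day;

	return days * SKETCH_MSECS_PER_DAY + iv->time / SKETCH_USECS_PER_MSEC;
}
```

With this sketch, an interval of '4h 27min 35s' (16055000000 microseconds, no days or months) comes out as 16055000, the same value the regression output shows in the "Apply delay" column for that test case.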
--
Euler Taveira
EDB https://www.enterprisedb.com/
Attachments:
v3-0001-Time-delayed-logical-replication-subscriber.patch (text/x-patch)
From 6718e96d3682af9094c02b0e308a8c814148a197 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Sat, 6 Nov 2021 11:31:10 -0300
Subject: [PATCH v3] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (especially to fix
errors that might cause data loss).
If the subscriber sets the min_apply_delay parameter, the logical
replication worker will delay the transaction commit for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs only on WAL records for transaction begins. Regular and
prepared transactions are covered. Streamed transactions are not
delayed; that should be implemented in a future release.
Author: Euler Taveira
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 33 +++++
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 4 +-
src/backend/commands/subscriptioncmds.c | 46 ++++++-
src/backend/replication/logical/worker.c | 61 +++++++++
src/backend/utils/adt/timestamp.c | 32 +++++
src/bin/pg_dump/pg_dump.c | 13 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 13 +-
src/include/catalog/pg_subscription.h | 2 +
src/include/datatype/timestamp.h | 2 +
src/include/utils/timestamp.h | 2 +
src/test/regress/expected/subscription.out | 149 ++++++++++++---------
src/test/regress/sql/subscription.sql | 20 +++
src/test/subscription/t/030_apply_delay.pl | 76 +++++++++++
16 files changed, 389 insertions(+), 71 deletions(-)
create mode 100644 src/test/subscription/t/030_apply_delay.pl
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index ac2db249cb..83e4e352ca 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -205,8 +205,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
information. The parameters that can be altered
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
- <literal>binary</literal>, <literal>streaming</literal>, and
- <literal>disable_on_error</literal>.
+ <literal>binary</literal>, <literal>streaming</literal>,
+ <literal>disable_on_error</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index b701752fc9..2bc8deb132 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -302,6 +302,39 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. Similar
+ to the physical replication feature
+ (<xref linkend="guc-recovery-min-apply-delay"/>), it may be useful to
+ have a time-delayed copy of data for logical replication. This
+ parameter allows you to delay the application of changes by a
+ specified amount of time. If this value is specified without units,
+ it is taken as milliseconds. The default is zero, adding no delay.
+ </para>
+ <para>
+ The delay occurs only after the initial table synchronization. It is
+ possible that the replication delay between publisher and subscriber
+ exceeds the value of this parameter, in which case no delay is added.
+ Note that the delay is calculated between the WAL time stamp as
+ written on publisher and the current time on the subscriber. Delays
+ in logical decoding and in transferring the transaction may reduce the
+ actual wait time. If the system clocks on publisher and subscriber
+ are not synchronized, changes may be applied earlier than
+ expected. This is not a major issue because typical settings of this
+ parameter are much larger than typical time deviations between
+ servers.
+ </para>
+ <para>
+ The delay occurs only on WAL records for transaction begins. Streamed
+ transactions are not delayed; this should be implemented in a future
+ release.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</para>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index 0ff0982f7b..4d8c96efaf 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -71,6 +71,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->twophasestate = subform->subtwophasestate;
sub->disableonerr = subform->subdisableonerr;
sub->skiplsn = subform->subskiplsn;
+ sub->applydelay = subform->subapplydelay;
/* Get conninfo */
datum = SysCacheGetAttr(SUBSCRIPTIONOID,
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index bd48ee7bd2..e9c5088d8a 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1261,8 +1261,8 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
- substream, subtwophasestate, subdisableonerr, subskiplsn, subslotname,
- subsynccommit, subpublications)
+ substream, subtwophasestate, subdisableonerr, subskiplsn,
+ subapplydelay, subslotname, subsynccommit, subpublications)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index e16f04626d..4ba234dfe1 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -47,6 +47,7 @@
#include "utils/memutils.h"
#include "utils/pg_lsn.h"
#include "utils/syscache.h"
+#include "utils/timestamp.h"
/*
* Options that can be specified by the user in CREATE/ALTER SUBSCRIPTION
@@ -64,6 +65,7 @@
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
+#define SUBOPT_MIN_APPLY_DELAY 0x00001000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -87,6 +89,7 @@ typedef struct SubOpts
bool twophase;
bool disableonerr;
XLogRecPtr lsn;
+ int64 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -292,12 +295,35 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ char *val;
+ Interval *interval;
+
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ val = defGetString(defel);
+
+ interval = DatumGetIntervalP(DirectFunctionCall3(interval_in,
+ CStringGetDatum(val),
+ ObjectIdGetDatum(InvalidOid),
+ Int32GetDatum(-1)));
+ opts->min_apply_delay = interval_to_ms(interval);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("unrecognized subscription parameter: \"%s\"", defel->defname)));
}
+ if (opts->min_apply_delay < 0)
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("option \"%s\" must not be negative", "min_apply_delay"));
+
/*
* We've been explicitly asked to not connect, that requires some
* additional processing.
@@ -434,7 +460,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -510,6 +536,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
LOGICALREP_TWOPHASE_STATE_DISABLED);
values[Anum_pg_subscription_subdisableonerr - 1] = BoolGetDatum(opts.disableonerr);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subapplydelay - 1] = Int64GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -966,6 +993,12 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_subdisableonerr - 1]
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ values[Anum_pg_subscription_subapplydelay - 1] =
+ Int64GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subapplydelay - 1] = true;
+ }
update_tuple = true;
break;
@@ -989,6 +1022,17 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (opts.enabled)
ApplyLauncherWakeupAtCommit();
+ /*
+ * If this subscription has been disabled and it has an apply
+ * delay set, wake up the logical replication worker to finish
+ * it as soon as possible.
+ */
+ if (!opts.enabled && sub->applydelay > 0)
+ {
+ elog(DEBUG1, "subscription has been disabled");
+ logicalrep_worker_wakeup(sub->oid, InvalidOid);
+ }
+
update_tuple = true;
break;
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 82dcffc2db..bde78e5b1a 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -252,6 +252,7 @@ WalReceiverConn *LogRepWorkerWalRcvConn = NULL;
Subscription *MySubscription = NULL;
bool MySubscriptionValid = false;
+TimestampTz MySubscriptionMinApplyDelayUntil = 0;
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
@@ -324,6 +325,8 @@ static void maybe_reread_subscription(void);
static void DisableSubscriptionAndExit(void);
+static void apply_delay(TimestampTz ts);
+
/* prototype needed because of stream_commit */
static void apply_dispatch(StringInfo s);
@@ -804,6 +807,58 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * Apply the configured min_apply_delay to the transaction.
+ *
+ * A regular transaction uses the commit time to calculate the delay. A
+ * prepared transaction uses the prepare time to calculate the delay (the
+ * commit time is unknown at prepare time).
+ *
+ * FIXME A streamed transaction could be delayed too but there is no timestamp
+ * in the STREAM START protocol message. Hence, if it is a streamed (regular or
+ * prepared) transaction, no delay is applied.
+ */
+static void
+apply_delay(TimestampTz ts)
+{
+ /* nothing to do if no delay set */
+ if (MySubscription->applydelay <= 0)
+ return;
+
+ /* set apply delay */
+ MySubscriptionMinApplyDelayUntil = TimestampTzPlusMilliseconds(ts,
+ MySubscription->applydelay);
+
+ while (true)
+ {
+ long diffms;
+
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), MySubscriptionMinApplyDelayUntil);
+
+ elog(DEBUG2, "logical replication apply delay: %ld ms", diffms);
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+ }
+
+ /*
+ * Delay applied. Reset state.
+ */
+ MySubscriptionMinApplyDelayUntil = 0;
+}
+
/*
* Handle BEGIN message.
*/
@@ -815,6 +870,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ apply_delay(begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -869,6 +927,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ apply_delay(begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
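
For readers following the worker.c changes, the wait computed in apply_delay() can be sketched outside the server as below. This is only an illustration under stated assumptions, not the patch's code: DemoTimestamp and demo_remaining_delay_ms are hypothetical names, timestamps are assumed to be microsecond counts (as TimestampTz is), and rounding up to a whole millisecond is a choice made for this sketch.

```c
#include <stdint.h>

/* Hypothetical stand-in for TimestampTz: a count of microseconds. */
typedef int64_t DemoTimestamp;

/*
 * Return how many milliseconds remain before a transaction whose commit
 * (or prepare) time is commit_time may be applied, given a minimum apply
 * delay in milliseconds.  Returns 0 when no delay is configured or when
 * the delay has already elapsed.
 */
static int64_t
demo_remaining_delay_ms(DemoTimestamp now, DemoTimestamp commit_time,
                        int64_t delay_ms)
{
    DemoTimestamp until;
    int64_t     diff_us;

    /* nothing to do if no delay set */
    if (delay_ms <= 0)
        return 0;

    /* earliest moment at which the transaction may be applied */
    until = commit_time + delay_ms * 1000;

    diff_us = until - now;
    if (diff_us <= 0)
        return 0;

    /* round up so that a caller sleeping this long never wakes early */
    return (diff_us + 999) / 1000;
}
```

A caller would sleep for the returned number of milliseconds and then recompute, as the loop around WaitLatch() does in the patch, because a latch wakeup can arrive before the timeout expires.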
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index ae36ff3328..19ea8e9ba1 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -2379,6 +2379,38 @@ interval_cmp_internal(Interval *interval1, Interval *interval2)
return int128_compare(span1, span2);
}
+/*
+ * Given an Interval, return the number of milliseconds it represents.
+ * A month is counted as 30 days.
+ */
+int64
+interval_to_ms(const Interval *interval)
+{
+ int64 days;
+ int64 ms;
+ int64 result;
+
+ days = interval->month * INT64CONST(30);
+ days += interval->day;
+
+ /*
+ * Compute the number of milliseconds in the given number of days. The
+ * overflow-checking helpers are used here and below to detect overflow.
+ */
+ if (pg_mul_s64_overflow(days, MSECS_PER_DAY, &result))
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+
+ /* Add the time portion (converted from us to ms) to the previous result. */
+ ms = interval->time / INT64CONST(1000);
+ if (pg_add_s64_overflow(result, ms, &result))
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+
+ return result;
+}
+
Datum
interval_eq(PG_FUNCTION_ARGS)
{
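
The conversion done by interval_to_ms() can be checked in isolation with a small standalone program. The sketch below is an approximation under stated assumptions: DemoInterval is a hypothetical stand-in for the server's Interval struct (time in microseconds, plus day and month counts), demo_interval_to_ms is a hypothetical name, and the GCC/Clang overflow builtins are used in place of pg_mul_s64_overflow() and pg_add_s64_overflow().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for PostgreSQL's Interval; not the real header. */
typedef struct
{
    int64_t     time;           /* time of day portion, in microseconds */
    int32_t     day;            /* days */
    int32_t     month;          /* months, each counted as 30 days here */
} DemoInterval;

#define DEMO_MSECS_PER_DAY INT64_C(86400000)

/*
 * Convert an interval to milliseconds.  Returns false if an int64
 * overflow is detected, true otherwise with the result stored in *out.
 */
static bool
demo_interval_to_ms(const DemoInterval *iv, int64_t *out)
{
    int64_t     days;
    int64_t     result;

    days = (int64_t) iv->month * 30 + iv->day;

    /* milliseconds in the whole-day part (GCC/Clang builtin) */
    if (__builtin_mul_overflow(days, DEMO_MSECS_PER_DAY, &result))
        return false;

    /* add the time portion, converted from microseconds to milliseconds */
    if (__builtin_add_overflow(result, iv->time / 1000, &result))
        return false;

    *out = result;
    return true;
}
```

With a time portion of 4h 27min 35s (16055 seconds) this yields 16055000, which is the value shown in the "Apply delay" column of the regression output for min_apply_delay = '4h 27min 35s'.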
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index e5816c4cce..2bd5e24be3 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4325,6 +4325,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subapplydelay;
int i,
ntups;
@@ -4369,11 +4370,13 @@ getSubscriptions(Archive *fout)
if (fout->remoteVersion >= 150000)
appendPQExpBufferStr(query,
" s.subtwophasestate,\n"
- " s.subdisableonerr\n");
+ " s.subdisableonerr,\n"
+ " s.subapplydelay\n");
else
appendPQExpBuffer(query,
" '%c' AS subtwophasestate,\n"
- " false AS subdisableonerr\n",
+ " false AS subdisableonerr,\n"
+ " 0 AS subapplydelay\n",
LOGICALREP_TWOPHASE_STATE_DISABLED);
appendPQExpBufferStr(query,
@@ -4401,6 +4404,7 @@ getSubscriptions(Archive *fout)
i_substream = PQfnumber(res, "substream");
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
+ i_subapplydelay = PQfnumber(res, "subapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4430,6 +4434,8 @@ getSubscriptions(Archive *fout)
pg_strdup(PQgetvalue(res, i, i_subtwophasestate));
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
+ subinfo[i].subapplydelay =
+ strtoi64(PQgetvalue(res, i, i_subapplydelay), NULL, 10);
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4506,6 +4512,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '" INT64_FORMAT " ms'", subinfo->subapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 772dc0cf7a..15a05227f9 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -659,6 +659,7 @@ typedef struct _SubscriptionInfo
char *subtwophasestate;
char *subdisableonerr;
char *subsynccommit;
+ int64 subapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 714097cad1..a54c084e34 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6105,7 +6105,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6139,13 +6139,18 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Binary"),
gettext_noop("Streaming"));
- /* Two_phase and disable_on_error are only supported in v15 and higher */
+ /*
+ * two_phase, disable_on_error and min_apply_delay are only supported
+ * in v15 and higher.
+ */
if (pset.sversion >= 150000)
appendPQExpBuffer(&buf,
", subtwophasestate AS \"%s\"\n"
- ", subdisableonerr AS \"%s\"\n",
+ ", subdisableonerr AS \"%s\"\n"
+ ", subapplydelay AS \"%s\"\n",
gettext_noop("Two phase commit"),
- gettext_noop("Disable on error"));
+ gettext_noop("Disable on error"),
+ gettext_noop("Apply delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 599c2e4422..d638788f18 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -70,6 +70,7 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
bool subdisableonerr; /* True if a worker error should cause the
* subscription to be disabled */
+ int64 subapplydelay; /* Replication apply delay */
XLogRecPtr subskiplsn; /* All changes finished at this LSN are
* skipped */
@@ -115,6 +116,7 @@ typedef struct Subscription
* occurs */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 applydelay; /* Replication apply delay */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/datatype/timestamp.h b/src/include/datatype/timestamp.h
index 5fa38d20d8..45819da223 100644
--- a/src/include/datatype/timestamp.h
+++ b/src/include/datatype/timestamp.h
@@ -88,6 +88,8 @@ typedef struct
#define SECS_PER_MINUTE 60
#define MINS_PER_HOUR 60
+#define MSECS_PER_DAY INT64CONST(86400000)
+
#define USECS_PER_DAY INT64CONST(86400000000)
#define USECS_PER_HOUR INT64CONST(3600000000)
#define USECS_PER_MINUTE INT64CONST(60000000)
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index c1a74f8e2b..58a6e6b6da 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -78,6 +78,8 @@ extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern int64 interval_to_ms(const Interval *interval);
+
extern TimestampTz time_t_to_timestamptz(pg_time_t tm);
extern pg_time_t timestamptz_to_time_t(TimestampTz t);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 7fcfad1591..2604a48002 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -76,10 +76,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -96,10 +96,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+-------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -108,10 +108,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+-------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -143,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+-------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -179,19 +179,19 @@ ERROR: binary requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, binary = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -202,19 +202,19 @@ ERROR: streaming requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -229,10 +229,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more then once
@@ -247,10 +247,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -284,10 +284,10 @@ ERROR: two_phase requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -296,10 +296,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -308,10 +308,10 @@ DROP SUBSCRIPTION regress_testsub;
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -323,18 +323,47 @@ ERROR: disable_on_error requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, disable_on_error = false);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | 0 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid input syntax for type interval: "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: option "min_apply_delay" must not be negative
+-- success -- a bare number is read as seconds, so 123 is stored as 123000 ms
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | 123000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- success -- interval is converted into ms and stored as integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = '4h 27min 35s');
+WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | 16055000 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 74c38ead5d..51dade37b1 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -254,6 +254,26 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- success -- 123 ms
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
+-- success -- interval is converted into ms and stored as integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = '4h 27min 35s');
+
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/t/030_apply_delay.pl b/src/test/subscription/t/030_apply_delay.pl
new file mode 100644
index 0000000000..25433c45e9
--- /dev/null
+++ b/src/test/subscription/t/030_apply_delay.pl
@@ -0,0 +1,76 @@
+
+# Copyright (c) 2022, PostgreSQL Global Development Group
+
+# Test replication apply delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf('postgresql.conf',
+ "log_min_messages = debug2");
+$node_subscriber->start;
+
+# Create some preexisting content on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b varchar)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (1, 'foo'), (2, 'bar')");
+
+# Setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b text, c timestamptz DEFAULT now(), d bigint DEFAULT 999)"
+);
+
+# Setup logical replication
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE test_tab");
+
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (min_apply_delay = '2s')"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# Also wait for initial table sync to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(2|1|2), 'check initial data was copied to subscriber');
+
+# new row to trigger apply delay
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (3, 'baz')");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(3|1|3), 'check the new row was applied to subscriber');
+
+my $logfile = slurp_file($node_subscriber->logfile());
+ok( $logfile =~
+ qr/logical replication apply delay/,
+ 'check if replication apply delay is triggered');
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
--
2.30.2
On Wed, Mar 23, 2022, at 6:19 PM, Euler Taveira wrote:
On Mon, Mar 21, 2022, at 10:09 PM, Euler Taveira wrote:
On Mon, Mar 21, 2022, at 10:04 PM, Andres Freund wrote:
On 2022-03-20 21:40:40 -0300, Euler Taveira wrote:
On Mon, Feb 28, 2022, at 9:18 PM, Euler Taveira wrote:
Long time, no patch. Here it is. I will provide documentation in the next
version. I would appreciate some feedback.
This patch is broken since commit 705e20f8550c0e8e47c0b6b20b5f5ffd6ffd9e33. I
rebased it.
This fails tests, specifically it seems psql crashes:
https://cirrus-ci.com/task/6592281292570624?logs=cores#L46
Yeah. I forgot to test this patch with cassert before sending it. :( I didn't
send a new patch because there is another issue (with int128) that I'm
currently reworking. I'll send another patch soon.
Here is another version after rebasing it. In this version I fixed the psql
issue and rewrote interval_to_ms function.
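As a rough illustration of what the rewritten interval_to_ms does in the attached patch (months counted as 30 days, days converted at 86400000 ms each, and the microsecond time field truncated to milliseconds), here is a standalone sketch. It is not the patch code: SketchInterval and sketch_interval_to_ms are hypothetical stand-ins for PostgreSQL's Interval and for the patch function, and the overflow checks are written by hand instead of using pg_mul_s64_overflow and pg_add_s64_overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MSECS_PER_DAY  INT64_C(86400000)
#define SKETCH_DAYS_PER_MONTH 30

/* Hypothetical stand-in for PostgreSQL's Interval struct. */
typedef struct SketchInterval
{
    int64_t time_usec;  /* time of day portion, in microseconds */
    int32_t day;        /* days */
    int32_t month;      /* months, counted here as 30 days each */
} SketchInterval;

/*
 * Convert an interval to milliseconds.  Returns false on int64 overflow
 * instead of raising an error, since this sketch has no ereport().
 */
static bool
sketch_interval_to_ms(const SketchInterval *iv, int64_t *out_ms)
{
    int64_t days;
    int64_t result;
    int64_t time_ms;

    assert(iv != NULL && out_ms != NULL);

    /* Fold months into days; int32 inputs cannot overflow int64 here. */
    days = (int64_t) iv->month * SKETCH_DAYS_PER_MONTH + iv->day;

    /* Checked multiplication: days * ms-per-day. */
    if (days > INT64_MAX / SKETCH_MSECS_PER_DAY ||
        days < INT64_MIN / SKETCH_MSECS_PER_DAY)
        return false;
    result = days * SKETCH_MSECS_PER_DAY;

    /* Checked addition of the time portion, truncated from us to ms. */
    time_ms = iv->time_usec / 1000;
    if ((time_ms > 0 && result > INT64_MAX - time_ms) ||
        (time_ms < 0 && result < INT64_MIN - time_ms))
        return false;

    *out_ms = result + time_ms;
    return true;
}
```

With this sketch, an interval of '4h 27min 35s' (16055000000 us in the time field) yields 16055000, which is the value shown in the "Apply delay" column of the regression output.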
Compared with the previous version, I added support for streamed transactions.
For streamed transactions, the delay is applied when the STREAM COMMIT message
is handled. That's ok as long as we add the delay before applying the spooled
messages. Hence, we guarantee that the delay is applied *before* each
transaction. The same logic is applied to prepared transactions: the delay is
introduced before applying the spooled messages when the STREAM PREPARE message
is handled.
The tests were refactored a bit. A test for streamed transactions was included
too.
Version 4 is attached.
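To make the "delay before each transaction" rule concrete, here is a minimal standalone sketch of how the remaining wait could be derived from the remote commit (or prepare) time, the configured delay, and the subscriber's current time. It is only an illustration under stated assumptions: sketch_ts_usec and sketch_remaining_delay_ms are hypothetical names, timestamps are plain microsecond counters, the result is rounded up to whole milliseconds by choice, and overflow of the addition is not handled. The patch itself uses TimestampTz, TimestampDifferenceMilliseconds() and WaitLatch() inside apply_delay().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for TimestampTz: microseconds since some epoch. */
typedef int64_t sketch_ts_usec;

/*
 * Return how many milliseconds the apply worker would still have to wait
 * before applying a transaction whose remote commit/prepare time is
 * commit_ts, given the configured delay and the subscriber's clock.
 * Returns 0 when no delay is configured or the deadline has passed.
 */
static int64_t
sketch_remaining_delay_ms(sketch_ts_usec commit_ts, int64_t delay_ms,
                          sketch_ts_usec now)
{
    sketch_ts_usec apply_at;

    assert(delay_ms >= 0);

    /* Nothing to do if no delay is set. */
    if (delay_ms == 0)
        return 0;

    /* Earliest time at which the transaction may be applied. */
    apply_at = commit_ts + delay_ms * 1000;

    /* Already late enough (e.g. decoding or transfer took longer). */
    if (now >= apply_at)
        return 0;

    /* Round up so the caller never wakes before the deadline. */
    return (apply_at - now + 999) / 1000;
}
```

A caller would sleep for the returned number of milliseconds and recompute after every wakeup, since a wakeup can arrive early; this mirrors the loop structure of apply_delay() in the patch.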
--
Euler Taveira
EDB https://www.enterprisedb.com/
Attachments:
v4-0001-Time-delayed-logical-replication-subscriber.patch (text/x-patch)
From 7dd7a3523ed8e7a3494e7ec25ddc0af8ed4cf4d3 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Sat, 6 Nov 2021 11:31:10 -0300
Subject: [PATCH v4] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (especially to fix
errors that might cause data loss).
If the subscriber sets the min_apply_delay parameter, the logical
replication worker will delay the transaction commit for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs only on WAL records for transaction begins. Regular and
prepared transactions are covered. Streamed transactions are also
covered.
Author: Euler Taveira
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 9 ++
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 31 ++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 2 +-
src/backend/commands/subscriptioncmds.c | 46 ++++++-
src/backend/replication/logical/worker.c | 82 ++++++++++++
src/backend/utils/adt/timestamp.c | 32 +++++
src/bin/pg_dump/pg_dump.c | 16 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 8 +-
src/bin/psql/tab-complete.c | 7 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/datatype/timestamp.h | 2 +
src/include/utils/timestamp.h | 2 +
src/test/regress/expected/subscription.out | 149 ++++++++++++---------
src/test/regress/sql/subscription.sql | 20 +++
src/test/subscription/t/032_apply_delay.pl | 110 +++++++++++++++
18 files changed, 455 insertions(+), 71 deletions(-)
create mode 100644 src/test/subscription/t/032_apply_delay.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 25b02c4e37..9b94b7aef2 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7833,6 +7833,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subapplydelay</structfield> <type>int8</type>
+ </para>
+ <para>
+ Delay the application of changes by a specified amount of time.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 353ea5def2..ae9d625f9d 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -207,8 +207,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
information. The parameters that can be altered
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
- <literal>binary</literal>, <literal>streaming</literal>, and
- <literal>disable_on_error</literal>.
+ <literal>binary</literal>, <literal>streaming</literal>,
+ <literal>disable_on_error</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 34b3264b26..ae80db8a3d 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -302,7 +302,36 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. Similar
+ to the physical replication feature
+ (<xref linkend="guc-recovery-min-apply-delay"/>), it may be useful to
+ have a time-delayed copy of data for logical replication. This
+ parameter allows you to delay the application of changes by a
+ specified amount of time. If this value is specified without units,
+ it is taken as milliseconds. The default is zero, adding no delay.
+ </para>
+ <para>
+ The delay occurs only on WAL records for transaction begins and after
+ the initial table synchronization. It is possible that the
+ replication delay between publisher and subscriber exceeds the value
+ of this parameter, in which case no delay is added. Note that the
+ delay is calculated between the WAL time stamp as written on
+ publisher and the current time on the subscriber. Delays in logical
+ decoding and in transferring the transaction may reduce the actual wait
+ time. If the system clocks on publisher and subscriber are not
+ synchronized, this may lead to applying changes earlier than expected.
+ This is not a major issue because typical settings of this parameter
+ are much larger than typical time deviations between servers.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index 8856ce3b50..42915a2ea9 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -64,6 +64,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->oid = subid;
sub->dbid = subform->subdbid;
sub->skiplsn = subform->subskiplsn;
+ sub->applydelay = subform->subapplydelay;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
sub->enabled = subform->subenabled;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index fedaed533b..6610edb75f 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1297,7 +1297,7 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
+GRANT SELECT (oid, subdbid, subskiplsn, subapplydelay, subname, subowner, subenabled,
subbinary, substream, subtwophasestate, subdisableonerr, subslotname,
subsynccommit, subpublications)
ON pg_subscription TO public;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index e2852286a7..2c5125f979 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -47,6 +47,7 @@
#include "utils/memutils.h"
#include "utils/pg_lsn.h"
#include "utils/syscache.h"
+#include "utils/timestamp.h"
/*
* Options that can be specified by the user in CREATE/ALTER SUBSCRIPTION
@@ -64,6 +65,7 @@
#define SUBOPT_TWOPHASE_COMMIT 0x00000200
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
+#define SUBOPT_MIN_APPLY_DELAY 0x00001000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -87,6 +89,7 @@ typedef struct SubOpts
bool twophase;
bool disableonerr;
XLogRecPtr lsn;
+ int64 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -292,12 +295,35 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ char *val;
+ Interval *interval;
+
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ val = defGetString(defel);
+
+ interval = DatumGetIntervalP(DirectFunctionCall3(interval_in,
+ CStringGetDatum(val),
+ ObjectIdGetDatum(InvalidOid),
+ Int32GetDatum(-1)));
+ opts->min_apply_delay = interval_to_ms(interval);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("unrecognized subscription parameter: \"%s\"", defel->defname)));
}
+ if (opts->min_apply_delay < 0)
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("option \"%s\" must not be negative", "min_apply_delay"));
+
/*
* We've been explicitly asked to not connect, that requires some
* additional processing.
@@ -530,7 +556,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -595,6 +621,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subapplydelay - 1] = Int64GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1070,6 +1097,12 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_subdisableonerr - 1]
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ values[Anum_pg_subscription_subapplydelay - 1] =
+ Int64GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subapplydelay - 1] = true;
+ }
update_tuple = true;
break;
@@ -1093,6 +1126,17 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (opts.enabled)
ApplyLauncherWakeupAtCommit();
+ /*
+ * If this subscription has been disabled and it has an apply
+ * delay set, wake up the logical replication worker to finish
+ * it as soon as possible.
+ */
+ if (!opts.enabled && sub->applydelay > 0)
+ {
+ elog(DEBUG1, "subscription has been disabled");
+ logicalrep_worker_wakeup(sub->oid, InvalidOid);
+ }
+
update_tuple = true;
break;
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 38e3b1c1b3..db1cc2477c 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -252,6 +252,7 @@ WalReceiverConn *LogRepWorkerWalRcvConn = NULL;
Subscription *MySubscription = NULL;
static bool MySubscriptionValid = false;
+TimestampTz MySubscriptionMinApplyDelayUntil = 0;
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
@@ -324,6 +325,8 @@ static void maybe_reread_subscription(void);
static void DisableSubscriptionAndExit(void);
+static void apply_delay(TimestampTz ts);
+
/* prototype needed because of stream_commit */
static void apply_dispatch(StringInfo s);
@@ -803,6 +806,53 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * Apply the configured delay for the transaction.
+ *
+ * A regular transaction uses the commit time to calculate the delay. A
+ * prepared transaction uses the prepare time to calculate the delay.
+ */
+static void
+apply_delay(TimestampTz ts)
+{
+ /* nothing to do if no delay set */
+ if (MySubscription->applydelay <= 0)
+ return;
+
+ /* set apply delay */
+ MySubscriptionMinApplyDelayUntil = TimestampTzPlusMilliseconds(TimestampTzGetDatum(ts),
+ MySubscription->applydelay);
+
+ while (true)
+ {
+ int diffms;
+
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), MySubscriptionMinApplyDelayUntil);
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "logical replication apply delay: %d ms", diffms);
+
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+ }
+
+ /*
+ * Delay applied. Reset state.
+ */
+ MySubscriptionMinApplyDelayUntil = 0;
+}
+
/*
* Handle BEGIN message.
*/
@@ -814,6 +864,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ apply_delay(begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -868,6 +921,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ apply_delay(begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1090,6 +1146,19 @@ apply_handle_stream_prepare(StringInfo s)
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
+ /*
+ * Should we delay the current prepared transaction?
+ *
+ * Although the delay is applied in BEGIN PREPARE messages, streamed
+ * prepared transactions apply the delay in a STREAM PREPARE message.
+ * That's ok because no changes have been applied yet
+ * (apply_spooled_messages() will do it).
+ * The STREAM START message does not contain a prepare time (it will be
+ * available when the in-progress prepared transaction finishes), hence, it
+ * was not possible to apply a delay at that time.
+ */
+ apply_delay(prepare_data.prepare_time);
+
/* Replay all the spooled operations. */
apply_spooled_messages(prepare_data.xid, prepare_data.prepare_lsn);
@@ -1481,6 +1550,19 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "received commit for streamed transaction %u", xid);
+ /*
+ * Should we delay the current transaction?
+ *
+ * Although the delay is applied in BEGIN messages, streamed transactions
+ * apply the delay in a STREAM COMMIT message. That's ok because no changes
+ * have been applied yet (apply_spooled_messages() will do it).
+ * The STREAM START message would be a natural choice for this delay but
+ * there is no commit time yet (it will be available when the in-progress
+ * transaction finishes), hence, it was not possible to apply a delay at
+ * that time.
+ */
+ apply_delay(commit_data.committime);
+
apply_spooled_messages(xid, commit_data.commit_lsn);
apply_handle_commit_internal(&commit_data);
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index f70f829d83..dc9f2d6677 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -2411,6 +2411,38 @@ interval_cmp_internal(Interval *interval1, Interval *interval2)
return int128_compare(span1, span2);
}
+/*
+ * Given an Interval, return the equivalent number of milliseconds.
+ */
+int64
+interval_to_ms(const Interval *interval)
+{
+ int64 days;
+ int64 ms;
+ int64 result;
+
+ days = interval->month * INT64CONST(30);
+ days += interval->day;
+
+ /*
+ * The following operations use overflow-checking helper functions.
+ * First, compute the number of ms in the given number of days.
+ */
+ if (pg_mul_s64_overflow(days, MSECS_PER_DAY, &result))
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+
+ /* Add the time portion (converted from us to ms) to the previous result. */
+ ms = interval->time / INT64CONST(1000);
+ if (pg_add_s64_overflow(result, ms, &result))
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+
+ return result;
+}
+
Datum
interval_eq(PG_FUNCTION_ARGS)
{
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index c871cb727d..b86234ab94 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4410,6 +4410,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subapplydelay;
int i,
ntups;
@@ -4454,13 +4455,18 @@ getSubscriptions(Archive *fout)
if (fout->remoteVersion >= 150000)
appendPQExpBufferStr(query,
" s.subtwophasestate,\n"
- " s.subdisableonerr\n");
+ " s.subdisableonerr,\n");
else
appendPQExpBuffer(query,
" '%c' AS subtwophasestate,\n"
- " false AS subdisableonerr\n",
+ " false AS subdisableonerr,\n",
LOGICALREP_TWOPHASE_STATE_DISABLED);
+ if (fout->remoteVersion >= 160000)
+ appendPQExpBufferStr(query, " s.subapplydelay\n");
+ else
+ appendPQExpBufferStr(query, " 0 AS subapplydelay\n");
+
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
"WHERE s.subdbid = (SELECT oid FROM pg_database\n"
@@ -4486,6 +4492,7 @@ getSubscriptions(Archive *fout)
i_substream = PQfnumber(res, "substream");
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
+ i_subapplydelay = PQfnumber(res, "subapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4515,6 +4522,8 @@ getSubscriptions(Archive *fout)
pg_strdup(PQgetvalue(res, i, i_subtwophasestate));
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
+ subinfo[i].subapplydelay =
+ strtoi64(PQgetvalue(res, i, i_subapplydelay), NULL, 10);
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4591,6 +4600,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '" INT64_FORMAT " ms'", subinfo->subapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 1d21c2906f..3017273c7b 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -660,6 +660,7 @@ typedef struct _SubscriptionInfo
char *subtwophasestate;
char *subdisableonerr;
char *subsynccommit;
+ int64 subapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 88d92a08ae..c7e93c555f 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6351,7 +6351,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6393,6 +6393,12 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* min_apply_delay is only supported in v16 and higher */
+ if (pset.sversion >= 160000)
+ appendPQExpBuffer(&buf,
+ ", subapplydelay AS \"%s\"\n",
+ gettext_noop("Apply delay"));
+
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
", subconninfo AS \"%s\"\n",
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index c5cafe6f4b..3e012f5162 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1873,7 +1873,8 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "slot_name", "streaming", "synchronous_commit", "disable_on_error");
+ COMPLETE_WITH("binary", "slot_name", "streaming", "synchronous_commit",
+ "disable_on_error", "min_apply_delay");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
COMPLETE_WITH("lsn");
@@ -3152,8 +3153,8 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "enabled", "slot_name", "streaming",
- "synchronous_commit", "two_phase", "disable_on_error");
+ "enabled", "slot_name", "streaming", "synchronous_commit",
+ "two_phase", "disable_on_error", "min_apply_delay");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index d1260f590c..fda97dab56 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -58,6 +58,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
XLogRecPtr subskiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 subapplydelay; /* Replication apply delay */
+
NameData subname; /* Name of the subscription */
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
@@ -104,6 +106,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 applydelay; /* Replication apply delay */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
bool enabled; /* Indicates if the subscription is enabled */
diff --git a/src/include/datatype/timestamp.h b/src/include/datatype/timestamp.h
index d155f1b03b..d5bbfad1c4 100644
--- a/src/include/datatype/timestamp.h
+++ b/src/include/datatype/timestamp.h
@@ -127,6 +127,8 @@ struct pg_itm_in
#define SECS_PER_MINUTE 60
#define MINS_PER_HOUR 60
+#define MSECS_PER_DAY INT64CONST(86400000)
+
#define USECS_PER_DAY INT64CONST(86400000000)
#define USECS_PER_HOUR INT64CONST(3600000000)
#define USECS_PER_MINUTE INT64CONST(60000000)
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index edf3a97318..91709035da 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -78,6 +78,8 @@ extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern int64 interval_to_ms(const Interval *interval);
+
extern TimestampTz time_t_to_timestamptz(pg_time_t tm);
extern pg_time_t timestamptz_to_time_t(TimestampTz t);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 5db7146e06..b35381e065 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -76,10 +76,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -96,10 +96,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+-------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -108,10 +108,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+-------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -143,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+-------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -179,19 +179,19 @@ ERROR: binary requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, binary = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -202,19 +202,19 @@ ERROR: streaming requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -229,10 +229,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more then once
@@ -247,10 +247,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -284,10 +284,10 @@ ERROR: two_phase requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -296,10 +296,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -308,10 +308,10 @@ DROP SUBSCRIPTION regress_testsub;
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -323,18 +323,47 @@ ERROR: disable_on_error requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, disable_on_error = false);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | 0 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid input syntax for type interval: "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: option "min_apply_delay" must not be negative
+-- success -- 123 ms
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | 123000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- success -- interval is converted into ms and stored as integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = '4h 27min 35s');
+WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | 16055000 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 74c38ead5d..51dade37b1 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -254,6 +254,26 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- success -- 123 ms
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
+-- success -- interval is converted into ms and stored as integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = '4h 27min 35s');
+
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/t/032_apply_delay.pl b/src/test/subscription/t/032_apply_delay.pl
new file mode 100644
index 0000000000..14591323b6
--- /dev/null
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,110 @@
+
+# Copyright (c) 2022, PostgreSQL Global Development Group
+
+# Test replication apply delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf('postgresql.conf',
+ "log_min_messages = debug2");
+$node_subscriber->start;
+
+# Create some preexisting content on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b varchar)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (1, 'foo'), (2, 'bar')");
+
+# Setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b text, c timestamptz DEFAULT now(), d bigint DEFAULT 999)"
+);
+
+# Setup logical replication
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE test_tab");
+
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = on, min_apply_delay = '2s')"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check log starting now for logical replication apply delay
+my log_location = -s $node_subscriber->logfile;
+
+# Also wait for initial table sync to finish
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(2|1|2), 'check initial data was copied to subscriber');
+
+# new row to trigger apply delay
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (3, 'baz')");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(3|1|3), 'check if the new row was applied to subscriber');
+
+check_apply_delay_log("logical replication apply delay");
+
+# Test streamed transaction.
+# Insert, update and delete enough rows to exceed 64kB limit.
+$node_publisher->safe_psql(
+ 'postgres', q{
+BEGIN;
+INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(4, 5000) s(i);
+UPDATE test_tab SET b = md5(b) WHERE mod(a,2) = 0;
+DELETE FROM test_tab WHERE mod(a,3) = 0;
+COMMIT;
+});
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(3334|1|5000), 'check if the new rows were applied to subscriber');
+
+check_apply_delay_log("logical replication apply delay");
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
+
+sub check_apply_delay_log
+{
+ my $message = shift;
+ my $old_log_location = $log_location;
+
+ $log_location = $node_subscriber->wait_for_log(qr/$message/, $log_location);
+
+ cmp_ok($log_location, '>', $old_log_location,
+ "$sect: logfile contains triggered logical replication apply delay"
+ );
+}
--
2.30.2
Here are some review comments for your v4-0001 patch. I hope they are
useful for you.
======
1. General
This thread name "logical replication restrictions" seems quite
unrelated to the patch here. Maybe it's better to start a new thread
otherwise nobody is going to recognise what this thread is really
about.
======
2. Commit message
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (specially to fix
errors that might cause data loss).
"specially" -> "particularly" ?
~~~
3. Commit message
Maybe take some examples from the regression tests to show usage of
the new parameter
======
4. doc/src/sgml/catalogs.sgml
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subapplydelay</structfield> <type>int8</type>
+ </para>
+ <para>
+ Delay the application of changes by a specified amount of time.
+ </para></entry>
+ </row>
I think this should say that the units are ms.
======
5. doc/src/sgml/ref/create_subscription.sgml
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
Is the "integer" type here correct? It might eventually be stored as
an integer, but IIUC (going by the tests) from the user point-of-view
this parameter is really "text" type for representing ms or interval,
right?
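To illustrate the point about units, here is a minimal sketch of the arithmetic behind the value the regression test displays for '4h 27min 35s'. The helper name hms_to_ms is hypothetical and is not part of the patch; it only shows how an interval-style input ends up as an integer number of milliseconds.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): convert an interval given as
 * hours, minutes and seconds into milliseconds, which is the unit the
 * regression test output shows in the "Apply delay" column.
 */
static int64_t
hms_to_ms(int64_t hours, int64_t minutes, int64_t seconds)
{
    /* hours -> minutes -> seconds -> milliseconds */
    return ((hours * 60 + minutes) * 60 + seconds) * 1000;
}
```

With this, hms_to_ms(4, 27, 35) gives 16055000, which matches the value shown by \dRs+ in the expected output.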
~~~
6. doc/src/sgml/ref/create_subscription.sgml
Similar
+ to the physical replication feature
+ (<xref linkend="guc-recovery-min-apply-delay"/>), it may be useful to
+ have a time-delayed copy of data for logical replication.
SUGGESTION
As with the physical replication feature (recovery_min_apply_delay),
it can be useful for logical replication to delay the data
replication.
~~~
7. doc/src/sgml/ref/create_subscription.sgml
Delays in logical
+ decoding and in transfer the transaction may reduce the actual wait
+ time.
SUGGESTION
Time spent in logical decoding and in transferring the transaction may
reduce the actual wait time.
~~~
8. doc/src/sgml/ref/create_subscription.sgml
If the system clocks on publisher and subscriber are not
+ synchronized, this may lead to apply changes earlier than expected.
Why just say "earlier than expected"? If the publisher's time is ahead
of the subscriber then the changes might also be *later* than
expected, right? So, perhaps it is better to just say "other than
expected".
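A rough sketch of why the skew can go either way, assuming the wait is computed as publisher commit time plus min_apply_delay and compared against the subscriber's clock. The function name and the constant-skew model are my assumptions for illustration; decoding and transfer time are ignored.

```c
#include <stdint.h>

/*
 * Hypothetical model (not from the patch): the delay the subscriber
 * actually observes, in milliseconds.
 *
 * skew_ms is the publisher clock minus the subscriber clock, assumed
 * constant. A positive skew (publisher ahead) lengthens the wait, a
 * negative skew (publisher behind) shortens it, possibly down to zero.
 */
static int64_t
effective_delay_ms(int64_t min_apply_delay_ms, int64_t skew_ms)
{
    int64_t delay = min_apply_delay_ms + skew_ms;

    return delay > 0 ? delay : 0;
}
```

For a 2000 ms delay, a publisher that is 500 ms ahead yields 2500 ms (later than expected), while one that is 500 ms behind yields 1500 ms (earlier than expected).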
~~~
9. doc/src/sgml/ref/create_subscription.sgml
Should there also be a big warning box about the impact of using
synchronous_commit (like the warning the other streaming replication
page has)?
~~~
10. doc/src/sgml/ref/create_subscription.sgml
I think there should be some examples somewhere showing how to specify
this parameter. Maybe they are better added somewhere in "31.2
Subscription" and xrefed from here.
======
11. src/backend/commands/subscriptioncmds.c - parse_subscription_options
I think there should be a default assignment to 0 (done where all the
other supported option defaults are set)
~~~
12. src/backend/commands/subscriptioncmds.c - parse_subscription_options
+ if (opts->min_apply_delay < 0)
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("option \"%s\" must not be negative", "min_apply_delay"));
+
I thought this check only needs to be done within the scope of the preceding
if - (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
strcmp(defel->defname, "min_apply_delay") == 0)
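Something like the following standalone sketch is what I had in mind for #11 and #12 together: the default is assigned up front, and the range check lives inside the branch that recognises the option. DemoOpts, demo_opts_init and demo_parse_option are hypothetical stand-ins, not the real SubOpts or parse_subscription_options code, and the boolean return replaces ereport(ERROR) only so that the sketch compiles on its own.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the parsed subscription options. */
typedef struct DemoOpts
{
    bool    delay_set;          /* was min_apply_delay specified? */
    int64_t min_apply_delay;    /* in ms */
} DemoOpts;

/* Assign defaults in one place, before any option is parsed. */
static void
demo_opts_init(DemoOpts *opts)
{
    opts->delay_set = false;
    opts->min_apply_delay = 0;  /* default: no delay */
}

/*
 * Parse one name/value pair. Returns false where the real code would
 * raise an error. The negative check is only reached for this option.
 */
static bool
demo_parse_option(DemoOpts *opts, const char *name, int64_t value)
{
    if (strcmp(name, "min_apply_delay") == 0)
    {
        if (value < 0)
            return false;       /* "must not be negative" */

        opts->min_apply_delay = value;
        opts->delay_set = true;
    }

    /* Other options are ignored in this sketch. */
    return true;
}
```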
======
13. src/backend/commands/subscriptioncmds.c - AlterSubscription
@@ -1093,6 +1126,17 @@ AlterSubscription(ParseState *pstate,
AlterSubscriptionStmt *stmt,
if (opts.enabled)
ApplyLauncherWakeupAtCommit();
+ /*
+ * If this subscription has been disabled and it has an apply
+ * delay set, wake up the logical replication worker to finish
+ * it as soon as possible.
+ */
+ if (!opts.enabled && sub->applydelay > 0)
I did not really understand the logic: why should min_apply_delay
override enabled=false? It is called a *minimum* delay, so if it
ends up being way over the parameter value (because the subscription
is disabled) then why does that matter?
======
14. src/backend/replication/logical/worker.c
@@ -252,6 +252,7 @@ WalReceiverConn *LogRepWorkerWalRcvConn = NULL;
Subscription *MySubscription = NULL;
static bool MySubscriptionValid = false;
+TimestampTz MySubscriptionMinApplyDelayUntil = 0;
Looking at the only usage of this variable (in apply_delay) and how it
is used, I did not see why this cannot just be a local member of the
apply_delay function?
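As a sketch of what I mean, the "delay until" value can be computed and consumed entirely inside the function. The names below are hypothetical and timestamps are plain milliseconds for simplicity (the real TimestampTz is in microseconds), so this is not the patch's code.

```c
#include <stdint.h>

/* Hypothetical timestamp type for this sketch: milliseconds. */
typedef int64_t DemoTimestampMs;

/*
 * Return how long the apply worker would still have to wait, in ms.
 * The target time is a local variable; nothing outside needs to see it.
 */
static int64_t
remaining_delay_ms(DemoTimestampMs commit_ts,
                   int64_t min_apply_delay_ms,
                   DemoTimestampMs now)
{
    DemoTimestampMs delay_until = commit_ts + min_apply_delay_ms;
    int64_t         diff = delay_until - now;

    /* Already past the target time: apply immediately. */
    return diff > 0 ? diff : 0;
}
```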
~~~
15. src/backend/replication/logical/worker.c - apply_delay
+/*
+ * Apply the informed delay for the transaction.
+ *
+ * A regular transaction uses the commit time to calculate the delay. A
+ * prepared transaction uses the prepare time to calculate the delay.
+ */
+static void
+apply_delay(TimestampTz ts)
I don't think the comment needs to mention the different kinds of
transactions here, because where the timestamp comes from has nothing
really to do with this function's logic.
~~~
16. src/backend/replication/logical/worker.c - apply_delay
Refer to comment #14 about MySubscriptionMinApplyDelayUntil.
~~~
17. src/backend/replication/logical/worker.c - apply_handle_stream_prepare
@@ -1090,6 +1146,19 @@ apply_handle_stream_prepare(StringInfo s)
elog(DEBUG1, "received prepare for streamed transaction %u",
prepare_data.xid);
+ /*
+ * Should we delay the current prepared transaction?
+ *
+ * Although the delay is applied in BEGIN PREPARE messages, streamed
+ * prepared transactions apply the delay in a STREAM PREPARE message.
+ * That's ok because no changes have been applied yet
+ * (apply_spooled_messages() will do it).
+ * The STREAM START message does not contain a prepare time (it will be
+ * available when the in-progress prepared transaction finishes), hence, it
+ * was not possible to apply a delay at that time.
+ */
+ apply_delay(prepare_data.prepare_time);
+
It seems to rely on the spooling happening at the end. But won't this
cause a problem later when/if the "parallel apply" patch [1] is pushed
and the stream bgworkers are doing stuff on the fly instead of
spooling at the end?
Or are you expecting that the "parallel apply" feature should be
disabled if there is any min_apply_delay parameter specified?
~~~
18. src/backend/replication/logical/worker.c - apply_handle_stream_commit
Ditto comment #17.
======
19. src/bin/psql/tab-complete.c
Let's keep the alphabetical order of the parameters in COMPLETE_WITH, as per [2].
======
20. src/include/catalog/pg_subscription.h
@@ -58,6 +58,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId)
BKI_SHARED_RELATION BKI_ROW
XLogRecPtr subskiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 subapplydelay; /* Replication apply delay */
+
IMO the comment should mention the units "(ms)"
======
21. src/test/regress/sql/subscription.sql
There are some test cases for CREATE SUBSCRIPTION but there are no
test cases for ALTER SUBSCRIPTION changing this new parameter.
======
22. src/test/subscription/t/032_apply_delay.pl
I received the following error when trying to run these 'subscription' tests:
t/032_apply_delay.pl ............... No such class log_location at
t/032_apply_delay.pl line 49, near "my log_location"
syntax error at t/032_apply_delay.pl line 49, near "my log_location ="
Global symbol "$log_location" requires explicit package name at
t/032_apply_delay.pl line 103.
Global symbol "$log_location" requires explicit package name at
t/032_apply_delay.pl line 105.
Global symbol "$log_location" requires explicit package name at
t/032_apply_delay.pl line 105.
Global symbol "$log_location" requires explicit package name at
t/032_apply_delay.pl line 107.
Global symbol "$sect" requires explicit package name at
t/032_apply_delay.pl line 108.
Execution of t/032_apply_delay.pl aborted due to compilation errors.
t/032_apply_delay.pl ............... Dubious, test returned 255 (wstat
65280, 0xff00)
No subtests run
t/100_bugs.pl ...................... ok
Test Summary Report
-------------------
t/032_apply_delay.pl (Wstat: 65280 Tests: 0 Failed: 0)
Non-zero exit status: 255
Parse errors: No plan found in TAP output
------
[1]: /messages/by-id/CAA4eK1+wyN6zpaHUkCLorEWNx75MG0xhMwcFhvjqm2KURZEAGw@mail.gmail.com
[2]: /messages/by-id/CAHut+PucvKZgg_eJzUW--iL6DXHg1Jwj6F09tQziE3kUF67uLg@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Tue, Jul 5, 2022 at 2:12 PM Peter Smith <smithpb2250@gmail.com> wrote:
Here are some review comments for your v4-0001 patch. I hope they are
useful for you.
======
1. General
This thread name "logical replication restrictions" seems quite
unrelated to the patch here. Maybe it's better to start a new thread
otherwise nobody is going to recognise what this thread is really
about.
+1.
17. src/backend/replication/logical/worker.c - apply_handle_stream_prepare
@@ -1090,6 +1146,19 @@ apply_handle_stream_prepare(StringInfo s)
elog(DEBUG1, "received prepare for streamed transaction %u",
prepare_data.xid);
+ /*
+ * Should we delay the current prepared transaction?
+ *
+ * Although the delay is applied in BEGIN PREPARE messages, streamed
+ * prepared transactions apply the delay in a STREAM PREPARE message.
+ * That's ok because no changes have been applied yet
+ * (apply_spooled_messages() will do it).
+ * The STREAM START message does not contain a prepare time (it will be
+ * available when the in-progress prepared transaction finishes), hence, it
+ * was not possible to apply a delay at that time.
+ */
+ apply_delay(prepare_data.prepare_time);
+
It seems to rely on the spooling happening at the end. But won't this
cause a problem later when/if the "parallel apply" patch [1] is pushed
and the stream bgworkers are doing stuff on the fly instead of
spooling at the end?
I wonder why we don't apply the delay on commit/commit_prepared
records only similar to physical replication. See recoveryApplyDelay.
One more advantage would be then we don't need to worry about
transactions that we are going to skip due to the SKIP feature for
subscribers.
One more thing that might be worth discussing is whether introducing a
new subscription parameter for this feature is a better idea or can we
use guc (either an existing or a new one). Users may want to set this
only for a particular subscription or set of subscriptions in which
case it is better to have this as a subscription level parameter.
OTOH, I was slightly worried that if this will be used for all
subscriptions on a subscriber then it will be burdensome for users.
--
With Regards,
Amit Kapila.
Hi Euler,
I've some comments/questions about the latest version (v4) of your patch.
Firstly, I think the patch needs a rebase. CI currently cannot apply it [1].
22. src/test/subscription/t/032_apply_delay.pl
I received the following error when trying to run these 'subscription'
tests:
t/032_apply_delay.pl ............... No such class log_location at
t/032_apply_delay.pl line 49, near "my log_location"
syntax error at t/032_apply_delay.pl line 49, near "my log_location ="
I'm having these errors too. Seems like some declarations are missing.
+ specified amount of time. If this value is specified without units,
+ it is taken as milliseconds. The default is zero, adding no delay.
+ </para>
I'm also having an issue when I give min_apply_delay parameter without
units.
I expect that if I set min_apply_delay to 5000 (without any unit), it will
be interpreted as 5000 ms.
I tried:
postgres=# CREATE SUBSCRIPTION testsub CONNECTION 'dbname=postgres
port=5432' PUBLICATION testpub WITH (min_apply_delay=5000);
And logs showed:
2022-07-13 20:26:52.231 +03 [5422] LOG: logical replication apply delay: 4999999 ms
2022-07-13 20:26:52.231 +03 [5422] CONTEXT: processing remote data for replication origin "pg_18126" during "BEGIN" in transaction 3152 finished at 0/465D7A0
Looks like it starts from 5000000 ms instead of 5000 ms for me. If I state
the unit as ms, then it works correctly.
Lastly, I have a question about this delay during tablesync.
It's stated here that apply delays are not for initial tablesync.
<para>
+ The delay occurs only on WAL records for transaction begins and after
+ the initial table synchronization. It is possible that the
+ replication delay between publisher and subscriber exceeds the value
+ of this parameter, in which case no delay is added. Note that the
+ delay is calculated between the WAL time stamp as written on
+ publisher and the current time on the subscriber. Delays in logical
+ decoding and in transfer the transaction may reduce the actual wait
+ time. If the system clocks on publisher and subscriber are not
+ synchronized, this may lead to apply changes earlier than expected.
+ This is not a major issue because a typical setting of this parameter
+ are much larger than typical time deviations between servers.
+ </para>
There might be a case where tablesync workers are in SYNCWAIT state and
waiting for apply worker to tell them to CATCHUP.
And if the apply worker is waiting in the apply_delay function, tablesync
workers will be stuck in SYNCWAIT state, and this might delay tablesync by at
least the "min_apply_delay" amount of time.
Is it something we would want? What do you think?
[1]: http://cfbot.cputube.org/patch_38_3581.log
Best,
Melih
On Tue, Jul 5, 2022, at 5:41 AM, Peter Smith wrote:
Here are some review comments for your v4-0001 patch. I hope they are
useful for you.
Thanks for your review.
This thread name "logical replication restrictions" seems quite
unrelated to the patch here. Maybe it's better to start a new thread
otherwise nobody is going to recognise what this thread is really
about.
I agree that the $SUBJECT does not describe the proposal. I decided that it is
not worth creating a new thread because (i) there has already been some
interaction here and those participants could be monitoring this thread and
(ii) the CF entry has the correct description.
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (specially to fix
errors that might cause data loss).
I changed the commit message a bit.
Maybe take some examples from the regression tests to show usage of
the new parameter
I don't think an example is really useful in a commit message. If you are
checking this commit, it is a matter of reading the regression tests or
documentation to obtain an example of how to use it.
I think this should say that the units are ms.
Unit included.
Is the "integer" type here correct? It might eventually be stored as
an integer, but IIUC (going by the tests) from the user point-of-view
this parameter is really "text" type for representing ms or interval,
right?
The internal representation is integer. The unit is correct. If you use units,
the format is text, which is what section [1] calls "Numeric with Unit". Even
if the user is unsure about its usage, an example might help here.
SUGGESTION
As with the physical replication feature (recovery_min_apply_delay),
it can be useful for logical replication to delay the data
replication.
It is not "data replication", it is applying changes. I reworded that sentence.
SUGGESTION
Time spent in logical decoding and in transferring the transaction may
reduce the actual wait time.
Changed.
If the system clocks on publisher and subscriber are not
+ synchronized, this may lead to apply changes earlier than expected.
Why just say "earlier than expected"? If the publisher's time is ahead
of the subscriber then the changes might also be *later* than
expected, right? So, perhaps it is better to just say "other than
expected".
This sentence is similar to another one in the recovery_min_apply_delay. I want
to emphasize the fact that even if you use a 30-minute delay, it might apply a
change that happened 29 minutes 55 seconds ago. The main reason for this
feature is to avoid modifying changes *earlier*. If it applies the change 30
minutes 5 seconds, it is fine.
Should there also be a big warning box about the impact if using
synchronous_commit (like the other streaming replication page has this
warning)?
Impact? Could you elaborate?
I think there should be some examples somewhere showing how to specify
this parameter. Maybe they are better added somewhere in "31.2
Subscription" and xrefed from here.
I added one example in the CREATE SUBSCRIPTION. We can add an example in the
section 31.2, however, since it is a new chapter I think it lacks examples for
the other options too (streaming, two_phase, copy_data, ...). It could be
submitted as a separate patch IMO.
I think there should be a default assignment to 0 (done where all the
other supported option defaults are set)
It could, for completeness, although the memset() already takes care of it. Anyway, I added it to
the beginning of the parse_subscription_options().
+ if (opts->min_apply_delay < 0)
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("option \"%s\" must not be negative", "min_apply_delay"));
+
I thought this check only needs to be done within scope of the preceding
if - (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
strcmp(defel->defname, "min_apply_delay") == 0)
Fixed.
+ /*
+ * If this subscription has been disabled and it has an apply
+ * delay set, wake up the logical replication worker to finish
+ * it as soon as possible.
+ */
+ if (!opts.enabled && sub->applydelay > 0)
I did not really understand the logic why should the min_apply_delay
override the enabled=false? It is called a *minimum* delay so if it
ends up being way over the parameter value (because the subscription
is disabled) then why does that matter?
It doesn't. The main point of this code (as I tried to explain in the comment)
is to kill the worker as soon as possible if you disable the subscription.
Isn't the comment clear?
Subscription *MySubscription = NULL;
static bool MySubscriptionValid = false;
+TimestampTz MySubscriptionMinApplyDelayUntil = 0;
Looking at the only usage of this variable (in apply_delay) and how it
is used I did not see why this cannot just be a local member of the
apply_delay function?
Good catch. A previous patch used that variable outside that function scope.
+/*
+ * Apply the informed delay for the transaction.
+ *
+ * A regular transaction uses the commit time to calculate the delay. A
+ * prepared transaction uses the prepare time to calculate the delay.
+ */
+static void
+apply_delay(TimestampTz ts)
I didn't think it needs to mention here about the different kinds of
transactions because where it comes from has nothing really to do with
this function's logic.
Fixed.
Refer to comment #14 about MySubscriptionMinApplyDelayUntil.
Fixed.
It seems to rely on the spooling happening at the end. But won't this
cause a problem later when/if the "parallel apply" patch [1] is pushed
and the stream bgworkers are doing stuff on the fly instead of
spooling at the end?
Or are you expecting that the "parallel apply" feature should be
disabled if there is any min_apply_delay parameter specified?
I didn't read the "parallel apply" patch yet.
Let's keep the alphabetical order of the parameters in COMPLETE_WITH, as per [2]
Fixed.
+ int64 subapplydelay; /* Replication apply delay */
+
IMO the comment should mention the units "(ms)"
I'm not sure. It should be documented in the catalogs. It is important
information for the user-visible interface. There are a few places in the
documentation where the unit is mentioned.
There are some test cases for CREATE SUBSCRIPTION but there are no
test cases for ALTER SUBSCRIPTION changing this new parameter.
I added a test to cover ALTER SUBSCRIPTION and also one for disabling a
subscription that has a delay set.
I received the following error when trying to run these 'subscription' tests:
Fixed.
--
Euler Taveira
EDB https://www.enterprisedb.com/
Attachments:
v5-0001-Time-delayed-logical-replication-subscriber.patch (text/x-patch)
From 7a79e6b640826c5602805d2ff27ed226bdb1c940 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Sat, 6 Nov 2021 11:31:10 -0300
Subject: [PATCH v5] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
If the subscriber sets min_apply_delay parameter, the logical
replication worker will delay the transaction commit for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs only on WAL records for transaction begins. The main
reason is to avoid keeping a transaction open for a long time. Regular
and prepared transactions are covered. Streamed transactions are also
covered.
Author: Euler Taveira
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 10 ++
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 43 +++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 48 +++++-
src/backend/replication/logical/worker.c | 85 +++++++++++
src/backend/utils/adt/timestamp.c | 32 ++++
src/bin/pg_dump/pg_dump.c | 17 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/datatype/timestamp.h | 2 +
src/include/utils/timestamp.h | 2 +
src/test/regress/expected/subscription.out | 165 ++++++++++++---------
src/test/regress/sql/subscription.sql | 20 +++
src/test/subscription/t/032_apply_delay.pl | 130 ++++++++++++++++
18 files changed, 501 insertions(+), 83 deletions(-)
create mode 100644 src/test/subscription/t/032_apply_delay.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index cd2cc37aeb..efc745a9ea 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7833,6 +7833,16 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subapplydelay</structfield> <type>int8</type>
+ </para>
+ <para>
+ Delay the application of changes by a specified amount of time. The
+ unit is in milliseconds.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 64efc21f53..8901e1361c 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -208,8 +208,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 7390c715bc..1fc8e7474a 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -317,7 +317,36 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, subscriber applies changes as soon as possible. As with
+ the physical replication feature
+ (<xref linkend="guc-recovery-min-apply-delay"/>), it can be useful to
+ have a time-delayed logical replica. This parameter allows you to
+ delay the application of changes by a specified amount of time. If
+ this value is specified without units, it is taken as seconds. The
+ default is zero, adding no delay.
+ </para>
+ <para>
+ The delay occurs only on WAL records for transaction begins and after
+ the initial table synchronization. It is possible that the
+ replication delay between publisher and subscriber exceeds the value
+ of this parameter, in which case no delay is added. Note that the
+ delay is calculated between the WAL time stamp as written on
+ publisher and the current time on the subscriber. Time spent in logical
+ decoding and in transferring the transaction may reduce the actual wait
+ time. If the system clocks on publisher and subscriber are not
+ synchronized, this may lead to apply changes earlier than expected.
+ This is not a major issue because a typical setting of this parameter
+ are much larger than typical time deviations between servers.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -413,6 +442,18 @@ CREATE SUBSCRIPTION mysub
PUBLICATION insert_only
WITH (enabled = false);
</programlisting></para>
+
+ <para>
+ Create a subscription to a remote server that replicates tables in
+ the <literal>baz</literal> publication and starts replicating immediately on
+ commit. Pre-existing data is not copied. The application of changes is
+ delayed by 4 hours.
+<programlisting>
+CREATE SUBSCRIPTION foo
+ CONNECTION 'host=192.168.1.50 port=5432 user=foo dbname=foodb'
+ PUBLICATION baz
+ WITH (copy_data = false, min_apply_delay = '4h');
+</programlisting></para>
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index c7d2537fb5..e1ea3ccece 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -64,6 +64,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->oid = subid;
sub->dbid = subform->subdbid;
sub->skiplsn = subform->subskiplsn;
+ sub->applydelay = subform->subapplydelay;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
sub->enabled = subform->subenabled;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index f369b1fc14..50175323b9 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1297,9 +1297,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index f73dfb6067..aaeda7e8a9 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -47,6 +47,7 @@
#include "utils/memutils.h"
#include "utils/pg_lsn.h"
#include "utils/syscache.h"
+#include "utils/timestamp.h"
/*
* Options that can be specified by the user in CREATE/ALTER SUBSCRIPTION
@@ -65,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -89,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int64 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -141,6 +144,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -319,12 +324,35 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ char *val;
+ Interval *interval;
+
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ val = defGetString(defel);
+
+ interval = DatumGetIntervalP(DirectFunctionCall3(interval_in,
+ CStringGetDatum(val),
+ ObjectIdGetDatum(InvalidOid),
+ Int32GetDatum(-1)));
+ opts->min_apply_delay = interval_to_ms(interval);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("unrecognized subscription parameter: \"%s\"", defel->defname)));
}
+ if (opts->min_apply_delay < 0 && IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("option \"%s\" must not be negative", "min_apply_delay"));
+
/*
* We've been explicitly asked to not connect, that requires some
* additional processing.
@@ -557,7 +585,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -622,6 +651,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subapplydelay - 1] = Int64GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1044,7 +1074,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1100,6 +1130,12 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_subdisableonerr - 1]
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ values[Anum_pg_subscription_subapplydelay - 1] =
+ Int64GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subapplydelay - 1] = true;
+ }
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
@@ -1130,6 +1166,14 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (opts.enabled)
ApplyLauncherWakeupAtCommit();
+ /*
+ * If this subscription has been disabled and it has an apply
+ * delay set, wake up the logical replication worker to finish
+ * it as soon as possible.
+ */
+ if (!opts.enabled && sub->applydelay > 0)
+ logicalrep_worker_wakeup(sub->oid, InvalidOid);
+
update_tuple = true;
break;
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 5f8c541763..5fa09a867b 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -324,6 +324,8 @@ static void maybe_reread_subscription(void);
static void DisableSubscriptionAndExit(void);
+static void apply_delay(TimestampTz ts);
+
/* prototype needed because of stream_commit */
static void apply_dispatch(StringInfo s);
@@ -803,6 +805,57 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When min_apply_delay parameter is set on subscriber, we wait long enough to
+ * make sure a transaction is applied at least that interval behind the
+ * publisher.
+ *
+ * It applies the delay for the next transaction but before starting the
+ * transaction. The main reason for this design is to avoid a long-running
+ * transaction (which can cause some operational challenges) if the user sets a
+ * high value for the delay. This design is different from the physical
+ * replication (that applies the delay at commit time) mainly because write
+ * operations may allow some issues (such as bloat and locks) that can be
+ * minimized if it does not keep the transaction open for such a long time.
+ */
+static void
+apply_delay(TimestampTz ts)
+{
+ TimestampTz delay_until = 0;
+
+ /* nothing to do if no delay set */
+ if (MySubscription->applydelay <= 0)
+ return;
+
+ /* set apply delay */
+ delay_until = TimestampTzPlusMilliseconds(TimestampTzGetDatum(ts),
+ MySubscription->applydelay);
+
+ while (true)
+ {
+ long diffms;
+
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delay_until);
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "logical replication apply delay: %ld ms", diffms);
+
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+ }
+}
+
/*
* Handle BEGIN message.
*/
@@ -814,6 +867,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ apply_delay(begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -868,6 +924,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ apply_delay(begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1090,6 +1149,19 @@ apply_handle_stream_prepare(StringInfo s)
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
+ /*
+ * Should we delay the current prepared transaction?
+ *
+ * Although the delay is applied in BEGIN PREPARE messages, streamed
+ * prepared transactions apply the delay in a STREAM PREPARE message.
+ * That's ok because no changes have been applied yet
+ * (apply_spooled_messages() will do it).
+ * The STREAM START message does not contain a prepare time (it will be
+ * available when the in-progress prepared transaction finishes), hence, it
+ * was not possible to apply a delay at that time.
+ */
+ apply_delay(prepare_data.prepare_time);
+
/* Replay all the spooled operations. */
apply_spooled_messages(prepare_data.xid, prepare_data.prepare_lsn);
@@ -1481,6 +1553,19 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "received commit for streamed transaction %u", xid);
+ /*
+ * Should we delay the current transaction?
+ *
+ * Although the delay is applied in BEGIN messages, streamed transactions
+ * apply the delay in a STREAM COMMIT message. That's ok because no changes
+ * have been applied yet (apply_spooled_messages() will do it).
+ * The STREAM START message would be a natural choice for this delay but
+ * there is no commit time yet (it will be available when the in-progress
+ * transaction finishes), hence, it was not possible to apply a delay at
+ * that time.
+ */
+ apply_delay(commit_data.committime);
+
apply_spooled_messages(xid, commit_data.commit_lsn);
apply_handle_commit_internal(&commit_data);
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index 49cdb290ac..89f57f7c33 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -2411,6 +2411,38 @@ interval_cmp_internal(const Interval *interval1, const Interval *interval2)
return int128_compare(span1, span2);
}
+/*
+ * Given an Interval returns the number of milliseconds.
+ */
+int64
+interval_to_ms(const Interval *interval)
+{
+ int64 days;
+ int64 ms;
+ int64 result;
+
+ days = interval->month * INT64CONST(30);
+ days += interval->day;
+
+ /*
+ * The following operations use these special functions to detect overflow.
+ * Number of ms per informed days.
+ */
+ if (pg_mul_s64_overflow(days, MSECS_PER_DAY, &result))
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+
+ /* adds portion time (in ms) to the previous result. */
+ ms = interval->time / INT64CONST(1000);
+ if (pg_add_s64_overflow(result, ms, &result))
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+
+ return result;
+}
+
Datum
interval_eq(PG_FUNCTION_ARGS)
{
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index da6605175a..5ef61f3de1 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4441,6 +4441,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subapplydelay;
int i,
ntups;
@@ -4493,9 +4494,15 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ {
+ appendPQExpBufferStr(query, " s.suborigin,\n");
+ appendPQExpBufferStr(query, " s.subapplydelay\n");
+ }
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ {
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBufferStr(query, " 0 AS subapplydelay\n");
+ }
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4523,6 +4530,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subapplydelay = PQfnumber(res, "subapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4553,6 +4561,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subapplydelay =
+ strtoi64(PQgetvalue(res, i, i_subapplydelay), NULL, 10);
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4632,6 +4642,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '" INT64_FORMAT " ms'", subinfo->subapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 69ee939d44..91b73e10d2 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int64 subapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 327a69487b..0be1d44e81 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6469,7 +6469,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6511,10 +6511,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Apply delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index f265e043e9..42070f7240 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1873,7 +1873,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "origin", "min_apply_delay", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3153,7 +3153,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "origin", "min_apply_delay", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index c9a3026b28..3221072fe8 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -70,6 +70,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
XLogRecPtr subskiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 subapplydelay; /* Replication apply delay */
+
NameData subname; /* Name of the subscription */
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
@@ -119,6 +121,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 applydelay; /* Replication apply delay */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
bool enabled; /* Indicates if the subscription is enabled */
diff --git a/src/include/datatype/timestamp.h b/src/include/datatype/timestamp.h
index d155f1b03b..d5bbfad1c4 100644
--- a/src/include/datatype/timestamp.h
+++ b/src/include/datatype/timestamp.h
@@ -127,6 +127,8 @@ struct pg_itm_in
#define SECS_PER_MINUTE 60
#define MINS_PER_HOUR 60
+#define MSECS_PER_DAY INT64CONST(86400000)
+
#define USECS_PER_DAY INT64CONST(86400000000)
#define USECS_PER_HOUR INT64CONST(3600000000)
#define USECS_PER_MINUTE INT64CONST(60000000)
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index edf3a97318..91709035da 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -78,6 +78,8 @@ extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern int64 interval_to_ms(const Interval *interval);
+
extern TimestampTz time_t_to_timestamptz(pg_time_t tm);
extern pg_time_t timestamptz_to_time_t(TimestampTz t);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index ef0ebf96b9..135d555707 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -77,18 +77,18 @@ ERROR: unrecognized origin value: "foo"
CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (slot_name = NONE, connect = false, origin = none);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -98,10 +98,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -118,10 +118,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -130,10 +130,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -165,10 +165,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -201,19 +201,19 @@ ERROR: binary requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, binary = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -224,19 +224,19 @@ ERROR: streaming requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -251,10 +251,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more then once
@@ -269,10 +269,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -306,10 +306,10 @@ ERROR: two_phase requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -318,10 +318,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -330,10 +330,10 @@ DROP SUBSCRIPTION regress_testsub;
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -345,18 +345,47 @@ ERROR: disable_on_error requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, disable_on_error = false);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid input syntax for type interval: "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: option "min_apply_delay" must not be negative
+-- success -- 123 s
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 123000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- success -- interval is converted into ms and stored as integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = '4h 27min 35s');
+WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 16055000 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 4425fafc46..fba4223aa8 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -264,6 +264,26 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- success -- 123 s
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
+-- success -- interval is converted into ms and stored as integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = '4h 27min 35s');
+
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/t/032_apply_delay.pl b/src/test/subscription/t/032_apply_delay.pl
new file mode 100644
index 0000000000..fcdab4a8b0
--- /dev/null
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,130 @@
+
+# Copyright (c) 2022, PostgreSQL Global Development Group
+
+# Test replication apply delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf('postgresql.conf',
+ "log_min_messages = debug2");
+$node_subscriber->start;
+
+# Create some preexisting content on publisher.
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b varchar)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (1, 'foo'), (2, 'bar')");
+
+# Setup structure on subscriber.
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b text, c timestamptz DEFAULT now(), d bigint DEFAULT 999)"
+);
+
+# Setup logical replication.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE test_tab");
+
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = on, min_apply_delay = '2s')"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+# Check log starting now for logical replication apply delay.
+my $log_location = -s $node_subscriber->logfile;
+
+# Wait for initial table sync to finish.
+my $synced_query =
+ "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');";
+$node_subscriber->poll_query_until('postgres', $synced_query)
+ or die "Timed out while waiting for subscriber to synchronize data";
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(2|1|2), 'check initial data was copied to subscriber');
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (3, 'baz')");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(3|1|3), 'check if the new row was applied to subscriber');
+
+check_apply_delay_log("logical replication apply delay");
+
+# Test streamed transaction.
+# Insert, update and delete enough rows to exceed 64kB limit.
+$node_publisher->safe_psql(
+ 'postgres', q{
+BEGIN;
+INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(4, 5000) s(i);
+UPDATE test_tab SET b = md5(b) WHERE mod(a,2) = 0;
+DELETE FROM test_tab WHERE mod(a,3) = 0;
+COMMIT;
+});
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(3334|1|5000), 'check if the new rows were applied to subscriber');
+
+check_apply_delay_log("logical replication apply delay");
+
+# Test ALTER SUBSCRIPTION. Delay 86460 seconds (1 day 1 minute).
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86460)"
+);
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (0, 'foobar')");
+
+# Disable subscription. The worker should exit immediately.
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub DISABLE"
+);
+
+# Wait until worker dies.
+my $sub_query =
+ "SELECT count(1) = 0 FROM pg_stat_subscription WHERE subname = 'tap_sub' AND pid IS NOT NULL;";
+$node_subscriber->poll_query_until('postgres', $sub_query)
+ or die "Timed out while waiting for subscriber to die";
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
+
+sub check_apply_delay_log
+{
+ my $message = shift;
+ my $old_log_location = $log_location;
+
+ $log_location = $node_subscriber->wait_for_log(qr/$message/, $log_location);
+
+ cmp_ok($log_location, '>', $old_log_location,
+ "logfile contains triggered logical replication apply delay"
+ );
+}
--
2.30.2
On Tue, Jul 5, 2022, at 9:29 AM, Amit Kapila wrote:
I wonder why we don't apply the delay on commit/commit_prepared
records only similar to physical replication. See recoveryApplyDelay.
One more advantage would be that we then don't need to worry about
transactions that we are going to skip due to the SKIP feature for
subscribers.
I added an explanation at the top of apply_delay(). I haven't read the "parallel
apply" patch yet. I'll do so soon to understand how the current design for
streamed transactions conflicts with the parallel apply patch.
+ * It applies the delay for the next transaction but before starting the
+ * transaction. The main reason for this design is to avoid a long-running
+ * transaction (which can cause some operational challenges) if the user sets a
+ * high value for the delay. This design is different from the physical
+ * replication (that applies the delay at commit time) mainly because write
+ * operations may allow some issues (such as bloat and locks) that can be
+ * minimized if it does not keep the transaction open for such a long time.
+ */
+static void
+apply_delay(TimestampTz ts)
Regarding the skip transaction feature, we could certainly skip the
transactions combined with the apply delay. However, it introduces complexity
for a rare use case IMO. Besides that, the skip transaction code path is fast,
hence, it is very unlikely that the current patch will cause problems for
the skip transaction feature. (Remember that the main goal for this feature is
to provide an old state of the database.)
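To make the timing in the apply_delay() comment above concrete, here is a minimal standalone sketch of how the remaining wait could be computed before a transaction starts. It is not the patch's code: the type ts_usec and the function remaining_apply_delay_ms() are hypothetical names, timestamps are assumed to be microseconds since some common epoch, and overflow handling is left out.

```c
#include <stdint.h>

/* Hypothetical stand-in for a timestamp: microseconds since a common epoch. */
typedef int64_t ts_usec;

/*
 * Return how many milliseconds are left to wait before applying a
 * transaction that committed at commit_ts on the publisher, given the
 * subscriber's current time and the configured minimum delay.
 * Returns 0 when no delay is configured or the deadline has passed.
 * Overflow of the deadline computation is not checked in this sketch.
 */
int64_t
remaining_apply_delay_ms(ts_usec commit_ts, ts_usec now,
                         int64_t min_apply_delay_ms)
{
    ts_usec delay_until;

    if (min_apply_delay_ms <= 0)
        return 0;

    /* Deadline: publisher commit time plus the configured delay. */
    delay_until = commit_ts + min_apply_delay_ms * 1000;

    if (now >= delay_until)
        return 0;

    /* Round up so a sub-millisecond remainder still counts as a wait. */
    return (delay_until - now + 999) / 1000;
}
```

For example, with a 2000 ms delay and a subscriber clock 1.5 s past the commit time, the function reports 500 ms left. If replication is already lagging by more than the delay, it reports 0, which matches the "no delay is added" wording in the documentation.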
One more thing that might be worth discussing is whether introducing a
new subscription parameter for this feature is a better idea or can we
use guc (either an existing or a new one). Users may want to set this
only for a particular subscription or set of subscriptions in which
case it is better to have this as a subscription level parameter.
OTOH, I was slightly worried that if this will be used for all
subscriptions on a subscriber then it will be burdensome for users.
That's a good point. Logical replication is per database and it is slightly
different from physical replication that is per cluster. In physical
replication, you have no choice but to have a GUC. It is very unlikely that
someone wants to delay all logical replicas. Therefore, the benefit of having a
GUC is quite small.
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Mon, Aug 1, 2022 at 6:46 PM Euler Taveira <euler@eulerto.com> wrote:
On Tue, Jul 5, 2022, at 9:29 AM, Amit Kapila wrote:
I wonder why we don't apply the delay on commit/commit_prepared
records only similar to physical replication. See recoveryApplyDelay.
One more advantage would be that we then don't need to worry about
transactions that we are going to skip due to the SKIP feature for
subscribers.
I added an explanation at the top of apply_delay(). I haven't read the "parallel
apply" patch yet. I'll do so soon to understand how the current design for
streamed transactions conflicts with the parallel apply patch.
+ * It applies the delay for the next transaction but before starting the
+ * transaction. The main reason for this design is to avoid a long-running
+ * transaction (which can cause some operational challenges) if the user sets a
+ * high value for the delay. This design is different from the physical
+ * replication (that applies the delay at commit time) mainly because write
+ * operations may allow some issues (such as bloat and locks) that can be
+ * minimized if it does not keep the transaction open for such a long time.
+ */
Your explanation makes sense to me. The other point to consider is
that there can be cases where we do not apply any operation for a
transaction because it is empty (we don't yet skip empty
xacts for prepared transactions). So, won't it be better to apply the
delay just before we apply the first change for a transaction? Do we
want to apply the delay during table sync, given that we sometimes need to
enter the apply phase while doing table sync?
One more thing that might be worth discussing is whether introducing a
new subscription parameter for this feature is a better idea or can we
use guc (either an existing or a new one). Users may want to set this
only for a particular subscription or set of subscriptions in which
case it is better to have this as a subscription level parameter.
OTOH, I was slightly worried that if this will be used for all
subscriptions on a subscriber then it will be burdensome for users.
That's a good point. Logical replication is per database and it is slightly
different from physical replication that is per cluster. In physical
replication, you have no choice but to have a GUC. It is very unlikely that
someone wants to delay all logical replicas. Therefore, the benefit of having a
GUC is quite small.
Fair enough.
--
With Regards,
Amit Kapila.
On Wed, Jul 13, 2022, at 2:34 PM, Melih Mutlu wrote:
[Sorry for the delay...]
22. src/test/subscription/t/032_apply_delay.pl
I received the following error when trying to run these 'subscription' tests:
t/032_apply_delay.pl ............... No such class log_location at
t/032_apply_delay.pl line 49, near "my log_location"
syntax error at t/032_apply_delay.pl line 49, near "my log_location ="
I'm having these errors too. Seems like some declarations are missing.
Fixed in v5.
+ specified amount of time. If this value is specified without units,
+ it is taken as milliseconds. The default is zero, adding no delay.
+ </para>
I'm also having an issue when I give min_apply_delay parameter without units.
I expect that if I set min_apply_delay to 5000 (without any unit), it will be interpreted as 5000 ms.
Good catch. I fixed it in v5.
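Whatever unit a bare number ends up meaning, the patch stores the delay in milliseconds by converting the parsed interval with interval_to_ms(). The following is a standalone sketch of that conversion, not the patch itself: the struct interval_sketch and the function interval_sketch_to_ms() are hypothetical stand-ins for PostgreSQL's Interval and interval_to_ms(), months are assumed to be 30 days as in the patch, and the GCC/Clang overflow builtins replace pg_mul_s64_overflow() and pg_add_s64_overflow().

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for an interval value: a time part in
 * microseconds plus separate day and month counts.
 */
typedef struct
{
    int64_t time_usec;
    int32_t day;
    int32_t month;
} interval_sketch;

#define SKETCH_DAYS_PER_MONTH 30
#define SKETCH_MSECS_PER_DAY  INT64_C(86400000)

/*
 * Convert an interval to milliseconds, assuming 30-day months.
 * Returns false if the 64-bit arithmetic overflows, true otherwise.
 * Uses GCC/Clang builtins for the overflow checks.
 */
bool
interval_sketch_to_ms(const interval_sketch *iv, int64_t *result_ms)
{
    int64_t days = (int64_t) iv->month * SKETCH_DAYS_PER_MONTH + iv->day;
    int64_t ms;

    /* Milliseconds contributed by the day and month parts. */
    if (__builtin_mul_overflow(days, SKETCH_MSECS_PER_DAY, &ms))
        return false;

    /* Add the time part, truncated from microseconds to milliseconds. */
    if (__builtin_add_overflow(ms, iv->time_usec / 1000, &ms))
        return false;

    *result_ms = ms;
    return true;
}
```

Under these assumptions, an interval of '4h' converts to 14400000 ms, and one day plus one minute converts to 86460000 ms.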
Lastly, I have a question about this delay during tablesync.
It's stated here that apply delays are not for initial tablesync.
<para>
+ The delay occurs only on WAL records for transaction begins and after
+ the initial table synchronization. It is possible that the
+ replication delay between publisher and subscriber exceeds the value
+ of this parameter, in which case no delay is added. Note that the
+ delay is calculated between the WAL time stamp as written on
+ publisher and the current time on the subscriber. Delays in logical
+ decoding and in transfer the transaction may reduce the actual wait
+ time. If the system clocks on publisher and subscriber are not
+ synchronized, this may lead to apply changes earlier than expected.
+ This is not a major issue because a typical setting of this parameter
+ are much larger than typical time deviations between servers.
+ </para>
There might be a case where tablesync workers are in SYNCWAIT state and waiting for apply worker to tell them to CATCHUP.
And if the apply worker is waiting in the apply_delay function, the tablesync workers will be stuck in SYNCWAIT state, which might delay tablesync by at least min_apply_delay.
Is it something we would want? What do you think?
Good catch. That's an oversight. It should wait for the initial table
synchronization before starting to apply the delay. The main reason is the
current logical replication worker design. It only closes the tablesync workers
after the catchup phase. As you noticed, we cannot impose the delay as soon as
the COPY finishes because it would take a long time to finish, possibly due to a
lack of workers. Instead, let's wait for the READY state for all tables and then
apply the delay. I added an explanation for it.
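The rule described here (no delay until every table is READY) can be sketched as a small standalone check. This is only an illustration: the enum table_sync_state and both functions are hypothetical, and the state list is a simplified version of the states mentioned in this thread rather than PostgreSQL's actual definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified per-table synchronization states. */
typedef enum
{
    TABLE_INIT,
    TABLE_DATASYNC,
    TABLE_SYNCWAIT,
    TABLE_CATCHUP,
    TABLE_SYNCDONE,
    TABLE_READY
} table_sync_state;

/* True when every table of the subscription has reached READY. */
bool
all_tables_ready(const table_sync_state *states, size_t ntables)
{
    for (size_t i = 0; i < ntables; i++)
    {
        if (states[i] != TABLE_READY)
            return false;
    }
    return true;
}

/*
 * Decide whether the apply worker should delay the next transaction:
 * only if a positive delay is configured and no table is still syncing.
 */
bool
should_apply_delay(int64_t min_apply_delay_ms,
                   const table_sync_state *states, size_t ntables)
{
    if (min_apply_delay_ms <= 0)
        return false;

    return all_tables_ready(states, ntables);
}
```

With one table still in SYNCWAIT, should_apply_delay() returns false, so the apply worker would not sleep and could move the tablesync worker on to CATCHUP without waiting out the delay.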
I also modified the test a bit to use the new function
wait_for_subscription_sync introduced in the commit
0c20dd33db1607d6a85ffce24238c1e55e384b49.
I attached a v6.
--
Euler Taveira
EDB https://www.enterprisedb.com/
Attachments:
v6-0001-Time-delayed-logical-replication-subscriber.patch (text/x-patch)
From 465a6d5aa491855e2925da24785103cac0c520a2 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Sat, 6 Nov 2021 11:31:10 -0300
Subject: [PATCH v6] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
If the subscriber sets min_apply_delay parameter, the logical
replication worker will delay the transaction commit for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs only on WAL records for transaction begins. The main
reason is to avoid keeping a transaction open for a long time. Regular
and prepared transactions are covered. Streamed transactions are also
covered.
Author: Euler Taveira
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 10 ++
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 43 +++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 48 +++++-
src/backend/replication/logical/worker.c | 100 +++++++++++++
src/backend/utils/adt/timestamp.c | 32 ++++
src/bin/pg_dump/pg_dump.c | 17 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/datatype/timestamp.h | 2 +
src/include/utils/timestamp.h | 2 +
src/test/regress/expected/subscription.out | 165 ++++++++++++---------
src/test/regress/sql/subscription.sql | 20 +++
src/test/subscription/t/032_apply_delay.pl | 129 ++++++++++++++++
18 files changed, 515 insertions(+), 83 deletions(-)
create mode 100644 src/test/subscription/t/032_apply_delay.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index cd2cc37aeb..efc745a9ea 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7833,6 +7833,16 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subapplydelay</structfield> <type>int8</type>
+ </para>
+ <para>
+ Delay the application of changes by a specified amount of time. The
+ unit is in milliseconds.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 64efc21f53..8901e1361c 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -208,8 +208,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 7390c715bc..1fc8e7474a 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -317,7 +317,36 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. As with
+ the physical replication feature
+ (<xref linkend="guc-recovery-min-apply-delay"/>), it can be useful to
+ have a time-delayed logical replica. This parameter allows you to
+ delay the application of changes by a specified amount of time. If
+ this value is specified without units, it is taken as seconds. The
+ default is zero, adding no delay.
+ </para>
+ <para>
+ The delay occurs only on WAL records for transaction begins and after
+ the initial table synchronization. It is possible that the
+ replication delay between publisher and subscriber exceeds the value
+ of this parameter, in which case no delay is added. Note that the
+ delay is calculated between the WAL time stamp as written on
+ publisher and the current time on the subscriber. Time spent in logical
+ decoding and in transferring the transaction may reduce the actual wait
+ time. If the system clocks on publisher and subscriber are not
+ synchronized, this may lead to applying changes earlier than expected.
+ This is not a major issue because a typical setting of this parameter
+ is much larger than typical time deviations between servers.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -413,6 +442,18 @@ CREATE SUBSCRIPTION mysub
PUBLICATION insert_only
WITH (enabled = false);
</programlisting></para>
+
+ <para>
+ Create a subscription to a remote server that replicates tables in
+ the <literal>baz</literal> publication and starts replicating immediately on
+ commit. Pre-existing data is not copied. The application of changes is
+ delayed by 4 hours.
+<programlisting>
+CREATE SUBSCRIPTION foo
+ CONNECTION 'host=192.168.1.50 port=5432 user=foo dbname=foodb'
+ PUBLICATION baz
+ WITH (copy_data = false, min_apply_delay = '4h');
+</programlisting></para>
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a506fc3ec8..d93e374ef4 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -64,6 +64,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->oid = subid;
sub->dbid = subform->subdbid;
sub->skiplsn = subform->subskiplsn;
+ sub->applydelay = subform->subapplydelay;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
sub->enabled = subform->subenabled;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index f369b1fc14..50175323b9 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1297,9 +1297,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index f73dfb6067..aaeda7e8a9 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -47,6 +47,7 @@
#include "utils/memutils.h"
#include "utils/pg_lsn.h"
#include "utils/syscache.h"
+#include "utils/timestamp.h"
/*
* Options that can be specified by the user in CREATE/ALTER SUBSCRIPTION
@@ -65,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -89,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int64 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -141,6 +144,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -319,12 +324,35 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ char *val;
+ Interval *interval;
+
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ val = defGetString(defel);
+
+ interval = DatumGetIntervalP(DirectFunctionCall3(interval_in,
+ CStringGetDatum(val),
+ ObjectIdGetDatum(InvalidOid),
+ Int32GetDatum(-1)));
+ opts->min_apply_delay = interval_to_ms(interval);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("unrecognized subscription parameter: \"%s\"", defel->defname)));
}
+ if (opts->min_apply_delay < 0 && IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("option \"%s\" must not be negative", "min_apply_delay"));
+
/*
* We've been explicitly asked to not connect, that requires some
* additional processing.
@@ -557,7 +585,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -622,6 +651,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subapplydelay - 1] = Int64GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1044,7 +1074,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1100,6 +1130,12 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_subdisableonerr - 1]
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ values[Anum_pg_subscription_subapplydelay - 1] =
+ Int64GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subapplydelay - 1] = true;
+ }
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
@@ -1130,6 +1166,14 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (opts.enabled)
ApplyLauncherWakeupAtCommit();
+ /*
+ * If this subscription has been disabled and it has an apply
+ * delay set, wake up the logical replication worker to finish
+ * it as soon as possible.
+ */
+ if (!opts.enabled && sub->applydelay > 0)
+ logicalrep_worker_wakeup(sub->oid, InvalidOid);
+
update_tuple = true;
break;
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 5f8c541763..18aece3495 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -324,6 +324,8 @@ static void maybe_reread_subscription(void);
static void DisableSubscriptionAndExit(void);
+static void apply_delay(TimestampTz ts);
+
/* prototype needed because of stream_commit */
static void apply_dispatch(StringInfo s);
@@ -803,6 +805,72 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When min_apply_delay parameter is set on subscriber, we wait long enough to
+ * make sure a transaction is applied at least that interval behind the
+ * publisher.
+ *
+ * It applies the delay for the next transaction but before starting the
+ * transaction. The main reason for this design is to avoid a long-running
+ * transaction (which can cause some operational challenges) if the user sets a
+ * high value for the delay. This design is different from the physical
+ * replication (that applies the delay at commit time) mainly because write
+ * operations may allow some issues (such as bloat and locks) that can be
+ * minimized if it does not keep the transaction open for such a long time.
+ *
+ * The min_apply_delay parameter will take effect only after all tables are in
+ * READY state.
+ */
+static void
+apply_delay(TimestampTz ts)
+{
+ TimestampTz delay_until = 0;
+
+ /* nothing to do if no delay set */
+ if (MySubscription->applydelay <= 0)
+ return;
+
+ /*
+ * Apply delay only after all tablesync workers have reached READY state. A
+ * tablesync worker is kept until it reaches READY state. If we allow the
+ * delay during the catchup phase, once we reach the limit of tablesync
+ * workers, it will impose a delay for each subsequent worker. It means it
+ * will take a long time to finish the initial table synchronization.
+ * Instead, the apply delay will be activated only after all tables are in
+ * READY state.
+ */
+ if (!AllTablesyncsReady())
+ return;
+
+ /* set apply delay */
+ delay_until = TimestampTzPlusMilliseconds(TimestampTzGetDatum(ts),
+ MySubscription->applydelay);
+
+ while (true)
+ {
+ long diffms;
+
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delay_until);
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "logical replication apply delay: %ld ms", diffms);
+
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+ }
+}
+
/*
* Handle BEGIN message.
*/
@@ -814,6 +882,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ apply_delay(begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -868,6 +939,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ apply_delay(begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1090,6 +1164,19 @@ apply_handle_stream_prepare(StringInfo s)
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
+ /*
+ * Should we delay the current prepared transaction?
+ *
+ * Although the delay is applied in BEGIN PREPARE messages, streamed
+ * prepared transactions apply the delay in a STREAM PREPARE message.
+ * That's ok because no changes have been applied yet
+ * (apply_spooled_messages() will do it).
+ * The STREAM START message does not contain a prepare time (it will be
+ * available when the in-progress prepared transaction finishes), hence, it
+ * was not possible to apply a delay at that time.
+ */
+ apply_delay(prepare_data.prepare_time);
+
/* Replay all the spooled operations. */
apply_spooled_messages(prepare_data.xid, prepare_data.prepare_lsn);
@@ -1481,6 +1568,19 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "received commit for streamed transaction %u", xid);
+ /*
+ * Should we delay the current transaction?
+ *
+ * Although the delay is applied in BEGIN messages, streamed transactions
+ * apply the delay in a STREAM COMMIT message. That's ok because no changes
+ * have been applied yet (apply_spooled_messages() will do it).
+ * The STREAM START message would be a natural choice for this delay but
+ * there is no commit time yet (it will be available when the in-progress
+ * transaction finishes), hence, it was not possible to apply a delay at
+ * that time.
+ */
+ apply_delay(commit_data.committime);
+
apply_spooled_messages(xid, commit_data.commit_lsn);
apply_handle_commit_internal(&commit_data);
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index 49cdb290ac..89f57f7c33 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -2411,6 +2411,38 @@ interval_cmp_internal(const Interval *interval1, const Interval *interval2)
return int128_compare(span1, span2);
}
+/*
+ * Given an Interval, return the number of milliseconds it represents.
+ */
+int64
+interval_to_ms(const Interval *interval)
+{
+ int64 days;
+ int64 ms;
+ int64 result;
+
+ days = interval->month * INT64CONST(30);
+ days += interval->day;
+
+ /*
+ * The following operations use these special functions to detect overflow.
+ * First, compute the number of milliseconds in the given number of days.
+ */
+ if (pg_mul_s64_overflow(days, MSECS_PER_DAY, &result))
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+
+ /* Add the time portion (in ms) to the previous result. */
+ ms = interval->time / INT64CONST(1000);
+ if (pg_add_s64_overflow(result, ms, &result))
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+
+ return result;
+}
+
Datum
interval_eq(PG_FUNCTION_ARGS)
{
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index da6605175a..5ef61f3de1 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4441,6 +4441,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subapplydelay;
int i,
ntups;
@@ -4493,9 +4494,15 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ {
+ appendPQExpBufferStr(query, " s.suborigin,\n");
+ appendPQExpBufferStr(query, " s.subapplydelay\n");
+ }
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ {
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBufferStr(query, " 0 AS subapplydelay\n");
+ }
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4523,6 +4530,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subapplydelay = PQfnumber(res, "subapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4553,6 +4561,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subapplydelay =
+ strtoi64(PQgetvalue(res, i, i_subapplydelay), NULL, 10);
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4632,6 +4642,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '" INT64_FORMAT " ms'", subinfo->subapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 69ee939d44..91b73e10d2 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int64 subapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 327a69487b..0be1d44e81 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6469,7 +6469,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6511,10 +6511,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Apply delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index f265e043e9..42070f7240 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1873,7 +1873,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "origin", "min_apply_delay", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3153,7 +3153,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "origin", "min_apply_delay", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 7b98714f30..3894b97aca 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -70,6 +70,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
XLogRecPtr subskiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 subapplydelay; /* Replication apply delay */
+
NameData subname; /* Name of the subscription */
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
@@ -119,6 +121,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 applydelay; /* Replication apply delay */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
bool enabled; /* Indicates if the subscription is enabled */
diff --git a/src/include/datatype/timestamp.h b/src/include/datatype/timestamp.h
index d155f1b03b..d5bbfad1c4 100644
--- a/src/include/datatype/timestamp.h
+++ b/src/include/datatype/timestamp.h
@@ -127,6 +127,8 @@ struct pg_itm_in
#define SECS_PER_MINUTE 60
#define MINS_PER_HOUR 60
+#define MSECS_PER_DAY INT64CONST(86400000)
+
#define USECS_PER_DAY INT64CONST(86400000000)
#define USECS_PER_HOUR INT64CONST(3600000000)
#define USECS_PER_MINUTE INT64CONST(60000000)
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index edf3a97318..91709035da 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -78,6 +78,8 @@ extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern int64 interval_to_ms(const Interval *interval);
+
extern TimestampTz time_t_to_timestamptz(pg_time_t tm);
extern pg_time_t timestamptz_to_time_t(TimestampTz t);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index ef0ebf96b9..135d555707 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -77,18 +77,18 @@ ERROR: unrecognized origin value: "foo"
CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (slot_name = NONE, connect = false, origin = none);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -98,10 +98,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -118,10 +118,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -130,10 +130,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -165,10 +165,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -201,19 +201,19 @@ ERROR: binary requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, binary = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -224,19 +224,19 @@ ERROR: streaming requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -251,10 +251,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more then once
@@ -269,10 +269,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -306,10 +306,10 @@ ERROR: two_phase requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -318,10 +318,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -330,10 +330,10 @@ DROP SUBSCRIPTION regress_testsub;
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -345,18 +345,47 @@ ERROR: disable_on_error requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, disable_on_error = false);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid input syntax for type interval: "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: option "min_apply_delay" must not be negative
+-- success -- 123 s
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 123000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- success -- interval is converted into ms and stored as integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = '4h 27min 35s');
+WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 16055000 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 4425fafc46..fba4223aa8 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -264,6 +264,26 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- success -- 123 s
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
+-- success -- interval is converted into ms and stored as integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = '4h 27min 35s');
+
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/t/032_apply_delay.pl b/src/test/subscription/t/032_apply_delay.pl
new file mode 100644
index 0000000000..4f02223d4d
--- /dev/null
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,129 @@
+
+# Copyright (c) 2022, PostgreSQL Global Development Group
+
+# Test replication apply delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf('postgresql.conf',
+ "log_min_messages = debug2");
+$node_subscriber->start;
+
+# Create some preexisting content on publisher.
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b varchar)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (1, 'foo'), (2, 'bar')");
+
+# Setup structure on subscriber.
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b text, c timestamptz DEFAULT now(), d bigint DEFAULT 999)"
+);
+
+# Setup logical replication.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE test_tab");
+
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = on, min_apply_delay = '2s')"
+);
+
+# Wait for initial table sync to finish.
+$node_subscriber->wait_for_subscription_sync($node_publisher, $appname);
+
+# Check log starting now for logical replication apply delay.
+my $log_location = -s $node_subscriber->logfile;
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(2|1|2), 'check initial data was copied to subscriber');
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (3, 'baz')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (4, 'abc')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (5, 'def')");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(5|1|5), 'check if the new rows were applied to subscriber');
+
+check_apply_delay_log("logical replication apply delay");
+
+# Test streamed transaction.
+# Insert, update and delete enough rows to exceed 64kB limit.
+$node_publisher->safe_psql(
+ 'postgres', q{
+BEGIN;
+INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(6, 5000) s(i);
+UPDATE test_tab SET b = md5(b) WHERE mod(a,2) = 0;
+DELETE FROM test_tab WHERE mod(a,3) = 0;
+COMMIT;
+});
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(3334|1|5000), 'check if the new rows were applied to subscriber');
+
+check_apply_delay_log("logical replication apply delay");
+
+# Test ALTER SUBSCRIPTION. Delay 86460 seconds (1 day 1 minute).
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86460)"
+);
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (0, 'foobar')");
+
+# Disable subscription. worker should die immediately.
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub DISABLE"
+);
+
+# Wait until worker dies.
+my $sub_query =
+ "SELECT count(1) = 0 FROM pg_stat_subscription WHERE subname = 'tap_sub' AND pid IS NOT NULL;";
+$node_subscriber->poll_query_until('postgres', $sub_query)
+ or die "Timed out while waiting for subscriber to die";
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
+
+sub check_apply_delay_log
+{
+ my $message = shift;
+ my $old_log_location = $log_location;
+
+ $log_location = $node_subscriber->wait_for_log(qr/$message/, $log_location);
+
+ cmp_ok($log_location, '>', $old_log_location,
+ "logfile contains triggered logical replication apply delay"
+ );
+}
--
2.30.2
On Wed, Aug 3, 2022, at 10:27 AM, Amit Kapila wrote:
Your explanation makes sense to me. The other point to consider is
that there can be cases where we may not apply any operation for the
transaction because it is empty (we don't yet skip empty
xacts for prepared transactions). So, won't it be better to apply the
delay just before we apply the first change for a transaction? Do we
want to apply the delay during table sync as we sometimes do need to
enter apply phase while doing table sync?
I thought about the empty transactions but decided not to complicate the code,
mainly because skipping transactions is not a code path that will slow down
this feature. As explained in the documentation, there is no harm in delaying a
transaction for more than min_apply_delay; it just cannot be applied earlier.
That said, I decided to do nothing. I'm also not sure if it deserves a comment
or if this email is a sufficient explanation for this decision.
Regarding the table sync that was mentioned by Melih, I sent a new version (v6)
that fixed this oversight. The current logical replication worker design makes
it difficult to apply the delay in the catchup phase; tablesync workers are not
closed as soon as the COPY finishes (which means possibly running out of
workers sooner). After all tablesync workers have reached READY state, the
apply delay is activated. The documentation was correct; the code wasn't.
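To make the rule above concrete, here is a minimal sketch of the wait computation. It is not the patch code: remaining_apply_delay_ms() and its millisecond-integer arguments are hypothetical names chosen for illustration, and overflow of the addition is ignored for brevity. It only models the two points discussed: no delay is added until all tables are READY, and a transaction that is already late waits zero milliseconds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): number of milliseconds the
 * apply worker would still have to wait before applying a transaction.
 *
 * commit_ts_ms       - commit (or prepare) time stamp from the publisher
 * now_ms             - current time on the subscriber
 * min_apply_delay_ms - configured delay; 0 means "no delay"
 * all_tables_ready   - true once every tablesync worker reached READY
 */
int64_t
remaining_apply_delay_ms(int64_t commit_ts_ms, int64_t now_ms,
                         int64_t min_apply_delay_ms, bool all_tables_ready)
{
    int64_t remaining;

    /* Negative delays are assumed to be rejected at option parsing time. */
    assert(min_apply_delay_ms >= 0);

    /* No delay configured, or initial synchronization still running. */
    if (min_apply_delay_ms == 0 || !all_tables_ready)
        return 0;

    /* Time left until commit time stamp + delay; never negative. */
    remaining = commit_ts_ms + min_apply_delay_ms - now_ms;
    return remaining > 0 ? remaining : 0;
}
```

A caller would sleep for the returned value and recompute it after each wakeup, so a transaction may be applied later than min_apply_delay but never earlier (clock skew between the two servers aside).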
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Tuesday, August 9, 2022 6:47 AM Euler Taveira <euler@eulerto.com> wrote:
I attached a v6.
Hi, thank you for posting the updated patch.
Minor review comments for v6.
(1) commit message
"If the subscriber sets min_apply_delay parameter, ..."
I suggest we use subscription rather than subscriber, because
this parameter refers to and is used for one subscription.
My suggestion is
"If one subscription sets min_apply_delay parameter, ..."
If you agree, there are other places to apply this change.
(2) commit message
It might be better to write a note for committer
like "Bump catalog version" at the bottom of the commit message.
(3) unit alignment between recovery_min_apply_delay and min_apply_delay
The former interprets input number as milliseconds in case of no units,
while the latter takes it as seconds without units.
I feel it would be better to make them aligned.
(4) catalogs.sgml
+ Delay the application of changes by a specified amount of time. The
+ unit is in milliseconds.
As a column explanation, it'd be better to use a noun
in the first sentence to make this description aligned with
other places. My suggestion is
"Application delay of changes by ....".
(5) pg_subscription.c
A blank line is missing before the if statement.
The other cases in AlterSubscription have one.
@@ -1100,6 +1130,12 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_subdisableonerr - 1]
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
(6) tab-complete.c
The order of tab-complete parameters listed in the COMPLETE_WITH
should follow alphabetical order. "min_apply_delay" can come before "origin".
We can refer to d547f7c commit.
(7) 032_apply_delay.pl
Spaces are missing after the commas in the mod() calls.
UPDATE test_tab SET b = md5(b) WHERE mod(a,2) = 0;
DELETE FROM test_tab WHERE mod(a,3) = 0;
Best Regards,
Takamichi Osumi
On Wed, Aug 10, 2022, at 9:39 AM, osumi.takamichi@fujitsu.com wrote:
Minor review comments for v6.
Thanks for your review. I'm attaching v7.
"If the subscriber sets min_apply_delay parameter, ..."
I suggest we use subscription rather than subscriber, because
this parameter refers to and is used for one subscription.
My suggestion is
"If one subscription sets min_apply_delay parameter, ..."
If you agree, there are other places to apply this change.
I changed the terminology to subscription. I also checked other "subscriber"
occurrences but I don't think it should be changed. Some of them are used as
publisher/subscriber pair. If you think there is another sentence to consider,
point it out.
It might be better to write a note for committer
like "Bump catalog version" at the bottom of the commit message.
It is a committer task to bump the catalog number. IMO it is easy to notice
(using a git hook?) that it must bump it when we are modifying the catalog.
AFAICS there is no recommendation to add such a warning.
The former interprets input number as milliseconds in case of no units,
while the latter takes it as seconds without units.
I feel it would be better to make them aligned.
In a previous version I decided not to add code to attach a unit when there
isn't one. Instead, I changed the documentation to reflect what interval_in
uses (seconds as unit). On reflection, let's use ms as the default unit if the
user doesn't specify one.
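As a standalone illustration of that default-unit rule, here is a small sketch outside the backend. normalize_delay_unit() is a hypothetical name; the patch itself uses strspn() with psprintf() and then hands the string to interval_in. Unlike the patch, this sketch leaves an empty string untouched rather than turning it into "ms".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): copy "input" into "out",
 * appending "ms" when the input consists of digits only, so that a value
 * without a unit is read as milliseconds.
 *
 * Returns 0 on success, -1 if "out" is too small for the result.
 */
int
normalize_delay_unit(const char *input, char *out, size_t outlen)
{
    int n;

    assert(input != NULL && out != NULL);

    /* Digits only (and not empty): attach the default unit. */
    if (input[0] != '\0' && strspn(input, "0123456789") == strlen(input))
        n = snprintf(out, outlen, "%sms", input);
    else
        n = snprintf(out, outlen, "%s", input);

    /* snprintf returns the length it wanted to write; detect truncation. */
    return (n >= 0 && (size_t) n < outlen) ? 0 : -1;
}
```

With this rule, min_apply_delay = 500 and min_apply_delay = '500ms' mean the same thing, while '4h' is passed through unchanged.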
I fixed all the other suggestions too.
--
Euler Taveira
EDB https://www.enterprisedb.com/
Attachments:
v7-0001-Time-delayed-logical-replication-subscriber.patch (text/x-patch)
From a28987c8adb70d6932558f5e39f9dd4c55223a30 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Sat, 6 Nov 2021 11:31:10 -0300
Subject: [PATCH v7] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction commit for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs only on WAL records for transaction begins. The main
reason is to avoid keeping a transaction open for a long time. Regular
and prepared transactions are covered. Streamed transactions are also
covered.
Author: Euler Taveira
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 10 ++
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 43 +++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 59 +++++++-
src/backend/replication/logical/worker.c | 100 +++++++++++++
src/backend/utils/adt/timestamp.c | 32 ++++
src/bin/pg_dump/pg_dump.c | 17 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/datatype/timestamp.h | 2 +
src/include/utils/timestamp.h | 2 +
src/test/regress/expected/subscription.out | 165 ++++++++++++---------
src/test/regress/sql/subscription.sql | 20 +++
src/test/subscription/t/032_apply_delay.pl | 129 ++++++++++++++++
18 files changed, 526 insertions(+), 83 deletions(-)
create mode 100644 src/test/subscription/t/032_apply_delay.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index cd2cc37aeb..291ebdafad 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7833,6 +7833,16 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subapplydelay</structfield> <type>int8</type>
+ </para>
+ <para>
+ Application delay of changes by a specified amount of time. The
+ unit is in milliseconds.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 64efc21f53..8901e1361c 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -208,8 +208,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 7390c715bc..a794c07042 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -317,7 +317,36 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. As with
+ the physical replication feature
+ (<xref linkend="guc-recovery-min-apply-delay"/>), it can be useful to
+ have a time-delayed logical replica. This parameter allows you to
+ delay the application of changes by a specified amount of time. If
+ this value is specified without units, it is taken as milliseconds.
+ The default is zero, adding no delay.
+ </para>
+ <para>
+ The delay occurs only on WAL records for transaction begins and after
+ the initial table synchronization. It is possible that the
+ replication delay between publisher and subscriber exceeds the value
+ of this parameter, in which case no delay is added. Note that the
+ delay is calculated between the WAL time stamp as written on
+ publisher and the current time on the subscriber. Time spent in logical
+ decoding and in transferring the transaction may reduce the actual wait
+ time. If the system clocks on publisher and subscriber are not
+ synchronized, this may lead to applying changes earlier than expected.
+ This is not a major issue because typical settings of this parameter
+ are much larger than typical time deviations between servers.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -413,6 +442,18 @@ CREATE SUBSCRIPTION mysub
PUBLICATION insert_only
WITH (enabled = false);
</programlisting></para>
+
+ <para>
+ Create a subscription to a remote server that replicates tables in
+ the <literal>baz</literal> publication and starts replicating immediately on
+ commit. Pre-existing data is not copied. The application of changes is
+ delayed by 4 hours.
+<programlisting>
+CREATE SUBSCRIPTION foo
+ CONNECTION 'host=192.168.1.50 port=5432 user=foo dbname=foodb'
+ PUBLICATION baz
+ WITH (copy_data = false, min_apply_delay = '4h');
+</programlisting></para>
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a506fc3ec8..d93e374ef4 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -64,6 +64,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->oid = subid;
sub->dbid = subform->subdbid;
sub->skiplsn = subform->subskiplsn;
+ sub->applydelay = subform->subapplydelay;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
sub->enabled = subform->subenabled;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index f369b1fc14..50175323b9 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1297,9 +1297,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index f73dfb6067..3c0d186991 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -47,6 +47,7 @@
#include "utils/memutils.h"
#include "utils/pg_lsn.h"
#include "utils/syscache.h"
+#include "utils/timestamp.h"
/*
* Options that can be specified by the user in CREATE/ALTER SUBSCRIPTION
@@ -65,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -89,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int64 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -141,6 +144,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -319,12 +324,45 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ char *val, *tmp;
+ Interval *interval;
+
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ tmp = defGetString(defel);
+
+ /*
+ * If there is no unit, interval_in takes second as unit. This
+ * parameter expects millisecond as unit so add a unit (ms) if
+ * there isn't one.
+ */
+ if (strspn(tmp, "0123456789") == strlen(tmp))
+ val = psprintf("%sms", tmp);
+ else
+ val = tmp;
+
+ interval = DatumGetIntervalP(DirectFunctionCall3(interval_in,
+ CStringGetDatum(val),
+ ObjectIdGetDatum(InvalidOid),
+ Int32GetDatum(-1)));
+ opts->min_apply_delay = interval_to_ms(interval);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("unrecognized subscription parameter: \"%s\"", defel->defname)));
}
+ if (opts->min_apply_delay < 0 && IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("option \"%s\" must not be negative", "min_apply_delay"));
+
/*
* We've been explicitly asked to not connect, that requires some
* additional processing.
@@ -557,7 +595,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -622,6 +661,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subapplydelay - 1] = Int64GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1044,7 +1084,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1101,6 +1141,13 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ values[Anum_pg_subscription_subapplydelay - 1] =
+ Int64GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -1130,6 +1177,14 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (opts.enabled)
ApplyLauncherWakeupAtCommit();
+ /*
+ * If this subscription has been disabled and it has an apply
+ * delay set, wake up the logical replication worker to finish
+ * it as soon as possible.
+ */
+ if (!opts.enabled && sub->applydelay > 0)
+ logicalrep_worker_wakeup(sub->oid, InvalidOid);
+
update_tuple = true;
break;
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 5f8c541763..18aece3495 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -324,6 +324,8 @@ static void maybe_reread_subscription(void);
static void DisableSubscriptionAndExit(void);
+static void apply_delay(TimestampTz ts);
+
/* prototype needed because of stream_commit */
static void apply_dispatch(StringInfo s);
@@ -803,6 +805,72 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When min_apply_delay parameter is set on subscriber, we wait long enough to
+ * make sure a transaction is applied at least that interval behind the
+ * publisher.
+ *
+ * It applies the delay for the next transaction but before starting the
+ * transaction. The main reason for this design is to avoid a long-running
+ * transaction (which can cause some operational challenges) if the user sets a
+ * high value for the delay. This design is different from the physical
+ * replication (that applies the delay at commit time) mainly because write
+ * operations may allow some issues (such as bloat and locks) that can be
+ * minimized if it does not keep the transaction open for such a long time.
+ *
+ * The min_apply_delay parameter will take effect only after all tables are in
+ * READY state.
+ */
+static void
+apply_delay(TimestampTz ts)
+{
+ TimestampTz delay_until = 0;
+
+ /* nothing to do if no delay set */
+ if (MySubscription->applydelay <= 0)
+ return;
+
+ /*
+ * Apply delay only after all tablesync workers have reached READY state. A
+ * tablesync worker is kept until it reaches READY state. If we allow the
+ * delay during the catchup phase, once we reach the limit of tablesync
+ * workers, it will impose a delay for each subsequent worker. It means it
+ * will take a long time to finish the initial table synchronization.
+ * Instead, the apply delay will be activated only after all tables are in
+ * READY state.
+ */
+ if (!AllTablesyncsReady())
+ return;
+
+ /* set apply delay */
+ delay_until = TimestampTzPlusMilliseconds(TimestampTzGetDatum(ts),
+ MySubscription->applydelay);
+
+ while (true)
+ {
+ long diffms;
+
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delay_until);
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "logical replication apply delay: %ld ms", diffms);
+
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+ }
+}
+
/*
* Handle BEGIN message.
*/
@@ -814,6 +882,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ apply_delay(begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -868,6 +939,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ apply_delay(begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1090,6 +1164,19 @@ apply_handle_stream_prepare(StringInfo s)
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
+ /*
+ * Should we delay the current prepared transaction?
+ *
+ * Although the delay is applied in BEGIN PREPARE messages, streamed
+ * prepared transactions apply the delay in a STREAM PREPARE message.
+ * That's ok because no changes have been applied yet
+ * (apply_spooled_messages() will do it).
+ * The STREAM START message does not contain a prepare time (it will be
+ * available when the in-progress prepared transaction finishes), hence, it
+ * was not possible to apply a delay at that time.
+ */
+ apply_delay(prepare_data.prepare_time);
+
/* Replay all the spooled operations. */
apply_spooled_messages(prepare_data.xid, prepare_data.prepare_lsn);
@@ -1481,6 +1568,19 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "received commit for streamed transaction %u", xid);
+ /*
+ * Should we delay the current transaction?
+ *
+ * Although the delay is applied in BEGIN messages, streamed transactions
+ * apply the delay in a STREAM COMMIT message. That's ok because no changes
+ * have been applied yet (apply_spooled_messages() will do it).
+ * The STREAM START message would be a natural choice for this delay but
+ * there is no commit time yet (it will be available when the in-progress
+ * transaction finishes), hence, it was not possible to apply a delay at
+ * that time.
+ */
+ apply_delay(commit_data.committime);
+
apply_spooled_messages(xid, commit_data.commit_lsn);
apply_handle_commit_internal(&commit_data);
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index 49cdb290ac..89f57f7c33 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -2411,6 +2411,38 @@ interval_cmp_internal(const Interval *interval1, const Interval *interval2)
return int128_compare(span1, span2);
}
+/*
+ * Given an Interval returns the number of milliseconds.
+ */
+int64
+interval_to_ms(const Interval *interval)
+{
+ int64 days;
+ int64 ms;
+ int64 result;
+
+ days = interval->month * INT64CONST(30);
+ days += interval->day;
+
+ /*
+ * The following operations use overflow-checking helpers. First, convert
+ * the number of days to milliseconds.
+ */
+ if (pg_mul_s64_overflow(days, MSECS_PER_DAY, &result))
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+
+ /* Add the time portion (converted to ms) to the previous result. */
+ ms = interval->time / INT64CONST(1000);
+ if (pg_add_s64_overflow(result, ms, &result))
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+
+ return result;
+}
+
Datum
interval_eq(PG_FUNCTION_ARGS)
{
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index da6605175a..5ef61f3de1 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4441,6 +4441,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subapplydelay;
int i,
ntups;
@@ -4493,9 +4494,15 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ {
+ appendPQExpBufferStr(query, " s.suborigin,\n");
+ appendPQExpBufferStr(query, " s.subapplydelay\n");
+ }
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ {
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBufferStr(query, " 0 AS subapplydelay\n");
+ }
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4523,6 +4530,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subapplydelay = PQfnumber(res, "subapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4553,6 +4561,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subapplydelay =
+ strtoi64(PQgetvalue(res, i, i_subapplydelay), NULL, 10);
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4632,6 +4642,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '" INT64_FORMAT " ms'", subinfo->subapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 69ee939d44..91b73e10d2 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int64 subapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 327a69487b..0be1d44e81 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6469,7 +6469,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6511,10 +6511,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Apply delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index f265e043e9..d577abfc29 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1873,7 +1873,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3153,7 +3153,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 7b98714f30..3894b97aca 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -70,6 +70,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
XLogRecPtr subskiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 subapplydelay; /* Replication apply delay */
+
NameData subname; /* Name of the subscription */
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
@@ -119,6 +121,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 applydelay; /* Replication apply delay */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
bool enabled; /* Indicates if the subscription is enabled */
diff --git a/src/include/datatype/timestamp.h b/src/include/datatype/timestamp.h
index d155f1b03b..d5bbfad1c4 100644
--- a/src/include/datatype/timestamp.h
+++ b/src/include/datatype/timestamp.h
@@ -127,6 +127,8 @@ struct pg_itm_in
#define SECS_PER_MINUTE 60
#define MINS_PER_HOUR 60
+#define MSECS_PER_DAY INT64CONST(86400000)
+
#define USECS_PER_DAY INT64CONST(86400000000)
#define USECS_PER_HOUR INT64CONST(3600000000)
#define USECS_PER_MINUTE INT64CONST(60000000)
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index edf3a97318..91709035da 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -78,6 +78,8 @@ extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern int64 interval_to_ms(const Interval *interval);
+
extern TimestampTz time_t_to_timestamptz(pg_time_t tm);
extern pg_time_t timestamptz_to_time_t(TimestampTz t);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index ef0ebf96b9..056b6e36f8 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -77,18 +77,18 @@ ERROR: unrecognized origin value: "foo"
CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (slot_name = NONE, connect = false, origin = none);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -98,10 +98,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -118,10 +118,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -130,10 +130,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -165,10 +165,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -201,19 +201,19 @@ ERROR: binary requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, binary = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -224,19 +224,19 @@ ERROR: streaming requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -251,10 +251,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more then once
@@ -269,10 +269,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -306,10 +306,10 @@ ERROR: two_phase requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -318,10 +318,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -330,10 +330,10 @@ DROP SUBSCRIPTION regress_testsub;
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -345,18 +345,47 @@ ERROR: disable_on_error requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, disable_on_error = false);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid input syntax for type interval: "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: option "min_apply_delay" must not be negative
+-- success -- 123 ms
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- success -- interval is converted into ms and stored as integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = '4h 27min 35s');
+WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 16055000 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 4425fafc46..885f8bb6fa 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -264,6 +264,26 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- success -- 123 ms
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
+-- success -- interval is converted into ms and stored as integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = '4h 27min 35s');
+
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/t/032_apply_delay.pl b/src/test/subscription/t/032_apply_delay.pl
new file mode 100644
index 0000000000..3d9e0b05f9
--- /dev/null
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,129 @@
+
+# Copyright (c) 2022, PostgreSQL Global Development Group
+
+# Test replication apply delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf('postgresql.conf',
+ "log_min_messages = debug2");
+$node_subscriber->start;
+
+# Create some preexisting content on publisher.
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b varchar)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (1, 'foo'), (2, 'bar')");
+
+# Setup structure on subscriber.
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b text, c timestamptz DEFAULT now(), d bigint DEFAULT 999)"
+);
+
+# Setup logical replication.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE test_tab");
+
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = on, min_apply_delay = '2s')"
+);
+
+# Wait for initial table sync to finish.
+$node_subscriber->wait_for_subscription_sync($node_publisher, $appname);
+
+# Check log starting now for logical replication apply delay.
+my $log_location = -s $node_subscriber->logfile;
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(2|1|2), 'check initial data was copied to subscriber');
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (3, 'baz')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (4, 'abc')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (5, 'def')");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(5|1|5), 'check if the new rows were applied to subscriber');
+
+check_apply_delay_log("logical replication apply delay");
+
+# Test streamed transaction.
+# Insert, update and delete enough rows to exceed 64kB limit.
+$node_publisher->safe_psql(
+ 'postgres', q{
+BEGIN;
+INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(6, 5000) s(i);
+UPDATE test_tab SET b = md5(b) WHERE mod(a, 2) = 0;
+DELETE FROM test_tab WHERE mod(a, 3) = 0;
+COMMIT;
+});
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(3334|1|5000), 'check if the new rows were applied to subscriber');
+
+check_apply_delay_log("logical replication apply delay");
+
+# Test ALTER SUBSCRIPTION. Delay 86460 seconds (1 day 1 minute).
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86460000)"
+);
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (0, 'foobar')");
+
+# Disable subscription. worker should die immediately.
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub DISABLE"
+);
+
+# Wait until worker dies.
+my $sub_query =
+ "SELECT count(1) = 0 FROM pg_stat_subscription WHERE subname = 'tap_sub' AND pid IS NOT NULL;";
+$node_subscriber->poll_query_until('postgres', $sub_query)
+ or die "Timed out while waiting for subscriber to die";
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
+
+sub check_apply_delay_log
+{
+ my $message = shift;
+ my $old_log_location = $log_location;
+
+ $log_location = $node_subscriber->wait_for_log(qr/$message/, $log_location);
+
+ cmp_ok($log_location, '>', $old_log_location,
+ "logfile contains triggered logical replication apply delay"
+ );
+}
--
2.30.2
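As a worked check of the values the tests above expect, the delay is shown in milliseconds: '4h 27min 35s' is (4*60 + 27)*60 + 35 = 16055 s, i.e. 16055000 ms, and the ALTER SUBSCRIPTION value 86460000 ms is 1 day 1 minute. The helper below is only an illustrative sketch; `delay_to_ms` is a hypothetical name and is not part of the patch, which parses the option as an interval on the server side.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): convert a delay given as
 * hours/minutes/seconds into milliseconds, the unit displayed in the
 * "Apply delay" column of \dRs+.
 */
int64_t
delay_to_ms(int64_t hours, int64_t minutes, int64_t seconds)
{
    /* The regression test rejects negative delays; mirror that here. */
    assert(hours >= 0 && minutes >= 0 && seconds >= 0);

    return ((hours * 60 + minutes) * 60 + seconds) * 1000;
}
```

For example, `delay_to_ms(4, 27, 35)` yields 16055000 and `delay_to_ms(24, 1, 0)` yields 86460000, matching the expected output and the TAP test comment.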
On Tue, Aug 9, 2022 at 3:52 AM Euler Taveira <euler@eulerto.com> wrote:
On Wed, Aug 3, 2022, at 10:27 AM, Amit Kapila wrote:
Your explanation makes sense to me. The other point to consider is
that there can be cases where we may not apply operation for the
transaction because of empty transactions (we don't yet skip empty
xacts for prepared transactions). So, won't it be better to apply the
delay just before we apply the first change for a transaction? Do we
want to apply the delay during table sync as we sometimes do need to
enter apply phase while doing table sync?
I thought about the empty transactions but decided to not complicate the code
mainly because skipping transactions is not a code path that will slow down
this feature. As explained in the documentation, there is no harm in delaying a
transaction for more than min_apply_delay; it cannot apply earlier. Having said
that I decided to do nothing. I'm also not sure if it deserves a comment or if
this email is a possible explanation for this decision.
I don't know what makes you think it will complicate the code. But
anyway, thinking further about the way apply_delay is used at various
places in the patch, as pointed out by Peter Smith, it seems it won't
work for the parallel apply feature, where we start applying the
transaction immediately after the stream start. I was wondering why we
don't apply the delay after each commit of the transaction rather than
at the begin command. We can remember whether the transaction has made
any change and, if so, apply the delay after commit. If we can do that,
it will alleviate the concern about empty and skipped xacts as well.
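
One possible reading of this idea, sketched in standalone C rather than in terms of the actual worker.c code: keep a per-transaction flag that is set when the first change is applied, and compute the wait only at commit time. The struct and function names below (`ApplyXactState`, `commit_delay_ms`) are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-transaction state remembered by the apply worker. */
typedef struct ApplyXactState
{
    bool        made_changes;   /* set once a real change has been applied */
} ApplyXactState;

/*
 * Return how many milliseconds to wait at commit time.  Empty or skipped
 * transactions never wait; others wait until commit time + min_apply_delay.
 */
int64_t
commit_delay_ms(const ApplyXactState *xact, int64_t commit_ts_ms,
                int64_t now_ms, int64_t min_apply_delay_ms)
{
    int64_t     delay_until_ms;

    assert(min_apply_delay_ms >= 0);

    if (!xact->made_changes)
        return 0;               /* nothing was applied, so nothing to delay */

    delay_until_ms = commit_ts_ms + min_apply_delay_ms;
    return (delay_until_ms > now_ms) ? delay_until_ms - now_ms : 0;
}
```

With this shape, an empty transaction costs no waiting at all, which is the concern about empty and skipped xacts raised above.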
Another thing: I was wondering how to determine a good delay time for
the tests and found that the current tests in replay_delay.pl use 3s,
so should we use the same for the apply delay tests in this patch as
well?
--
With Regards,
Amit Kapila.
Dear Euler,
Thank you for making the patch! I'm also interested in the patch so I want to join the thread.
While testing your patch, I noticed that 032_apply_delay.pl failed.
PSA the logs generated on my machine. This failure is the same as the one reported by cfbot [1].
It seems that the apply worker could not exit and started WaitLatch() again even though the subscription had been disabled.
The following is cited from the attached log.
```
...
2022-09-14 09:44:30.489 UTC [14880] 032_apply_delay.pl LOG: statement: ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86460000)
2022-09-14 09:44:30.525 UTC [14777] DEBUG: sending feedback (force 0) to recv 0/1690220, write 0/1690220, flush 0/1690220
2022-09-14 09:44:30.526 UTC [14759] DEBUG: server process (PID 14878) exited with exit code 0
2022-09-14 09:44:30.535 UTC [14777] DEBUG: logical replication apply delay: 86460000 ms
2022-09-14 09:44:30.535 UTC [14777] CONTEXT: processing remote data for replication origin "pg_16393" during "BEGIN" in transaction 734 finished at 0/16902A8
2022-09-14 09:44:30.576 UTC [14759] DEBUG: forked new backend, pid=14884 socket=6
2022-09-14 09:44:30.578 UTC [14759] DEBUG: server process (PID 14880) exited with exit code 0
2022-09-14 09:44:30.583 UTC [14884] 032_apply_delay.pl LOG: statement: ALTER SUBSCRIPTION tap_sub DISABLE
2022-09-14 09:44:30.589 UTC [14777] DEBUG: logical replication apply delay: 86459945 ms
2022-09-14 09:44:30.589 UTC [14777] CONTEXT: processing remote data for replication origin "pg_16393" during "BEGIN" in transaction 734 finished at 0/16902A8
2022-09-14 09:44:30.608 UTC [14759] DEBUG: forked new backend, pid=14886 socket=6
2022-09-14 09:44:30.632 UTC [14886] 032_apply_delay.pl LOG: statement: SELECT count(1) = 0 FROM pg_stat_subscription WHERE subname = 'tap_sub' AND pid IS NOT NULL;
2022-09-14 09:44:30.665 UTC [14759] DEBUG: server process (PID 14884) exited with exit code 0
...
```
I think this happens because the delayed worker does not reread the modified catalog even when ALTER SUBSCRIPTION ... DISABLE is called.
I have also attached a fix patch that can be applied on top of yours. It works in my environment.
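To illustrate the behaviour described above, here is a minimal standalone sketch in plain C. It is a simulation only, not PostgreSQL code: SimWorker, sim_wait() and sim_apply_delay() are hypothetical names invented for this example, and time is a simple millisecond counter. It shows that a delay loop which does not recheck the subscription state after a wakeup simply goes back to sleep until the full delay has elapsed, while a loop that rechecks can exit right away.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the worker's view of its subscription. */
typedef struct SimWorker
{
    int64_t now_ms;         /* simulated clock */
    bool    enabled;        /* what the catalog currently says */
    int64_t disable_at_ms;  /* when the simulated DISABLE arrives, or -1 */
} SimWorker;

typedef enum { APPLY_NOW, EXIT_DISABLED } DelayResult;

/*
 * Simulated latch wait: sleep until the timeout expires or until the
 * simulated DISABLE wakes the worker, whichever comes first.
 */
static void
sim_wait(SimWorker *w, int64_t timeout_ms)
{
    int64_t wake = w->now_ms + timeout_ms;

    if (w->disable_at_ms >= w->now_ms && w->disable_at_ms < wake)
    {
        wake = w->disable_at_ms;
        w->enabled = false;
        w->disable_at_ms = -1;  /* the wakeup is delivered only once */
    }
    w->now_ms = wake;
}

/*
 * Delay loop. With recheck = false the loop ignores the reason for the
 * wakeup and waits again; with recheck = true it looks at the subscription
 * state after every wakeup and exits if the subscription is disabled.
 */
static DelayResult
sim_apply_delay(SimWorker *w, int64_t delay_until_ms, bool recheck)
{
    for (;;)
    {
        int64_t diff = delay_until_ms - w->now_ms;

        if (diff <= 0)
            return APPLY_NOW;

        sim_wait(w, diff);

        if (recheck && !w->enabled)
            return EXIT_DISABLED;
    }
}
```

With a delay target of 1000 ms and a DISABLE arriving at 50 ms, the rechecking variant returns at 50 ms, while the non-rechecking variant keeps waiting until 1000 ms. The attached fix takes the rechecking approach inside the real apply_delay() loop.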
[1]: https://cirrus-ci.com/task/4888001967816704
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Attachments:
0002-Make-apply-workers-to-read-pg_subscription.patch (application/octet-stream)
From 0d78217de2ba10e2631fca83469c0bda40fdb0c1 Mon Sep 17 00:00:00 2001
From: "kuroda.hayato%40jp.fujitsu.com" <kuroda.hayato@jp.fujitsu.com>
Date: Wed, 14 Sep 2022 10:56:35 +0000
Subject: [PATCH] Make apply workers to read pg_subscription
---
src/backend/replication/logical/worker.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 5181d02356..8654d79992 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -868,6 +868,21 @@ apply_delay(TimestampTz ts)
ResetLatch(MyLatch);
CHECK_FOR_INTERRUPTS();
+
+ /*
+ * The worker may have been woken up by ALTER SUBSCRIPTION ... DISABLE,
+ * so the catalog pg_subscription should be read again.
+ *
+ * Note that MySubscriptionValid must be set to false here explicitly: it is
+ * normally reset by the syscache callback, but that callback runs only
+ * between transactions, not in CHECK_FOR_INTERRUPTS().
+ */
+ if (TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delay_until) > 0)
+ {
+ elog(DEBUG2, "check status of MySubscription");
+ MySubscriptionValid = false;
+ maybe_reread_subscription();
+ }
}
}
--
2.27.0
v7-0001-Time-delayed-logical-replication-subscriber.patch (application/octet-stream)
From a28987c8adb70d6932558f5e39f9dd4c55223a30 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Sat, 6 Nov 2021 11:31:10 -0300
Subject: [PATCH v7] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
If the subscription sets the min_apply_delay parameter, the logical
replication worker will delay the transaction commit for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
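As a rough illustration of that calculation, the following standalone C sketch computes the remaining wait from the publisher's commit time stamp, the subscriber's current time, and the configured delay. The function name remaining_delay_ms() is hypothetical, and plain millisecond integers are assumed in place of TimestampTz for simplicity; it is not the code in the patch.

```c
#include <stdint.h>

/*
 * Return how long (in ms) the subscriber should still wait before applying
 * a transaction, or 0 if the target time has already passed.
 *
 * commit_ts_ms:        commit time stamp written on the publisher
 * now_ms:              current time on the subscriber
 * min_apply_delay_ms:  configured delay
 */
static int64_t
remaining_delay_ms(int64_t commit_ts_ms, int64_t now_ms,
                   int64_t min_apply_delay_ms)
{
    int64_t delay_until = commit_ts_ms + min_apply_delay_ms;
    int64_t diff = delay_until - now_ms;

    return diff > 0 ? diff : 0;
}
```

Because the two time stamps come from different machines, a subscriber clock that runs ahead of the publisher shortens the wait, and time already spent in decoding and transfer counts towards the delay.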
The delay occurs only on WAL records for transaction begins. The main
reason is to avoid keeping a transaction open for a long time. Regular
and prepared transactions are covered. Streamed transactions are also
covered.
Author: Euler Taveira
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 10 ++
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 43 +++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 59 +++++++-
src/backend/replication/logical/worker.c | 100 +++++++++++++
src/backend/utils/adt/timestamp.c | 32 ++++
src/bin/pg_dump/pg_dump.c | 17 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/datatype/timestamp.h | 2 +
src/include/utils/timestamp.h | 2 +
src/test/regress/expected/subscription.out | 165 ++++++++++++---------
src/test/regress/sql/subscription.sql | 20 +++
src/test/subscription/t/032_apply_delay.pl | 129 ++++++++++++++++
18 files changed, 526 insertions(+), 83 deletions(-)
create mode 100644 src/test/subscription/t/032_apply_delay.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index cd2cc37aeb..291ebdafad 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7833,6 +7833,16 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subapplydelay</structfield> <type>int8</type>
+ </para>
+ <para>
+ Application delay of changes by a specified amount of time. The
+ unit is in milliseconds.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 64efc21f53..8901e1361c 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -208,8 +208,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 7390c715bc..a794c07042 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -317,7 +317,36 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. As with
+ the physical replication feature
+ (<xref linkend="guc-recovery-min-apply-delay"/>), it can be useful to
+ have a time-delayed logical replica. This parameter allows you to
+ delay the application of changes by a specified amount of time. If
+ this value is specified without units, it is taken as milliseconds.
+ The default is zero, adding no delay.
+ </para>
+ <para>
+ The delay occurs only on WAL records for transaction begins and after
+ the initial table synchronization. It is possible that the
+ replication delay between publisher and subscriber exceeds the value
+ of this parameter, in which case no delay is added. Note that the
+ delay is calculated between the WAL time stamp as written on
+ publisher and the current time on the subscriber. Time spent in logical
+ decoding and in transferring the transaction may reduce the actual wait
+ time. If the system clocks on publisher and subscriber are not
+ synchronized, changes may be applied earlier than expected.
+ This is not a major issue because typical settings of this parameter
+ are much larger than typical time deviations between servers.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -413,6 +442,18 @@ CREATE SUBSCRIPTION mysub
PUBLICATION insert_only
WITH (enabled = false);
</programlisting></para>
+
+ <para>
+ Create a subscription to a remote server that replicates tables in
+ the <literal>baz</literal> publication and starts replicating immediately on
+ commit. Pre-existing data is not copied. The application of changes is
+ delayed by 4 hours.
+<programlisting>
+CREATE SUBSCRIPTION foo
+ CONNECTION 'host=192.168.1.50 port=5432 user=foo dbname=foodb'
+ PUBLICATION baz
+ WITH (copy_data = false, min_apply_delay = '4h');
+</programlisting></para>
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a506fc3ec8..d93e374ef4 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -64,6 +64,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->oid = subid;
sub->dbid = subform->subdbid;
sub->skiplsn = subform->subskiplsn;
+ sub->applydelay = subform->subapplydelay;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
sub->enabled = subform->subenabled;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index f369b1fc14..50175323b9 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1297,9 +1297,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index f73dfb6067..3c0d186991 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -47,6 +47,7 @@
#include "utils/memutils.h"
#include "utils/pg_lsn.h"
#include "utils/syscache.h"
+#include "utils/timestamp.h"
/*
* Options that can be specified by the user in CREATE/ALTER SUBSCRIPTION
@@ -65,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -89,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int64 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -141,6 +144,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -319,12 +324,45 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ char *val, *tmp;
+ Interval *interval;
+
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ tmp = defGetString(defel);
+
+ /*
+ * If there is no unit, interval_in takes seconds as the unit. This
+ * parameter expects milliseconds as the unit, so add a unit (ms) if
+ * there isn't one.
+ */
+ if (strspn(tmp, "0123456789") == strlen(tmp))
+ val = psprintf("%sms", tmp);
+ else
+ val = tmp;
+
+ interval = DatumGetIntervalP(DirectFunctionCall3(interval_in,
+ CStringGetDatum(val),
+ ObjectIdGetDatum(InvalidOid),
+ Int32GetDatum(-1)));
+ opts->min_apply_delay = interval_to_ms(interval);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("unrecognized subscription parameter: \"%s\"", defel->defname)));
}
+ if (opts->min_apply_delay < 0 && IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("option \"%s\" must not be negative", "min_apply_delay"));
+
/*
* We've been explicitly asked to not connect, that requires some
* additional processing.
@@ -557,7 +595,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -622,6 +661,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subapplydelay - 1] = Int64GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1044,7 +1084,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1101,6 +1141,13 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ values[Anum_pg_subscription_subapplydelay - 1] =
+ Int64GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -1130,6 +1177,14 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (opts.enabled)
ApplyLauncherWakeupAtCommit();
+ /*
+ * If this subscription has been disabled and it has an apply
+ * delay set, wake up the logical replication worker to finish
+ * it as soon as possible.
+ */
+ if (!opts.enabled && sub->applydelay > 0)
+ logicalrep_worker_wakeup(sub->oid, InvalidOid);
+
update_tuple = true;
break;
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 5f8c541763..18aece3495 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -324,6 +324,8 @@ static void maybe_reread_subscription(void);
static void DisableSubscriptionAndExit(void);
+static void apply_delay(TimestampTz ts);
+
/* prototype needed because of stream_commit */
static void apply_dispatch(StringInfo s);
@@ -803,6 +805,72 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When min_apply_delay parameter is set on subscriber, we wait long enough to
+ * make sure a transaction is applied at least that interval behind the
+ * publisher.
+ *
+ * It applies the delay for the next transaction but before starting the
+ * transaction. The main reason for this design is to avoid a long-running
+ * transaction (which can cause some operational challenges) if the user sets a
+ * high value for the delay. This design is different from the physical
+ * replication (that applies the delay at commit time) mainly because write
+ * operations may allow some issues (such as bloat and locks) that can be
+ * minimized if it does not keep the transaction open for such a long time.
+ *
+ * The min_apply_delay parameter will take effect only after all tables are in
+ * READY state.
+ */
+static void
+apply_delay(TimestampTz ts)
+{
+ TimestampTz delay_until = 0;
+
+ /* nothing to do if no delay set */
+ if (MySubscription->applydelay <= 0)
+ return;
+
+ /*
+ * Apply delay only after all tablesync workers have reached READY state. A
+ * tablesync worker is kept until it reaches READY state. If we allow the
+ * delay during the catchup phase, once we reach the limit of tablesync
+ * workers, it will impose a delay for each subsequent worker. It means it
+ * will take a long time to finish the initial table synchronization.
+ * Instead, the apply delay will be activated only after all tables are in
+ * READY state.
+ */
+ if (!AllTablesyncsReady())
+ return;
+
+ /* set apply delay */
+ delay_until = TimestampTzPlusMilliseconds(TimestampTzGetDatum(ts),
+ MySubscription->applydelay);
+
+ while (true)
+ {
+ long diffms;
+
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delay_until);
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "logical replication apply delay: %ld ms", diffms);
+
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+ }
+}
+
/*
* Handle BEGIN message.
*/
@@ -814,6 +882,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ apply_delay(begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -868,6 +939,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ apply_delay(begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1090,6 +1164,19 @@ apply_handle_stream_prepare(StringInfo s)
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
+ /*
+ * Should we delay the current prepared transaction?
+ *
+ * Although the delay is applied in BEGIN PREPARE messages, streamed
+ * prepared transactions apply the delay in a STREAM PREPARE message.
+ * That's ok because no changes have been applied yet
+ * (apply_spooled_messages() will do it).
+ * The STREAM START message does not contain a prepare time (it will be
+ * available when the in-progress prepared transaction finishes), hence, it
+ * was not possible to apply a delay at that time.
+ */
+ apply_delay(prepare_data.prepare_time);
+
/* Replay all the spooled operations. */
apply_spooled_messages(prepare_data.xid, prepare_data.prepare_lsn);
@@ -1481,6 +1568,19 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "received commit for streamed transaction %u", xid);
+ /*
+ * Should we delay the current transaction?
+ *
+ * Although the delay is applied in BEGIN messages, streamed transactions
+ * apply the delay in a STREAM COMMIT message. That's ok because no changes
+ * have been applied yet (apply_spooled_messages() will do it).
+ * The STREAM START message would be a natural choice for this delay but
+ * there is no commit time yet (it will be available when the in-progress
+ * transaction finishes), hence, it was not possible to apply a delay at
+ * that time.
+ */
+ apply_delay(commit_data.committime);
+
apply_spooled_messages(xid, commit_data.commit_lsn);
apply_handle_commit_internal(&commit_data);
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index 49cdb290ac..89f57f7c33 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -2411,6 +2411,38 @@ interval_cmp_internal(const Interval *interval1, const Interval *interval2)
return int128_compare(span1, span2);
}
+/*
+ * Given an Interval, return the number of milliseconds.
+ */
+int64
+interval_to_ms(const Interval *interval)
+{
+ int64 days;
+ int64 ms;
+ int64 result;
+
+ days = interval->month * INT64CONST(30);
+ days += interval->day;
+
+ /*
+ * The following operations use these special functions to detect overflow.
+ * First, compute the number of milliseconds in the given days.
+ */
+ if (pg_mul_s64_overflow(days, MSECS_PER_DAY, &result))
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+
+ /* Add the time portion (in ms) to the previous result. */
+ ms = interval->time / INT64CONST(1000);
+ if (pg_add_s64_overflow(result, ms, &result))
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+
+ return result;
+}
+
Datum
interval_eq(PG_FUNCTION_ARGS)
{
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index da6605175a..5ef61f3de1 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4441,6 +4441,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subapplydelay;
int i,
ntups;
@@ -4493,9 +4494,15 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ {
+ appendPQExpBufferStr(query, " s.suborigin,\n");
+ appendPQExpBufferStr(query, " s.subapplydelay\n");
+ }
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ {
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBufferStr(query, " 0 AS subapplydelay\n");
+ }
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4523,6 +4530,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subapplydelay = PQfnumber(res, "subapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4553,6 +4561,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subapplydelay =
+ strtoi64(PQgetvalue(res, i, i_subapplydelay), NULL, 10);
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4632,6 +4642,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '" INT64_FORMAT " ms'", subinfo->subapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 69ee939d44..91b73e10d2 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int64 subapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 327a69487b..0be1d44e81 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6469,7 +6469,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6511,10 +6511,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Apply delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index f265e043e9..d577abfc29 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1873,7 +1873,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3153,7 +3153,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 7b98714f30..3894b97aca 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -70,6 +70,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
XLogRecPtr subskiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 subapplydelay; /* Replication apply delay */
+
NameData subname; /* Name of the subscription */
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
@@ -119,6 +121,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 applydelay; /* Replication apply delay */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
bool enabled; /* Indicates if the subscription is enabled */
diff --git a/src/include/datatype/timestamp.h b/src/include/datatype/timestamp.h
index d155f1b03b..d5bbfad1c4 100644
--- a/src/include/datatype/timestamp.h
+++ b/src/include/datatype/timestamp.h
@@ -127,6 +127,8 @@ struct pg_itm_in
#define SECS_PER_MINUTE 60
#define MINS_PER_HOUR 60
+#define MSECS_PER_DAY INT64CONST(86400000)
+
#define USECS_PER_DAY INT64CONST(86400000000)
#define USECS_PER_HOUR INT64CONST(3600000000)
#define USECS_PER_MINUTE INT64CONST(60000000)
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index edf3a97318..91709035da 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -78,6 +78,8 @@ extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern int64 interval_to_ms(const Interval *interval);
+
extern TimestampTz time_t_to_timestamptz(pg_time_t tm);
extern pg_time_t timestamptz_to_time_t(TimestampTz t);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index ef0ebf96b9..056b6e36f8 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -77,18 +77,18 @@ ERROR: unrecognized origin value: "foo"
CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (slot_name = NONE, connect = false, origin = none);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -98,10 +98,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -118,10 +118,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -130,10 +130,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -165,10 +165,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -201,19 +201,19 @@ ERROR: binary requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, binary = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -224,19 +224,19 @@ ERROR: streaming requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -251,10 +251,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more then once
@@ -269,10 +269,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -306,10 +306,10 @@ ERROR: two_phase requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -318,10 +318,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -330,10 +330,10 @@ DROP SUBSCRIPTION regress_testsub;
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = true, two_phase = true);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -345,18 +345,47 @@ ERROR: disable_on_error requires a Boolean value
CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, disable_on_error = false);
WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid input syntax for type interval: "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: option "min_apply_delay" must not be negative
+-- success -- 123 ms
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- success -- interval is converted into ms and stored as integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = '4h 27min 35s');
+WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 16055000 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 4425fafc46..885f8bb6fa 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -264,6 +264,26 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- success -- 123 ms
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
+-- success -- interval is converted into ms and stored as integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = '4h 27min 35s');
+
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/t/032_apply_delay.pl b/src/test/subscription/t/032_apply_delay.pl
new file mode 100644
index 0000000000..3d9e0b05f9
--- /dev/null
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,129 @@
+
+# Copyright (c) 2022, PostgreSQL Global Development Group
+
+# Test replication apply delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf('postgresql.conf',
+ "log_min_messages = debug2");
+$node_subscriber->start;
+
+# Create some preexisting content on publisher.
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b varchar)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (1, 'foo'), (2, 'bar')");
+
+# Setup structure on subscriber.
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b text, c timestamptz DEFAULT now(), d bigint DEFAULT 999)"
+);
+
+# Setup logical replication.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE test_tab");
+
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = on, min_apply_delay = '2s')"
+);
+
+# Wait for initial table sync to finish.
+$node_subscriber->wait_for_subscription_sync($node_publisher, $appname);
+
+# Check log starting now for logical replication apply delay.
+my $log_location = -s $node_subscriber->logfile;
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(2|1|2), 'check initial data was copied to subscriber');
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (3, 'baz')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (4, 'abc')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (5, 'def')");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(5|1|5), 'check if the new rows were applied to subscriber');
+
+check_apply_delay_log("logical replication apply delay");
+
+# Test streamed transaction.
+# Insert, update and delete enough rows to exceed 64kB limit.
+$node_publisher->safe_psql(
+ 'postgres', q{
+BEGIN;
+INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(6, 5000) s(i);
+UPDATE test_tab SET b = md5(b) WHERE mod(a, 2) = 0;
+DELETE FROM test_tab WHERE mod(a, 3) = 0;
+COMMIT;
+});
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(3334|1|5000), 'check if the new rows were applied to subscriber');
+
+check_apply_delay_log("logical replication apply delay");
+
+# Test ALTER SUBSCRIPTION. Delay 86460 seconds (1 day 1 minute).
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86460000)"
+);
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (0, 'foobar')");
+
+# Disable subscription. worker should die immediately.
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub DISABLE"
+);
+
+# Wait until worker dies.
+my $sub_query =
+ "SELECT count(1) = 0 FROM pg_stat_subscription WHERE subname = 'tap_sub' AND pid IS NOT NULL;";
+$node_subscriber->poll_query_until('postgres', $sub_query)
+ or die "Timed out while waiting for subscriber to die";
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
+
+sub check_apply_delay_log
+{
+ my $message = shift;
+ my $old_log_location = $log_location;
+
+ $log_location = $node_subscriber->wait_for_log(qr/$message/, $log_location);
+
+ cmp_ok($log_location, '>', $old_log_location,
+ "logfile contains triggered logical replication apply delay"
+ );
+}
--
2.30.2
Hi,
Sorry for the noise, but I found another bug.
When 032_apply_delay.pl is modified as follows,
the test always fails, even with my patch applied.
```
# Disable subscription. worker should die immediately.
-$node_subscriber->safe_psql('postgres',
- "ALTER SUBSCRIPTION tap_sub DISABLE"
+$node_subscriber->safe_psql('postgres', q{
+BEGIN;
+ALTER SUBSCRIPTION tap_sub DISABLE;
+SELECT pg_sleep(1);
+COMMIT;
+}
);
```
The point of failure is the same as the one I reported previously.
```
...
2022-09-14 12:00:48.891 UTC [11330] 032_apply_delay.pl LOG: statement: ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86460000)
2022-09-14 12:00:48.910 UTC [11226] DEBUG: sending feedback (force 0) to recv 0/1690220, write 0/1690220, flush 0/1690220
2022-09-14 12:00:48.937 UTC [11208] DEBUG: server process (PID 11328) exited with exit code 0
2022-09-14 12:00:48.950 UTC [11226] DEBUG: logical replication apply delay: 86459996 ms
2022-09-14 12:00:48.950 UTC [11226] CONTEXT: processing remote data for replication origin "pg_16393" during "BEGIN" in transaction 734 finished at 0/16902A8
2022-09-14 12:00:48.979 UTC [11208] DEBUG: forked new backend, pid=11334 socket=6
2022-09-14 12:00:49.007 UTC [11334] 032_apply_delay.pl LOG: statement: BEGIN;
2022-09-14 12:00:49.008 UTC [11334] 032_apply_delay.pl LOG: statement: ALTER SUBSCRIPTION tap_sub DISABLE;
2022-09-14 12:00:49.009 UTC [11334] 032_apply_delay.pl LOG: statement: SELECT pg_sleep(1);
2022-09-14 12:00:49.009 UTC [11226] DEBUG: check status of MySubscription
2022-09-14 12:00:49.009 UTC [11226] CONTEXT: processing remote data for replication origin "pg_16393" during "BEGIN" in transaction 734 finished at 0/16902A8
2022-09-14 12:00:49.009 UTC [11226] DEBUG: logical replication apply delay: 86459937 ms
2022-09-14 12:00:49.009 UTC [11226] CONTEXT: processing remote data for replication origin "pg_16393" during "BEGIN" in transaction 734 finished at 0/16902A8
...
```
I think the cause may be that the woken worker reads catalogs that have not been modified yet.
In AlterSubscription(), the backend kicks the apply worker ASAP, but this should happen at the
end of the transaction, like ApplyLauncherWakeupAtCommit() and AtEOXact_ApplyLauncher().
```
+ /*
+ * If this subscription has been disabled and it has an apply
+ * delay set, wake up the logical replication worker to finish
+ * it as soon as possible.
+ */
+ if (!opts.enabled && sub->applydelay > 0)
+ logicalrep_worker_wakeup(sub->oid, InvalidOid);
+
```
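A minimal sketch of the "wake up at commit" idea, as a toy model only. Every name below (sketch_begin_transaction and so on) is a hypothetical stand-in, not a function from the patch or from PostgreSQL. The ALTER path only records the request, and the signal is sent at end of transaction, and only on commit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for backend state; not PostgreSQL's real variables. */
static bool in_transaction = false;
static bool wakeup_pending = false;	/* request recorded during the transaction */
static int	wakeups_sent = 0;		/* counts wakeups actually delivered */

void
sketch_begin_transaction(void)
{
	in_transaction = true;
	wakeup_pending = false;
}

/* Called from the ALTER SUBSCRIPTION path: remember the request, send nothing. */
void
sketch_request_worker_wakeup_at_commit(void)
{
	assert(in_transaction);
	wakeup_pending = true;
}

/* Called at end of transaction: deliver the wakeup only if we committed. */
void
sketch_end_transaction(bool is_commit)
{
	if (is_commit && wakeup_pending)
		wakeups_sent++;			/* real code would signal the worker here */
	wakeup_pending = false;		/* reset on both commit and abort */
	in_transaction = false;
}
```

With this ordering the worker cannot be woken before the catalog change is visible, and an aborted ALTER sends no wakeup at all.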
What do you think?
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Hi Euler, a long time ago you asked me a few questions about my previous
review [1].
Here are my replies, plus a few other review comments for patch v7-0001.
======
1. doc/src/sgml/catalogs.sgml
+ <para>
+ Application delay of changes by a specified amount of time. The
+ unit is in milliseconds.
+ </para></entry>
The wording still sounds a bit strange. How about the below?
SUGGESTION
The length of time (ms) to delay the application of changes.
=======
2. Other documentation?
Maybe something should be said about this on the Logical Replication
Subscription page? (31.2 Subscription)
=======
3. doc/src/sgml/ref/create_subscription.sgml
+ synchronized, this may lead to apply changes earlier than expected.
+ This is not a major issue because a typical setting of this parameter
+ are much larger than typical time deviations between servers.
Wording?
SUGGESTION
... than expected, but this is not a major issue because this
parameter is typically much larger than the time deviations between
servers.
~~~
4. Q/A
From [2] you asked:
Should there also be a big warning box about the impact if using
synchronous_commit (like the other streaming replication page has this
warning)?
Impact? Could you elaborate?
~
I noticed that the streaming replication docs for recovery_min_apply_delay
have a big red warning box saying that setting this GUC may block
synchronous commits. So I was asking: won’t a similar big red warning
also be needed for this min_apply_delay parameter, if the delay is used
in conjunction with a publisher wanting synchronous commit, because it
might block everything?
~~~
4. Example
+<programlisting>
+CREATE SUBSCRIPTION foo
+ CONNECTION 'host=192.168.1.50 port=5432 user=foo dbname=foodb'
+ PUBLICATION baz
+ WITH (copy_data = false, min_apply_delay = '4h');
+</programlisting></para>
If the example named the subscription/publication as ‘mysub’ and
‘mypub’ I think it would be more consistent with the existing
examples.
======
5. src/backend/commands/subscriptioncmds.c - SubOpts
@@ -89,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int64 min_apply_delay;
} SubOpts;
I feel it would be better to be explicit about the storage units. So
call this member ‘min_apply_delay_ms’. E.g. then other code in
parse_subscription_options will read more naturally when you are
converting values and assigning them to this member.
~~~
6. - parse_subscription_options
+ /*
+ * If there is no unit, interval_in takes second as unit. This
+ * parameter expects millisecond as unit so add a unit (ms) if
+ * there isn't one.
+ */
The comment feels awkward. How about the below?
SUGGESTION
If no unit was specified, then explicitly add 'ms'; otherwise the
interval_in function would assume 'seconds'.
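To illustrate the behaviour that comment describes, here is a rough sketch under my own assumptions. The helper names are hypothetical and the patch may detect a missing unit quite differently. It only recognises a bare integer, so a value such as "1.5" is passed through untouched.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* True if text is just an integer (optionally followed by spaces). */
bool
sketch_lacks_unit(const char *text)
{
	char	   *end;

	(void) strtol(text, &end, 10);
	if (end == text)
		return false;			/* no leading number at all, e.g. "foo" */
	while (isspace((unsigned char) *end))
		end++;
	return *end == '\0';		/* nothing but the number was given */
}

/*
 * Copies text into out, appending " ms" when no unit was given.
 * Returns the length snprintf reports for the resulting string.
 */
int
sketch_with_default_ms(const char *text, char *out, size_t outlen)
{
	assert(out != NULL && outlen > 0);
	if (sketch_lacks_unit(text))
		return snprintf(out, outlen, "%s ms", text);
	return snprintf(out, outlen, "%s", text);
}
```

Under this sketch "123" becomes "123 ms", while "4h 27min 35s" and "foo" are handed to the interval parser unchanged, so the parser still gets to reject "foo".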
~~~
7. - parse_subscription_options
(This is a repeat of [1] review comment #12)
+ if (opts->min_apply_delay < 0 && IsSet(supported_opts,
SUBOPT_MIN_APPLY_DELAY))
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("option \"%s\" must not be negative", "min_apply_delay"));
Why is this code here instead of inside the previous code block where
the min_apply_delay was assigned in the first place?
======
8. src/backend/replication/logical/worker.c - apply_delay
+ * When min_apply_delay parameter is set on subscriber, we wait long enough to
+ * make sure a transaction is applied at least that interval behind the
+ * publisher.
"on subscriber" -> "on the subscription"
~~~
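For readers following along, the waiting rule in that comment can be sketched roughly as below. This is an assumption-laden illustration, not the patch's apply_delay code: the function name and the millisecond timestamps on a shared epoch are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns how long (in ms) the apply worker would still have to wait so
 * that the transaction is applied at least min_apply_delay_ms after it
 * committed on the publisher. Returns 0 once the delay has elapsed.
 */
int64_t
sketch_remaining_apply_delay_ms(int64_t commit_time_ms, int64_t now_ms,
								int64_t min_apply_delay_ms)
{
	int64_t		apply_at_ms;
	int64_t		remaining;

	assert(min_apply_delay_ms >= 0);

	apply_at_ms = commit_time_ms + min_apply_delay_ms;
	remaining = apply_at_ms - now_ms;

	return remaining > 0 ? remaining : 0;
}
```

Because the two timestamps come from different servers, clock skew between publisher and subscriber shifts the result directly, which is the concern the create_subscription.sgml text in comment 3 addresses.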
9.
+ * Apply delay only after all tablesync workers have reached READY state. A
+ * tablesync worker are kept until it reaches READY state. If we allow the
Wording ??
"A tablesync worker are kept until it reaches READY state." ??
~~~
10.
10a.
+ /* nothing to do if no delay set */
Uppercase comment
/* Nothing to do if no delay set */
~
10b.
+ /* set apply delay */
Uppercase comment
/* Set apply delay */
~~~
11. - apply_handle_stream_prepare / apply_handle_stream_commit
The previous concern about incompatibility with the "Parallel Apply"
work (see [1] review comments #17, #18) is still a pending issue,
isn't it?
======
12. src/backend/utils/adt/timestamp.c interval_to_ms
+/*
+ * Given an Interval returns the number of milliseconds.
+ */
+int64
+interval_to_ms(const Interval *interval)
SUGGESTION
Returns the number of milliseconds in the specified Interval.
~~~
13.
+ /* adds portion time (in ms) to the previous result. */
Uppercase comment
/* Adds portion time (in ms) to the previous result. */
======
14. src/bin/pg_dump/pg_dump.c - getSubscriptions
+ {
+ appendPQExpBufferStr(query, " s.suborigin,\n");
+ appendPQExpBufferStr(query, " s.subapplydelay\n");
+ }
This could be done using just a single appendPQExpBufferStr if you
want to have 1 call instead of 2.
======
15. src/bin/psql/describe.c - describeSubscriptions
+ /* origin and min_apply_delay are only supported in v16 and higher */
Uppercase comment
/* Origin and min_apply_delay are only supported in v16 and higher */
======
16. src/include/catalog/pg_subscription.h
+ int64 subapplydelay; /* Replication apply delay */
+
Consider renaming this as 'subapplydelayms' to make the units perfectly clear.
======
17. src/test/regress/sql/subscription.sql
Is [1] review comment 21 (There are some test cases for CREATE
SUBSCRIPTION but there are no test cases for ALTER SUBSCRIPTION
changing this new parameter.) still a pending item?
------
[1]: My v4 review - /messages/by-id/CAHut+Pvugkna7avUQLydg602hymc8qMp=CRT2ZCTGbi8Bkfv+A@mail.gmail.com
[2]: Euler's reply to my v4 review - /messages/by-id/acfc1946-a73e-4e9d-86b3-b19cba225a41@www.fastmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
Dear Euler,
Do you have enough time to handle the issue? Our discussion has been suspended for two months...
If you cannot allocate time to discuss this problem because of other important tasks or events,
we would like to take over the thread and modify your patch.
We plan to start addressing the comments and reported bugs if you do not respond by the end of this week.
I look forward to hearing from you.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
At Wed, 10 Aug 2022 17:33:00 -0300, "Euler Taveira" <euler@eulerto.com> wrote in
On Wed, Aug 10, 2022, at 9:39 AM, osumi.takamichi@fujitsu.com wrote:
Minor review comments for v6.
Thanks for your review. I'm attaching v7.
Using an interval is not standard for this kind of parameter, but it seems
convenient. On the other hand, it's not great that the month unit
introduces some subtle ambiguity. This patch translates a month to 30
days, but I'm not sure that's the right thing to do. Perhaps we shouldn't
allow units larger than days.
apply_delay() chokes the message-receiving path so that a not-so-long
delay can cause a replication timeout to fire. I think we should
process walsender pings even while delaying. Needing to make
replication timeout longer than apply delay is not great, I think.
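One way to picture this suggestion: rather than sleeping for the whole remaining delay in a single wait, the worker could sleep in bounded slices and handle keepalives between them. The sketch below only computes the length of the next slice. The function name and the cap at a keepalive interval are assumptions for illustration, not something the patch implements.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: given the publisher's commit timestamp, the
 * configured delay, and the current time (all in milliseconds), return how
 * long to sleep before checking again. The sleep is capped at
 * keepalive_interval_ms so the caller can answer walsender pings between
 * slices. Returns 0 when the transaction is already due.
 */
static int64_t
next_delay_slice_ms(int64_t commit_ts_ms, int64_t min_apply_delay_ms,
					int64_t now_ms, int64_t keepalive_interval_ms)
{
	int64_t		remaining;

	assert(min_apply_delay_ms >= 0);
	assert(keepalive_interval_ms > 0);

	remaining = commit_ts_ms + min_apply_delay_ms - now_ms;
	if (remaining <= 0)
		return 0;

	return remaining < keepalive_interval_ms ? remaining : keepalive_interval_ms;
}
```

A caller would loop: sleep for the returned value, process any pending keepalive, and repeat until the function returns 0.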
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Thu, 11 Aug 2022 at 02:03, Euler Taveira <euler@eulerto.com> wrote:
On Wed, Aug 10, 2022, at 9:39 AM, osumi.takamichi@fujitsu.com wrote:
Minor review comments for v6.
Thanks for your review. I'm attaching v7.
"If the subscriber sets min_apply_delay parameter, ..."
I suggest we use subscription rather than subscriber, because
this parameter refers to and is used for one subscription.
My suggestion is
"If one subscription sets min_apply_delay parameter, ..."
In case if you agree, there are other places to apply this change.
I changed the terminology to subscription. I also checked other "subscriber"
occurrences but I don't think it should be changed. Some of them are used as
publisher/subscriber pair. If you think there is another sentence to consider,
point it out.
It might be better to write a note for committer
like "Bump catalog version" at the bottom of the commit message.
It is a committer task to bump the catalog number. IMO it is easy to notice
(using a git hook?) that it must bump it when we are modifying the catalog.
AFAICS there is no recommendation to add such a warning.
The former interprets input number as milliseconds in case of no units,
while the latter takes it as seconds without units.
I feel it would be better to make them aligned.
In a previous version I decided not to add a code to attach a unit when there
isn't one. Instead, I changed the documentation to reflect what interval_in
uses (seconds as unit). Under reflection, let's use ms as default unit if the
user doesn't specify one.
I fixed all the other suggestions too.
Few comments:
1) I feel if the user has specified a long delay there is a chance
that replication may not continue if the replication slot falls behind
the current LSN by more than max_slot_wal_keep_size. I feel we should
add this reference in the documentation of min_apply_delay as the
replication will not continue in this case.
2) I also noticed that if we have to shut down the publisher server
with a long min_apply_delay configuration, the publisher server cannot
be stopped as the walsender waits for the data to be replicated. Is
this behavior ok for the server to wait in this case? If this behavior
is ok, we could add a log message as it is not very evident from the
log files why the server could not be shut down.
Regards,
Vignesh
On Tuesday, November 8, 2022 2:27 PM Kuroda, Hayato/黒田 隼人 <kuroda.hayato@fujitsu.com> wrote:
If you cannot allocate time to discuss this problem because of other
important tasks or events, we would like to take over the thread and modify
your patch.
We plan to start addressing the comments and reported bugs if
you do not respond by the end of this week.
Hi,
I've simply rebased the patch to make it applicable on top of HEAD
and make the tests pass. Note there are still open pending comments
and I'm going to start to address those.
I've written Euler as the original author in the commit message
to note his credit.
Best Regards,
Takamichi Osumi
Attachments:
v8-0001-Time-delayed-logical-replication-subscriber.patch (application/octet-stream)
From 0f6683b46af899c071f72e423fa7cf9658073b6c Mon Sep 17 00:00:00 2001
From: Takamichi Osumi <osumi.takamichi@fujitsu.com>
Date: Fri, 11 Nov 2022 13:53:43 +0000
Subject: [PATCH v8] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction commit for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs only on WAL records for transaction begins. The main
reason is to avoid keeping a transaction open for a long time. Regular
and prepared transactions are covered. Streamed transactions are also
covered.
Written originally by Euler Taveira
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 10 ++
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 43 +++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 59 +++++++-
src/backend/replication/logical/worker.c | 100 ++++++++++++
src/backend/utils/adt/timestamp.c | 32 ++++
src/bin/pg_dump/pg_dump.c | 17 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/datatype/timestamp.h | 2 +
src/include/utils/timestamp.h | 2 +
src/test/regress/expected/subscription.out | 167 ++++++++++++---------
src/test/regress/sql/subscription.sql | 20 +++
src/test/subscription/t/032_apply_delay.pl | 129 ++++++++++++++++
18 files changed, 528 insertions(+), 83 deletions(-)
create mode 100644 src/test/subscription/t/032_apply_delay.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 00f833d210..f368758aec 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7852,6 +7852,16 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subapplydelay</structfield> <type>int8</type>
+ </para>
+ <para>
+ Application delay of changes by a specified amount of time. The
+ unit is in milliseconds.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 1e8d72062b..d63aff1b90 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index f9a1776380..1dc6f33f60 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -333,7 +333,36 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. As with
+ the physical replication feature
+ (<xref linkend="guc-recovery-min-apply-delay"/>), it can be useful to
+ have a time-delayed logical replica. This parameter allows you to
+ delay the application of changes by a specified amount of time. If
+ this value is specified without units, it is taken as milliseconds.
+ The default is zero, adding no delay.
+ </para>
+ <para>
+ The delay occurs only on WAL records for transaction begins and after
+ the initial table synchronization. It is possible that the
+ replication delay between publisher and subscriber exceeds the value
+ of this parameter, in which case no delay is added. Note that the
+ delay is calculated between the WAL time stamp as written on
+ publisher and the current time on the subscriber. Time spent in logical
+ decoding and in transferring the transaction may reduce the actual wait
+ time. If the system clocks on publisher and subscriber are not
+ synchronized, this may lead to applying changes earlier than expected.
+ This is not a major issue because typical settings of this parameter
+ are much larger than typical time deviations between servers.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -456,6 +485,18 @@ CREATE SUBSCRIPTION mysub
PUBLICATION insert_only
WITH (enabled = false);
</programlisting></para>
+
+ <para>
+ Create a subscription to a remote server that replicates tables in
+ the <literal>baz</literal> publication and starts replicating immediately on
+ commit. Pre-existing data is not copied. The application of changes is
+ delayed by 4 hours.
+<programlisting>
+CREATE SUBSCRIPTION foo
+ CONNECTION 'host=192.168.1.50 port=5432 user=foo dbname=foodb'
+ PUBLICATION baz
+ WITH (copy_data = false, min_apply_delay = '4h');
+</programlisting></para>
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a506fc3ec8..d93e374ef4 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -64,6 +64,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->oid = subid;
sub->dbid = subform->subdbid;
sub->skiplsn = subform->subskiplsn;
+ sub->applydelay = subform->subapplydelay;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
sub->enabled = subform->subenabled;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 2d8104b090..a84019bf2a 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1299,9 +1299,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index f0cec2ad5e..2e297b76b1 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -47,6 +47,7 @@
#include "utils/memutils.h"
#include "utils/pg_lsn.h"
#include "utils/syscache.h"
+#include "utils/timestamp.h"
/*
* Options that can be specified by the user in CREATE/ALTER SUBSCRIPTION
@@ -65,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -89,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int64 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -145,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -323,12 +328,45 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ char *val, *tmp;
+ Interval *interval;
+
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ tmp = defGetString(defel);
+
+ /*
+ * If there is no unit, interval_in takes second as unit. This
+ * parameter expects millisecond as unit so add a unit (ms) if
+ * there isn't one.
+ */
+ if (strspn(tmp, "0123456789") == strlen(tmp))
+ val = psprintf("%sms", tmp);
+ else
+ val = tmp;
+
+ interval = DatumGetIntervalP(DirectFunctionCall3(interval_in,
+ CStringGetDatum(val),
+ ObjectIdGetDatum(InvalidOid),
+ Int32GetDatum(-1)));
+ opts->min_apply_delay = interval_to_ms(interval);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("unrecognized subscription parameter: \"%s\"", defel->defname)));
}
+ if (opts->min_apply_delay < 0 && IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("option \"%s\" must not be negative", "min_apply_delay"));
+
/*
* We've been explicitly asked to not connect, that requires some
* additional processing.
@@ -559,7 +597,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -624,6 +663,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subapplydelay - 1] = Int64GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1053,7 +1093,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1110,6 +1150,13 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ values[Anum_pg_subscription_subapplydelay - 1] =
+ Int64GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -1139,6 +1186,14 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (opts.enabled)
ApplyLauncherWakeupAtCommit();
+ /*
+ * If this subscription has been disabled and it has an apply
+ * delay set, wake up the logical replication worker to finish
+ * it as soon as possible.
+ */
+ if (!opts.enabled && sub->applydelay > 0)
+ logicalrep_worker_wakeup(sub->oid, InvalidOid);
+
update_tuple = true;
break;
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index e48a3f589a..0410c169c2 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -325,6 +325,8 @@ static void maybe_reread_subscription(void);
static void DisableSubscriptionAndExit(void);
+static void apply_delay(TimestampTz ts);
+
/* prototype needed because of stream_commit */
static void apply_dispatch(StringInfo s);
@@ -828,6 +830,72 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When min_apply_delay parameter is set on subscriber, we wait long enough to
+ * make sure a transaction is applied at least that interval behind the
+ * publisher.
+ *
+ * It applies the delay for the next transaction but before starting the
+ * transaction. The main reason for this design is to avoid a long-running
+ * transaction (which can cause some operational challenges) if the user sets a
+ * high value for the delay. This design is different from the physical
+ * replication (that applies the delay at commit time) mainly because write
+ * operations may allow some issues (such as bloat and locks) that can be
+ * minimized if it does not keep the transaction open for such a long time.
+ *
+ * The min_apply_delay parameter will take effect only after all tables are in
+ * READY state.
+ */
+static void
+apply_delay(TimestampTz ts)
+{
+ TimestampTz delay_until = 0;
+
+ /* nothing to do if no delay set */
+ if (MySubscription->applydelay <= 0)
+ return;
+
+ /*
+ * Apply delay only after all tablesync workers have reached READY state. A
+ * tablesync worker is kept until it reaches READY state. If we allow the
+ * delay during the catchup phase, once we reach the limit of tablesync
+ * workers, it will impose a delay for each subsequent worker. It means it
+ * will take a long time to finish the initial table synchronization.
+ * Instead, the apply delay will be activated only after all tables are in
+ * READY state.
+ */
+ if (!AllTablesyncsReady())
+ return;
+
+ /* set apply delay */
+ delay_until = TimestampTzPlusMilliseconds(TimestampTzGetDatum(ts),
+ MySubscription->applydelay);
+
+ while (true)
+ {
+ long diffms;
+
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delay_until);
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "logical replication apply delay: %ld ms", diffms);
+
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+ }
+}
+
/*
* Handle BEGIN message.
*/
@@ -839,6 +907,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ apply_delay(begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -893,6 +964,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ apply_delay(begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1115,6 +1189,19 @@ apply_handle_stream_prepare(StringInfo s)
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
+ /*
+ * Should we delay the current prepared transaction?
+ *
+ * Although the delay is applied in BEGIN PREPARE messages, streamed
+ * prepared transactions apply the delay in a STREAM PREPARE message.
+ * That's ok because no changes have been applied yet
+ * (apply_spooled_messages() will do it).
+ * The STREAM START message does not contain a prepare time (it will be
+ * available when the in-progress prepared transaction finishes), hence, it
+ * was not possible to apply a delay at that time.
+ */
+ apply_delay(prepare_data.prepare_time);
+
/* Replay all the spooled operations. */
apply_spooled_messages(prepare_data.xid, prepare_data.prepare_lsn);
@@ -1506,6 +1593,19 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "received commit for streamed transaction %u", xid);
+ /*
+ * Should we delay the current transaction?
+ *
+ * Although the delay is applied in BEGIN messages, streamed transactions
+ * apply the delay in a STREAM COMMIT message. That's ok because no changes
+ * have been applied yet (apply_spooled_messages() will do it).
+ * The STREAM START message would be a natural choice for this delay but
+ * there is no commit time yet (it will be available when the in-progress
+ * transaction finishes), hence, it was not possible to apply a delay at
+ * that time.
+ */
+ apply_delay(commit_data.committime);
+
apply_spooled_messages(xid, commit_data.commit_lsn);
apply_handle_commit_internal(&commit_data);
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index d8552a1f18..2b3d961955 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -2411,6 +2411,38 @@ interval_cmp_internal(const Interval *interval1, const Interval *interval2)
return int128_compare(span1, span2);
}
+/*
+ * Given an Interval returns the number of milliseconds.
+ */
+int64
+interval_to_ms(const Interval *interval)
+{
+ int64 days;
+ int64 ms;
+ int64 result;
+
+ days = interval->month * INT64CONST(30);
+ days += interval->day;
+
+ /*
+ * The following operations use these special functions to detect overflow.
+ * Number of ms per informed days.
+ */
+ if (pg_mul_s64_overflow(days, MSECS_PER_DAY, &result))
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+
+ /* adds portion time (in ms) to the previous result. */
+ ms = interval->time / INT64CONST(1000);
+ if (pg_add_s64_overflow(result, ms, &result))
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+
+ return result;
+}
+
Datum
interval_eq(PG_FUNCTION_ARGS)
{
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index da427f4d4a..0bc37e0a54 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4442,6 +4442,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subapplydelay;
int i,
ntups;
@@ -4494,9 +4495,15 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ {
+ appendPQExpBufferStr(query, " s.suborigin,\n");
+ appendPQExpBufferStr(query, " s.subapplydelay\n");
+ }
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ {
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBufferStr(query, " 0 AS subapplydelay\n");
+ }
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4524,6 +4531,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subapplydelay = PQfnumber(res, "subapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4554,6 +4562,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subapplydelay =
+ strtoi64(PQgetvalue(res, i, i_subapplydelay), NULL, 10);
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4633,6 +4643,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '" INT64_FORMAT " ms'", subinfo->subapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 427f5d45f6..26d03141d8 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int64 subapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 2eae519b1d..b408a6a60c 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6471,7 +6471,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6513,10 +6513,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Apply delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 4c45e4747a..05b677443b 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1873,7 +1873,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3195,7 +3195,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 7b98714f30..3894b97aca 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -70,6 +70,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
XLogRecPtr subskiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 subapplydelay; /* Replication apply delay */
+
NameData subname; /* Name of the subscription */
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
@@ -119,6 +121,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 applydelay; /* Replication apply delay */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
bool enabled; /* Indicates if the subscription is enabled */
diff --git a/src/include/datatype/timestamp.h b/src/include/datatype/timestamp.h
index d155f1b03b..d5bbfad1c4 100644
--- a/src/include/datatype/timestamp.h
+++ b/src/include/datatype/timestamp.h
@@ -127,6 +127,8 @@ struct pg_itm_in
#define SECS_PER_MINUTE 60
#define MINS_PER_HOUR 60
+#define MSECS_PER_DAY INT64CONST(86400000)
+
#define USECS_PER_DAY INT64CONST(86400000000)
#define USECS_PER_HOUR INT64CONST(3600000000)
#define USECS_PER_MINUTE INT64CONST(60000000)
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index 76b7b4a3ca..7b2ad3e329 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -106,6 +106,8 @@ extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern int64 interval_to_ms(const Interval *interval);
+
extern TimestampTz time_t_to_timestamptz(pg_time_t tm);
extern pg_time_t timestamptz_to_time_t(TimestampTz t);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index c13d218dcf..3ac85dc3e7 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -135,10 +135,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -155,10 +155,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -167,10 +167,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -202,10 +202,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -239,19 +239,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -263,19 +263,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -290,10 +290,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more then once
@@ -308,10 +308,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -347,10 +347,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -359,10 +359,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -372,10 +372,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,18 +388,49 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid input syntax for type interval: "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: option "min_apply_delay" must not be negative
+-- success -- 123 ms
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- success -- interval is converted into ms and stored as integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = '4h 27min 35s');
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 16055000 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index eaeade8cce..5ef0a8abca 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -275,6 +275,26 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- success -- 123 ms
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
+-- success -- interval is converted into ms and stored as integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = '4h 27min 35s');
+
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/t/032_apply_delay.pl b/src/test/subscription/t/032_apply_delay.pl
new file mode 100644
index 0000000000..3d9e0b05f9
--- /dev/null
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,129 @@
+
+# Copyright (c) 2022, PostgreSQL Global Development Group
+
+# Test replication apply delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf('postgresql.conf',
+ "log_min_messages = debug2");
+$node_subscriber->start;
+
+# Create some preexisting content on publisher.
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b varchar)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (1, 'foo'), (2, 'bar')");
+
+# Setup structure on subscriber.
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b text, c timestamptz DEFAULT now(), d bigint DEFAULT 999)"
+);
+
+# Setup logical replication.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE test_tab");
+
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = on, min_apply_delay = '2s')"
+);
+
+# Wait for initial table sync to finish.
+$node_subscriber->wait_for_subscription_sync($node_publisher, $appname);
+
+# Check log starting now for logical replication apply delay.
+my $log_location = -s $node_subscriber->logfile;
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(2|1|2), 'check initial data was copied to subscriber');
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (3, 'baz')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (4, 'abc')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (5, 'def')");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(5|1|5), 'check if the new rows were applied to subscriber');
+
+check_apply_delay_log("logical replication apply delay");
+
+# Test streamed transaction.
+# Insert, update and delete enough rows to exceed 64kB limit.
+$node_publisher->safe_psql(
+ 'postgres', q{
+BEGIN;
+INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(6, 5000) s(i);
+UPDATE test_tab SET b = md5(b) WHERE mod(a, 2) = 0;
+DELETE FROM test_tab WHERE mod(a, 3) = 0;
+COMMIT;
+});
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(3334|1|5000), 'check if the new rows were applied to subscriber');
+
+check_apply_delay_log("logical replication apply delay");
+
+# Test ALTER SUBSCRIPTION. Delay 86460 seconds (1 day 1 minute).
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86460000)"
+);
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (0, 'foobar')");
+
+# Disable subscription. worker should die immediately.
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub DISABLE"
+);
+
+# Wait until worker dies.
+my $sub_query =
+ "SELECT count(1) = 0 FROM pg_stat_subscription WHERE subname = 'tap_sub' AND pid IS NOT NULL;";
+$node_subscriber->poll_query_until('postgres', $sub_query)
+ or die "Timed out while waiting for subscriber to die";
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
+
+sub check_apply_delay_log
+{
+ my $message = shift;
+ my $old_log_location = $log_location;
+
+ $log_location = $node_subscriber->wait_for_log(qr/$message/, $log_location);
+
+ cmp_ok($log_location, '>', $old_log_location,
+ "logfile contains triggered logical replication apply delay"
+ );
+}
--
2.30.0
Hi,
The thread title doesn't really convey the topic under discussion, so
I changed it. IIRC, others in this thread have mentioned this as
well.
On Sat, Nov 12, 2022 at 7:21 PM vignesh C <vignesh21@gmail.com> wrote:
Few comments:
1) I feel if the user has specified a long delay there is a chance
that replication may not continue if the replication slot falls behind
the current LSN by more than max_slot_wal_keep_size. I feel we should
add this reference in the documentation of min_apply_delay as the
replication will not continue in this case.
This makes sense to me.
2) I also noticed that if we have to shut down the publisher server
with a long min_apply_delay configuration, the publisher server cannot
be stopped as the walsender waits for the data to be replicated. Is
this behavior ok for the server to wait in this case? If this behavior
is ok, we could add a log message as it is not very evident from the
log files why the server could not be shut down.
I think for this case, the behavior should be the same as for physical
replication. Can you please check what the behavior is in physical
replication for the case you are worried about? Note that we already
have a similar parameter, recovery_min_apply_delay, for physical
replication.
--
With Regards,
Amit Kapila.
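To make point 1 above concrete: with a long min_apply_delay, the slot's restart position can trail the publisher's current WAL position by more than max_slot_wal_keep_size. The following is a rough sketch of that comparison, not server code. sketch_parse_lsn and sketch_slot_exceeds_keep_size are hypothetical helpers. The sketch assumes LSNs in the usual "hi/lo" hexadecimal text form and a limit given in megabytes with -1 meaning unlimited, and it ignores the WAL-segment granularity that the server itself works with.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Parse an LSN written as "hi/lo" in hexadecimal, e.g. "0/12345". */
static bool
sketch_parse_lsn(const char *text, uint64_t *lsn)
{
    unsigned int hi;
    unsigned int lo;
    char        extra;

    assert(text != NULL && lsn != NULL);

    /* Exactly two fields must match; trailing characters are rejected. */
    if (sscanf(text, "%X/%X%c", &hi, &lo, &extra) != 2)
        return false;

    *lsn = ((uint64_t) hi << 32) | lo;
    return true;
}

/*
 * Return true if the slot trails the current position by more than the
 * configured limit. keep_size_mb < 0 is treated as "no limit".
 */
static bool
sketch_slot_exceeds_keep_size(uint64_t current_lsn, uint64_t restart_lsn,
                              int64_t keep_size_mb)
{
    if (keep_size_mb < 0)
        return false;
    if (current_lsn <= restart_lsn)
        return false;
    return (current_lsn - restart_lsn) >
        (uint64_t) keep_size_mb * 1024 * 1024;
}
```

On a running system the same information can be read from the pg_replication_slots view instead of being computed by hand. The sketch only shows why a delay measured in hours or days interacts with a size-based WAL retention limit.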
On Mon, Nov 14, 2022 at 12:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Sat, Nov 12, 2022 at 7:21 PM vignesh C <vignesh21@gmail.com> wrote:
Few comments:
1) I feel if the user has specified a long delay there is a chance
that replication may not continue if the replication slot falls behind
the current LSN by more than max_slot_wal_keep_size. I feel we should
add this reference in the documentation of min_apply_delay as the
replication will not continue in this case.
This makes sense to me.
2) I also noticed that if we have to shut down the publisher server
with a long min_apply_delay configuration, the publisher server cannot
be stopped as the walsender waits for the data to be replicated. Is
this behavior ok for the server to wait in this case? If this behavior
is ok, we could add a log message as it is not very evident from the
log files why the server could not be shut down.
I think for this case, the behavior should be the same as for physical
replication. Can you please check what the behavior is for the case you
are worried about in physical replication? Note, we already have a
similar parameter, recovery_min_apply_delay, for physical
replication.
I don't understand the reason for the below change in the patch:
+ /*
+ * If this subscription has been disabled and it has an apply
+ * delay set, wake up the logical replication worker to finish
+ * it as soon as possible.
+ */
+ if (!opts.enabled && sub->applydelay > 0)
+ logicalrep_worker_wakeup(sub->oid, InvalidOid);
+
It seems to me Kuroda-San has proposed this change [1] to fix the test
but it is not clear to me why such a change is required. Why can't
CHECK_FOR_INTERRUPTS() after waiting, followed by the existing below
code [2] in LogicalRepApplyLoop() sufficient to handle parameter
updates?
[2]:
if (!in_remote_transaction && !in_streamed_transaction)
{
/*
* If we didn't get any transactions for a while there might be
* unconsumed invalidation messages in the queue, consume them
* now.
*/
AcceptInvalidationMessages();
maybe_reread_subscription();
...
[1]: /messages/by-id/TYAPR01MB5866F9716A18DA0C68A2CDB3F5469@TYAPR01MB5866.jpnprd01.prod.outlook.com
--
With Regards,
Amit Kapila.
Dear Amit,
I don't understand the reason for the below change in the patch:
+ /*
+ * If this subscription has been disabled and it has an apply
+ * delay set, wake up the logical replication worker to finish
+ * it as soon as possible.
+ */
+ if (!opts.enabled && sub->applydelay > 0)
+ logicalrep_worker_wakeup(sub->oid, InvalidOid);
+
It seems to me Kuroda-San has proposed this change [1] to fix the test
but it is not clear to me why such a change is required. Why can't
CHECK_FOR_INTERRUPTS() after waiting, followed by the existing below
code [2] in LogicalRepApplyLoop() sufficient to handle parameter
updates?
[2]
if (!in_remote_transaction && !in_streamed_transaction)
{
/*
* If we didn't get any transactions for a while there might be
* unconsumed invalidation messages in the queue, consume them
* now.
*/
AcceptInvalidationMessages();
maybe_reread_subscription();
...
I was referring to the case with a long min_apply_delay configuration.
The worker exits normally once apply_delay() has finished and control returns to
LogicalRepApplyLoop(). That works well if the delay is short and the worker wakes up
promptly. But with a long min_apply_delay the worker cannot leave the
while-loop, so the worker process remains for a long time. The test code
expects the worker to die immediately, and we have a
test case that tries to kill the worker with min_apply_delay = 1 day.
Also note that the launcher process will not set a latch or send a SIGTERM even
if the subscription is altered to enabled=f. In its main loop the
launcher reads pg_subscription periodically, but it does not take parameter
changes into account. It simply skips disabled subscriptions.
If this situation can be ignored, we may be able to remove these lines.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
On Wed, Nov 9, 2022 at 12:11 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Wed, 10 Aug 2022 17:33:00 -0300, "Euler Taveira" <euler@eulerto.com> wrote in
On Wed, Aug 10, 2022, at 9:39 AM, osumi.takamichi@fujitsu.com wrote:
Minor review comments for v6.
Thanks for your review. I'm attaching v7.
Using interval is not standard for this kind of parameter, but it seems
convenient. On the other hand, it's not great that the unit month
introduces some subtle ambiguity. This patch translates a month to 30
days but I'm not sure it's the right thing to do. Perhaps we shouldn't
allow units larger than days.
Agreed. Doesn't the same thing already apply to recovery_min_apply_delay,
for which the maximum unit seems to be days? If so, is there any
reason to do something different here?
apply_delay() chokes the message-receiving path so that a not-so-long
delay can cause a replication timeout to fire. I think we should
process walsender pings even while delaying. Needing to make
replication timeout longer than apply delay is not great, I think.
Again, I think for this case also the behavior should be similar to
how we handle recovery_min_apply_delay.
--
With Regards,
Amit Kapila.
On Mon, Nov 14, 2022 at 2:28 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
I don't understand the reason for the below change in the patch:
+ /*
+ * If this subscription has been disabled and it has an apply
+ * delay set, wake up the logical replication worker to finish
+ * it as soon as possible.
+ */
+ if (!opts.enabled && sub->applydelay > 0)
+ logicalrep_worker_wakeup(sub->oid, InvalidOid);
+
It seems to me Kuroda-San has proposed this change [1] to fix the test
but it is not clear to me why such a change is required. Why can't
CHECK_FOR_INTERRUPTS() after waiting, followed by the existing below
code [2] in LogicalRepApplyLoop() sufficient to handle parameter
updates?
[2]
if (!in_remote_transaction && !in_streamed_transaction)
{
/*
* If we didn't get any transactions for a while there might be
* unconsumed invalidation messages in the queue, consume them
* now.
*/
AcceptInvalidationMessages();
maybe_reread_subscription();
...
I was referring to the case with a long min_apply_delay configuration.
The worker exits normally once apply_delay() has finished and control returns to
LogicalRepApplyLoop(). That works well if the delay is short and the worker wakes up
promptly. But with a long min_apply_delay the worker cannot leave the
while-loop, so the worker process remains for a long time. The test code
expects the worker to die immediately, and we have a
test case that tries to kill the worker with min_apply_delay = 1 day.
So, why only honor the 'disable' option of the subscription? For
example, one can change 'min_apply_delay' and it seems
recoveryApplyDelay() honors a similar change in the recovery
parameter. Is there a way to set the latch of the worker process, so
that it can recheck if anything is changed?
--
With Regards,
Amit Kapila.
Dear Amit,
It seems to me Kuroda-San has proposed this change [1] to fix the test
but it is not clear to me why such a change is required. Why can't
CHECK_FOR_INTERRUPTS() after waiting, followed by the existing below
code [2] in LogicalRepApplyLoop() sufficient to handle parameter
updates?
(I forgot to say, this change was not proposed by me. I said that it should be
modified. I thought workers should wake up after the transaction was committed.)
So, why only honor the 'disable' option of the subscription? For
example, one can change 'min_apply_delay' and it seems
recoveryApplyDelay() honors a similar change in the recovery
parameter. Is there a way to set the latch of the worker process, so
that it can recheck if anything is changed?
I had not considered that, but it seems reasonable. We may be able to
call maybe_reread_subscription() if subscription parameters are changed
and the latch is set.
Currently, IIUC we try to disable the subscription regardless of the state, but
should we avoid rereading the catalog if workers are handling transactions,
as in LogicalRepApplyLoop()?
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
On Mon, Nov 14, 2022 at 6:52 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Dear Amit,
It seems to me Kuroda-San has proposed this change [1] to fix the test
but it is not clear to me why such a change is required. Why can't
CHECK_FOR_INTERRUPTS() after waiting, followed by the existing below
code [2] in LogicalRepApplyLoop() sufficient to handle parameter
updates?
(I forgot to say, this change was not proposed by me. I said that it should be
modified. I thought workers should wake up after the transaction was committed.)
So, why only honor the 'disable' option of the subscription? For
example, one can change 'min_apply_delay' and it seems
recoveryApplyDelay() honors a similar change in the recovery
parameter. Is there a way to set the latch of the worker process, so
that it can recheck if anything is changed?
I had not considered that, but it seems reasonable. We may be able to
call maybe_reread_subscription() if subscription parameters are changed
and the latch is set.
One more thing I would like you to consider is the point raised by me
related to this patch's interaction with the parallel apply feature as
mentioned in the email [1]. I am not sure the idea proposed in that
email [1] is a good one because delaying after applying commit may not
be good as we want to delay the apply of the transaction(s) on
subscribers by this feature. I feel this needs more thought.
Currently, IIUC we try to disable the subscription regardless of the state, but
should we avoid rereading the catalog if workers are handling transactions,
as in LogicalRepApplyLoop()?
IIUC, here you are referring to reading catalogs again via the
function maybe_reread_subscription(), right? If so, I think the idea
is to not invoke it frequently to avoid increasing transaction apply
time. However, when you are anyway going to wait for a delay, it may
not matter. I feel it would be better to add some comments saying that
we don't want workers to wait for a long time if users have disabled
the subscription or reduced the apply_delay time.
[1]: /messages/by-id/CAA4eK1JRs0v9Z65HWKEZg3quWx4LiQ=pddTJZ_P1koXsbR3TMA@mail.gmail.com
--
With Regards,
Amit Kapila.
On Mon, Nov 14, 2022 at 10:09, Takamichi Osumi (Fujitsu) <osumi.takamichi@fujitsu.com> wrote:
On Tuesday, November 8, 2022 2:27 PM Kuroda, Hayato/黒田 隼人 <kuroda.hayato@fujitsu.com> wrote:
If you could not allocate a time to discuss this problem because of other
important tasks or events, we would like to take over the thread and modify
your patch.
We've planned that we will start to address comments and reported bugs if
you would not respond by the end of this week.
Hi,
I've simply rebased the patch to make it applicable on top of HEAD
and make the tests pass. Note there are still open pending comments
and I'm going to start to address those.
I've written Euler as the original author in the commit message
to note his credit.
Hi
Thanks for the updated patch.
While reviewing the patch backlog, we have determined that this patch adds
one or more TAP tests but has not added the test to the "meson.build" file.
To do this, locate the relevant "meson.build" file for each test and add it
in the 'tests' dictionary, which will look something like this:
  'tap': {
    'tests': [
      't/001_basic.pl',
    ],
  },
For some additional details please see this Wiki article:
https://wiki.postgresql.org/wiki/Meson_for_patch_authors
For more information on the meson build system for PostgreSQL see:
https://wiki.postgresql.org/wiki/Meson
Regards
Ian Barwick
On Mon, 14 Nov 2022 at 12:14, Amit Kapila <amit.kapila16@gmail.com> wrote:
Hi,
The thread title doesn't really convey the topic under discussion, so
changed it. IIRC, this has been mentioned by others as well in the
thread.
On Sat, Nov 12, 2022 at 7:21 PM vignesh C <vignesh21@gmail.com> wrote:
Few comments:
1) I feel if the user has specified a long delay there is a chance
that replication may not continue if the replication slot falls behind
the current LSN by more than max_slot_wal_keep_size. I feel we should
add this reference in the documentation of min_apply_delay as the
replication will not continue in this case.
This makes sense to me.
2) I also noticed that if we have to shut down the publisher server
with a long min_apply_delay configuration, the publisher server cannot
be stopped as the walsender waits for the data to be replicated. Is
this behavior ok for the server to wait in this case? If this behavior
is ok, we could add a log message as it is not very evident from the
log files why the server could not be shut down.
I think for this case, the behavior should be the same as for physical
replication. Can you please check what the behavior is for the case you
are worried about in physical replication? Note, we already have a
similar parameter, recovery_min_apply_delay, for physical
replication.
In the case of physical replication by setting
recovery_min_apply_delay, I noticed that both primary and standby
nodes were getting stopped successfully immediately after the stop
server command. In case of logical replication, stop server fails:
pg_ctl -D publisher -l publisher.log stop -c
waiting for server to shut down............................................................... failed
pg_ctl: server does not shut down
In case of logical replication, the server does not get stopped
because the walsender process is not able to exit:
ps ux | grep walsender
vignesh 1950789 75.3 0.0 8695216 22284 ? Rs 11:51 1:08 postgres: walsender vignesh [local] START_REPLICATION
Regards,
Vignesh
On Wednesday, October 5, 2022 6:42 PM Peter Smith <smithpb2250@gmail.com> wrote:
Hi Euler, a long time ago you asked me a few questions about my previous review
[1].
Here are my replies, plus a few other review comments for patch v7-0001.
Hi, thank you for your comments.
======
1. doc/src/sgml/catalogs.sgml
+ <para>
+ Application delay of changes by a specified amount of time. The
+ unit is in milliseconds.
+ </para></entry>
The wording sounds a bit strange still. How about below
SUGGESTION
The length of time (ms) to delay the application of changes.
Fixed.
=======
2. Other documentation?
Maybe should say something on the Logical Replication Subscription page
about this? (31.2 Subscription)
Added.
=======
3. doc/src/sgml/ref/create_subscription.sgml
+ synchronized, this may lead to apply changes earlier than expected.
+ This is not a major issue because a typical setting of this parameter
+ are much larger than typical time deviations between servers.
Wording?
SUGGESTION
... than expected, but this is not a major issue because this parameter is
typically much larger than the time deviations between servers.
Fixed.
~~~
4. Q/A
From [2] you asked:
Should there also be a big warning box about the impact if using
synchronous_commit (like the other streaming replication page has this
warning)?
Impact? Could you elaborate?
~
I noticed the streaming replication docs for recovery_min_apply_delay has a big
red warning box saying that setting this GUC may block the synchronous
commits. So I was saying won’t a similar big red warning be needed also for
this min_apply_delay parameter if the delay is used in conjunction with a
publisher wanting synchronous commit because it might block everything?
I agree with you. Fixed.
~~~
4. Example
+<programlisting>
+CREATE SUBSCRIPTION foo
+ CONNECTION 'host=192.168.1.50 port=5432 user=foo dbname=foodb'
+ PUBLICATION baz
+ WITH (copy_data = false, min_apply_delay = '4h');
+</programlisting></para>
If the example named the subscription/publication as ‘mysub’ and ‘mypub’ I
think it would be more consistent with the existing examples.
Fixed.
======
5. src/backend/commands/subscriptioncmds.c - SubOpts
@@ -89,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int64 min_apply_delay;
} SubOpts;
I feel it would be better to be explicit about the storage units. So call this
member ‘min_apply_delay_ms’. E.g. then other code in
parse_subscription_options will be more natural when you are converting using
and assigning them to this member.
I don't think we use names that include units explicitly.
Could you please point me to a similar example?
~~~
6. - parse_subscription_options
+ /*
+ * If there is no unit, interval_in takes second as unit. This
+ * parameter expects millisecond as unit so add a unit (ms) if
+ * there isn't one.
+ */
The comment feels awkward. How about below
SUGGESTION
If no unit was specified, then explicitly add 'ms' otherwise the interval_in
function would assume 'seconds'
Fixed.
~~~
7. - parse_subscription_options
(This is a repeat of [1] review comment #12)
+ if (opts->min_apply_delay < 0 && IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("option \"%s\" must not be negative", "min_apply_delay"));
Why is this code here instead of inside the previous code block where the
min_apply_delay was assigned in the first place?
Changed.
======
8. src/backend/replication/logical/worker.c - apply_delay
+ * When min_apply_delay parameter is set on subscriber, we wait long enough to
+ * make sure a transaction is applied at least that interval behind the
+ * publisher.
"on subscriber" -> "on the subscription"
Fixed.
~~~
9.
+ * Apply delay only after all tablesync workers have reached READY state. A
+ * tablesync worker are kept until it reaches READY state. If we allow the
Wording ??
"A tablesync worker are kept until it reaches READY state." ??
I removed the sentence.
~~~
10.
10a.
+ /* nothing to do if no delay set */
Uppercase comment
/* Nothing to do if no delay set */
~
10b.
+ /* set apply delay */
Uppercase comment
/* Set apply delay */
Both are fixed.
~~~
11. - apply_handle_stream_prepare / apply_handle_stream_commit
The previous concern about incompatibility with the "Parallel Apply"
work (see [1] review comments #17, #18) is still a pending issue, isn't it?
Yes, I think so.
Kindly have a look at [1].
======
12. src/backend/utils/adt/timestamp.c interval_to_ms
+/*
+ * Given an Interval returns the number of milliseconds.
+ */
+int64
+interval_to_ms(const Interval *interval)
SUGGESTION
Returns the number of milliseconds in the specified Interval.
Fixed.
~~~
13.
+ /* adds portion time (in ms) to the previous result. */
Uppercase comment
/* Adds portion time (in ms) to the previous result. */
Fixed.
======
14. src/bin/pg_dump/pg_dump.c - getSubscriptions
+ {
+ appendPQExpBufferStr(query, " s.suborigin,\n");
+ appendPQExpBufferStr(query, " s.subapplydelay\n");
}
This could be done using just a single appendPQExpBufferStr if you want to
have 1 call instead of 2.
Made them together.
======
15. src/bin/psql/describe.c - describeSubscriptions
+ /* origin and min_apply_delay are only supported in v16 and higher */
Uppercase comment
/* Origin and min_apply_delay are only supported in v16 and higher */
Fixed.
======
16. src/include/catalog/pg_subscription.h
+ int64 subapplydelay; /* Replication apply delay */
+
Consider renaming this as 'subapplydelayms' to make the units perfectly clear.
As with comment 5, I can't find any existing examples of this.
I'd like to keep the name general, which I feel is more consistent with the
existing code.
======
17. src/test/regress/sql/subscription.sql
Is [1] review comment 21 (There are some test cases for CREATE
SUBSCRIPTION but there are no test cases for ALTER SUBSCRIPTION
changing this new parameter.) still a pending item?
Added one test case for alter subscription.
Also, I removed the logicalrep_worker_wakeup() call
that was triggered by AlterSubscription only when disabling the subscription.
This is now handled in a general manner by another patch proposed in [2].
There are still some pending comments for this patch,
but I'll share the current patch for now.
Lastly, thank you so much, Kuroda-san, for giving me so much advice and
so many suggestions for modifying this patch.
[1]: /messages/by-id/CAA4eK1JJFpgqE0ehAb7C9YFkJ-Xe-W1ZUPZptEfYjNJM4G-sLA@mail.gmail.com
[2]: /messages/by-id/20221122004119.GA132961@nathanxps13
Best Regards,
Takamichi Osumi
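The unit handling discussed in review comment 6 (a value given without a unit is taken as milliseconds, because the interval parser would otherwise read it as seconds) can be sketched in standalone C as follows. This is a sketch under stated assumptions, not the patch's code: demo_normalize_delay() is a hypothetical helper, it writes into a caller-supplied buffer instead of allocating, and its explicit empty-string guard is an addition of this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy "input" into "buf", appending "ms" when the input consists only
 * of decimal digits, i.e. when no unit was given. Returns 0 on success
 * and -1 if the result does not fit in buflen bytes.
 */
int
demo_normalize_delay(const char *input, char *buf, size_t buflen)
{
    int n;

    assert(input != NULL && buf != NULL);

    /* An empty string is all-digits in the strspn sense; leave it alone. */
    if (input[0] != '\0' && strspn(input, "0123456789") == strlen(input))
        n = snprintf(buf, buflen, "%sms", input);
    else
        n = snprintf(buf, buflen, "%s", input);

    return (n >= 0 && (size_t) n < buflen) ? 0 : -1;
}
```

Under this check "100" becomes "100ms", while "4h" is passed through unchanged for the interval parser to interpret.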
Attachments:
v9-0001-Time-delayed-logical-replication-subscriber.patch (application/octet-stream)
From 4e8ea4e327c32aa442a2113d91e6a2ef23bd7ca9 Mon Sep 17 00:00:00 2001
From: Takamichi Osumi <osumi.takamichi@fujitsu.com>
Date: Thu, 24 Nov 2022 14:35:36 +0000
Subject: [PATCH v9] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction commit for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs only on WAL records for transaction begins. The main
reason is to avoid keeping a transaction open for a long time. Regular
and prepared transactions are covered. Streamed transactions are also
covered.
Author: Euler Taveira
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 9 ++
doc/src/sgml/logical-replication.sgml | 6 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 55 ++++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 55 ++++++-
src/backend/replication/logical/worker.c | 108 +++++++++++++
src/backend/utils/adt/timestamp.c | 32 ++++
src/bin/pg_dump/pg_dump.c | 16 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/datatype/timestamp.h | 2 +
src/include/utils/timestamp.h | 2 +
src/test/regress/expected/subscription.out | 176 +++++++++++++--------
src/test/regress/sql/subscription.sql | 25 +++
src/test/subscription/meson.build | 1 +
src/test/subscription/t/032_apply_delay.pl | 129 +++++++++++++++
20 files changed, 563 insertions(+), 83 deletions(-)
create mode 100644 src/test/subscription/t/032_apply_delay.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 9ed2b020b7..708e3520c0 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7863,6 +7863,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subapplydelay</structfield> <type>int8</type>
+ </para>
+ <para>
+ The length of time (ms) to delay the application of changes.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index f8756389a3..8a6ed1bc5c 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,12 @@
target table.
</para>
+ <para>
+ Time delayed replica of subscription is available by indicating
+ <literal>min_apply_delay</literal>. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 1e8d72062b..d63aff1b90 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index f9a1776380..b472655ccb 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -333,7 +333,48 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, subscriber applies changes as soon as possible. As with
+ the physical replication feature
+ (<xref linkend="guc-recovery-min-apply-delay"/>), it can be useful to
+ have a time-delayed logical replica. This parameter allows you to
+ delay the application of changes by a specified amount of time. If
+ this value is specified without units, it is taken as milliseconds.
+ The default is zero, adding no delay.
+ </para>
+ <para>
+ The delay occurs only on WAL records for transaction begins and after
+ the initial table synchronization. It is possible that the
+ replication delay between publisher and subscriber exceeds the value
+ of this parameter, in which case no delay is added. Note that the
+ delay is calculated between the WAL time stamp as written on
+ publisher and the current time on the subscriber. Time spent in logical
+ decoding and in transferring the transaction may reduce the actual wait
+ time. If the system clocks on publisher and subscriber are not
+ synchronized, this may lead to apply changes earlier than expected,
+ but this is not a major issue because this parameter is typically much
+ larger than the time deviations between servers. Note that
+ in the case when this parameter is set to a long value, the
+ replication may not continue if the replication slot falls behind the
+ current LSN by more than <literal>max_slot_wal_keep_size</literal>.
+ See more details in <xref linkend="guc-max-slot-wal-keep-size"/>.
+ </para>
+ <warning>
+ <para>
+ Synchronous replication is affected by this setting when
+ <varname>synchronous_commit</varname> is set to
+ <literal>remote_write</literal>; every <literal>COMMIT</literal>
+ will need to wait to be applied.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -456,6 +497,18 @@ CREATE SUBSCRIPTION mysub
PUBLICATION insert_only
WITH (enabled = false);
</programlisting></para>
+
+ <para>
+ Create a subscription to a remote server that replicates tables in
+ the <literal>mypub</literal> publication and starts replicating immediately
+ on commit. Pre-existing data is not copied. The application of changes is
+ delayed by 4 hours.
+<programlisting>
+CREATE SUBSCRIPTION mysub
+ CONNECTION 'host=192.168.1.50 port=5432 user=foo dbname=foodb'
+ PUBLICATION mypub
+ WITH (copy_data = false, min_apply_delay = '4h');
+</programlisting></para>
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a506fc3ec8..d93e374ef4 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -64,6 +64,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->oid = subid;
sub->dbid = subform->subdbid;
sub->skiplsn = subform->subskiplsn;
+ sub->applydelay = subform->subapplydelay;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
sub->enabled = subform->subenabled;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 2d8104b090..a84019bf2a 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1299,9 +1299,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index d673557ea4..5ca7a05dae 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -47,6 +47,7 @@
#include "utils/memutils.h"
#include "utils/pg_lsn.h"
#include "utils/syscache.h"
+#include "utils/timestamp.h"
/*
* Options that can be specified by the user in CREATE/ALTER SUBSCRIPTION
@@ -65,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -89,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int64 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -145,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -323,6 +328,43 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ char *val,
+ *tmp;
+ Interval *interval;
+ int64 ms;
+
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ tmp = defGetString(defel);
+
+ /*
+ * If no unit was specified, then explicitly add 'ms' otherwise
+ * the interval_in function would assume 'seconds'.
+ */
+ if (strspn(tmp, "0123456789") == strlen(tmp))
+ val = psprintf("%sms", tmp);
+ else
+ val = tmp;
+
+ interval = DatumGetIntervalP(DirectFunctionCall3(interval_in,
+ CStringGetDatum(val),
+ ObjectIdGetDatum(InvalidOid),
+ Int32GetDatum(-1)));
+
+ ms = interval_to_ms(interval);
+ if (ms < 0 || ms > INT_MAX)
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("%lld ms is outside the valid range for option \"%s\"",
+ (long long) ms, "min_apply_delay"));
+
+ opts->min_apply_delay = ms;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -559,7 +601,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -624,6 +667,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subapplydelay - 1] = Int64GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1053,7 +1097,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1110,6 +1154,13 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ values[Anum_pg_subscription_subapplydelay - 1] =
+ Int64GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
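To illustrate the unit handling in the min_apply_delay branch of parse_subscription_options() above, here is a minimal standalone sketch. The helper name normalize_delay_value is hypothetical and not part of the patch, and it uses malloc()/snprintf() where the patch uses psprintf(). It only shows the digits-only test and the appended "ms" suffix; the real code then hands the string to interval_in().

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): return a newly allocated copy
 * of "input", with "ms" appended when the value consists only of digits.
 * The caller must free() the result. Returns NULL on allocation failure.
 */
char *
normalize_delay_value(const char *input)
{
	size_t		len = strlen(input);
	char	   *result;

	/* Same digits-only test as in the patch: no unit was given. */
	if (strspn(input, "0123456789") == len)
	{
		result = malloc(len + 3);	/* digits + "ms" + terminator */
		if (result != NULL)
			snprintf(result, len + 3, "%sms", input);
	}
	else
	{
		/* A unit (or something else) is present: pass it through as-is. */
		result = malloc(len + 1);
		if (result != NULL)
			memcpy(result, input, len + 1);
	}
	return result;
}
```

Under this sketch, "500" becomes "500ms" while "8h" is left unchanged. An empty string also passes the digits-only test and becomes "ms", so rejecting it is left to the interval parser.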
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index e48a3f589a..aff7bf7127 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -325,6 +325,8 @@ static void maybe_reread_subscription(void);
static void DisableSubscriptionAndExit(void);
+static void apply_delay(TimestampTz ts);
+
/* prototype needed because of stream_commit */
static void apply_dispatch(StringInfo s);
@@ -828,6 +830,80 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When the min_apply_delay parameter is set on the subscriber, we wait long
+ * enough to make sure a transaction is applied at least that interval behind
+ * the publisher.
+ *
+ * The delay is applied before the local transaction starts, not while it is
+ * open. The main reason for this design is to avoid a long-running
+ * transaction (which can cause operational problems) if the user sets a
+ * high value for the delay. This differs from physical replication, which
+ * applies the delay at commit time: on a subscriber, which accepts writes,
+ * holding a transaction open for that long can cause issues such as bloat
+ * and lock contention.
+ *
+ * The min_apply_delay parameter will take effect only after all tables are in
+ * READY state.
+ */
+static void
+apply_delay(TimestampTz ts)
+{
+ TimestampTz delay_until = 0;
+
+ /* Nothing to do if no delay set */
+ if (MySubscription->applydelay <= 0)
+ return;
+
+ /*
+ * Skip the delay until all tablesync workers have reached READY state. If
+ * we allowed the delay during the catchup phase, once we reach the limit of
+ * tablesync workers, it would impose a delay on each subsequent worker,
+ * so the initial table synchronization would take a long time to
+ * finish.
+ */
+ if (!AllTablesyncsReady())
+ return;
+
+ /* Compute the time until which apply must be delayed */
+ delay_until = TimestampTzPlusMilliseconds(ts,
+ MySubscription->applydelay);
+
+ while (true)
+ {
+ long diffms;
+
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delay_until);
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "logical replication apply delay: %ld ms", diffms);
+
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /*
+ * The worker may be woken up by ALTER SUBSCRIPTION ... DISABLE,
+ * so the pg_subscription catalog should be read again.
+ */
+ if (!in_remote_transaction && !in_streamed_transaction)
+ {
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+ }
+ }
+}
+
/*
* Handle BEGIN message.
*/
@@ -839,6 +915,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ apply_delay(begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -893,6 +972,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ apply_delay(begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1115,6 +1197,19 @@ apply_handle_stream_prepare(StringInfo s)
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
+ /*
+ * Should we delay the current prepared transaction?
+ *
+ * Although the delay is applied in BEGIN PREPARE messages, streamed
+ * prepared transactions apply the delay in a STREAM PREPARE message.
+ * That's ok because no changes have been applied yet
+ * (apply_spooled_messages() will do it).
+ * The STREAM START message does not contain a prepare time (it becomes
+ * available only when the in-progress prepared transaction finishes), so
+ * the delay cannot be applied at that point.
+ */
+ apply_delay(prepare_data.prepare_time);
+
/* Replay all the spooled operations. */
apply_spooled_messages(prepare_data.xid, prepare_data.prepare_lsn);
@@ -1506,6 +1601,19 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "received commit for streamed transaction %u", xid);
+ /*
+ * Should we delay the current transaction?
+ *
+ * Although the delay is applied in BEGIN messages, streamed transactions
+ * apply the delay in a STREAM COMMIT message. That's ok because no changes
+ * have been applied yet (apply_spooled_messages() will do it).
+ * The STREAM START message would be a natural choice for this delay, but
+ * it carries no commit time (that becomes available only when the
+ * in-progress transaction finishes), so the delay cannot be applied at
+ * that point.
+ */
+ apply_delay(commit_data.committime);
+
apply_spooled_messages(xid, commit_data.commit_lsn);
apply_handle_commit_internal(&commit_data);
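The wait computed inside apply_delay() can be sketched outside the server as plain arithmetic on microsecond timestamps. This is only an illustration under stated assumptions: remaining_delay_ms and sketch_timestamp are hypothetical names, the rounding up to whole milliseconds is a choice made for this sketch, and overflow of the addition is not checked here.

```c
#include <stdint.h>

/* Timestamps are microseconds since an arbitrary epoch, like TimestampTz. */
typedef int64_t sketch_timestamp;

/*
 * Hypothetical helper: how many milliseconds the apply worker would still
 * have to wait before applying a transaction that finished on the publisher
 * at "finish_ts", given the current time "now" and the configured delay.
 * Returns 0 when no delay is configured or the transaction is old enough.
 */
int64_t
remaining_delay_ms(sketch_timestamp now, sketch_timestamp finish_ts,
				   int64_t delay_ms)
{
	sketch_timestamp delay_until;
	int64_t		diff_us;

	/* Nothing to do if no delay is set. */
	if (delay_ms <= 0)
		return 0;

	/* The delay is in milliseconds, the timestamps in microseconds. */
	delay_until = finish_ts + delay_ms * INT64_C(1000);
	diff_us = delay_until - now;
	if (diff_us <= 0)
		return 0;

	/* Round up so the worker never wakes before delay_until. */
	return (diff_us + 999) / 1000;
}
```

In the patch, the loop recomputes this difference after every latch wakeup, so a wakeup caused by something other than the timeout simply leads to another, shorter wait.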
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index ef92323fd0..f25556b0f1 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -2425,6 +2425,38 @@ interval_cmp_internal(const Interval *interval1, const Interval *interval2)
return int128_compare(span1, span2);
}
+/*
+ * Returns the number of milliseconds in the specified Interval.
+ */
+int64
+interval_to_ms(const Interval *interval)
+{
+ int64 days;
+ int64 ms;
+ int64 result;
+
+ days = interval->month * INT64CONST(30);
+ days += interval->day;
+
+ /*
+ * Use the overflow-aware helpers for the arithmetic below. First
+ * convert the number of days to milliseconds.
+ */
+ if (pg_mul_s64_overflow(days, MSECS_PER_DAY, &result))
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+
+ /* Add the time portion (converted from us to ms) to the previous result. */
+ ms = interval->time / INT64CONST(1000);
+ if (pg_add_s64_overflow(result, ms, &result))
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+
+ return result;
+}
+
Datum
interval_eq(PG_FUNCTION_ARGS)
{
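For readers who want to check the arithmetic of interval_to_ms() and the range test in the option parser without a server build, the following is a rough standalone sketch. SketchInterval is a simplified stand-in for PostgreSQL's Interval, all sketch_ names are hypothetical, and the GCC/Clang builtins __builtin_mul_overflow and __builtin_add_overflow are used here in place of pg_mul_s64_overflow() and pg_add_s64_overflow(). Unlike the patch, it reports overflow through the return value instead of raising an error.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for PostgreSQL's Interval (an assumption of this sketch). */
typedef struct SketchInterval
{
	int64_t		time;			/* time portion, in microseconds */
	int32_t		day;
	int32_t		month;
} SketchInterval;

#define SKETCH_MSECS_PER_DAY INT64_C(86400000)

/*
 * Convert an interval to milliseconds, counting a month as 30 days.
 * Returns false on int64 overflow; *result is set only on success.
 */
bool
sketch_interval_to_ms(const SketchInterval *interval, int64_t *result)
{
	int64_t		days = (int64_t) interval->month * 30 + interval->day;
	int64_t		ms;

	/* Days to milliseconds. */
	if (__builtin_mul_overflow(days, SKETCH_MSECS_PER_DAY, &ms))
		return false;

	/* Add the time portion, converted from microseconds to milliseconds. */
	if (__builtin_add_overflow(ms, interval->time / 1000, &ms))
		return false;

	*result = ms;
	return true;
}

/* Range accepted for min_apply_delay in the patch: 0 .. INT_MAX milliseconds. */
bool
sketch_delay_in_range(int64_t ms)
{
	return ms >= 0 && ms <= INT_MAX;
}
```

With a 32-bit int, INT_MAX milliseconds is a little under 25 days, so a value such as '1 month' converts without overflow but would then be rejected by the range check, while '8h' (28800000 ms) is accepted.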
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index da427f4d4a..a8e493526b 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4442,6 +4442,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subapplydelay;
int i,
ntups;
@@ -4494,9 +4495,14 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ {
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBufferStr(query, " 0 AS subapplydelay\n");
+ }
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4524,6 +4530,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subapplydelay = PQfnumber(res, "subapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4554,6 +4561,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subapplydelay =
+ strtoi64(PQgetvalue(res, i, i_subapplydelay), NULL, 10);
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4633,6 +4642,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '" INT64_FORMAT " ms'", subinfo->subapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 427f5d45f6..26d03141d8 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int64 subapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 2eae519b1d..de7a052697 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6471,7 +6471,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6513,10 +6513,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Apply delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 13014f074f..6f606a1ebd 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1880,7 +1880,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3202,7 +3202,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 7b98714f30..3894b97aca 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -70,6 +70,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
XLogRecPtr subskiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 subapplydelay; /* Replication apply delay (in ms) */
+
NameData subname; /* Name of the subscription */
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
@@ -119,6 +121,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 applydelay; /* Replication apply delay (in ms) */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
bool enabled; /* Indicates if the subscription is enabled */
diff --git a/src/include/datatype/timestamp.h b/src/include/datatype/timestamp.h
index d155f1b03b..d5bbfad1c4 100644
--- a/src/include/datatype/timestamp.h
+++ b/src/include/datatype/timestamp.h
@@ -127,6 +127,8 @@ struct pg_itm_in
#define SECS_PER_MINUTE 60
#define MINS_PER_HOUR 60
+#define MSECS_PER_DAY INT64CONST(86400000)
+
#define USECS_PER_DAY INT64CONST(86400000000)
#define USECS_PER_HOUR INT64CONST(3600000000)
#define USECS_PER_MINUTE INT64CONST(60000000)
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index 7fd0b58825..b94efaf530 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -102,6 +102,8 @@ extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern int64 interval_to_ms(const Interval *interval);
+
extern TimestampTz time_t_to_timestamptz(pg_time_t tm);
extern pg_time_t timestamptz_to_time_t(TimestampTz t);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index c13d218dcf..1cb2fe3fb3 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -135,10 +135,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -155,10 +155,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -167,10 +167,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -202,10 +202,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -239,19 +239,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -263,19 +263,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -290,10 +290,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more then once
@@ -308,10 +308,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -347,10 +347,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -359,10 +359,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -372,10 +372,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,18 +388,58 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid input syntax for type interval: "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1000 ms is outside the valid range for option "min_apply_delay"
+-- success -- 123 ms
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- success -- 86400000 ms
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 86400000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- success -- interval is converted into ms and stored as integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = '4h 27min 35s');
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 16055000 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index eaeade8cce..7a4e818857 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -275,6 +275,31 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- success -- 123 ms
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+
+\dRs+
+
+-- success -- 86400000 ms
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1d');
+
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
+-- success -- interval is converted into ms and stored as integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = '4h 27min 35s');
+
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/meson.build b/src/test/subscription/meson.build
index 85d1dd9295..bb15d062b8 100644
--- a/src/test/subscription/meson.build
+++ b/src/test/subscription/meson.build
@@ -36,6 +36,7 @@ tests += {
't/029_on_error.pl',
't/030_origin.pl',
't/031_column_list.pl',
+ 't/032_apply_delay.pl',
't/100_bugs.pl',
],
},
diff --git a/src/test/subscription/t/032_apply_delay.pl b/src/test/subscription/t/032_apply_delay.pl
new file mode 100644
index 0000000000..ad8e4e200d
--- /dev/null
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,129 @@
+
+# Copyright (c) 2022, PostgreSQL Global Development Group
+
+# Test replication apply delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf('postgresql.conf',
+ "log_min_messages = debug2");
+$node_subscriber->start;
+
+# Create some preexisting content on publisher.
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b varchar)");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (1, 'foo'), (2, 'bar')");
+
+# Setup structure on subscriber.
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b text, c timestamptz DEFAULT now(), d bigint DEFAULT 999)"
+);
+
+# Setup logical replication.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE test_tab");
+
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = on, min_apply_delay = '3s')"
+);
+
+# Wait for initial table sync to finish.
+$node_subscriber->wait_for_subscription_sync($node_publisher, $appname);
+
+# Check log starting now for logical replication apply delay.
+my $log_location = -s $node_subscriber->logfile;
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(2|1|2), 'check initial data was copied to subscriber');
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (3, 'baz')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (4, 'abc')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (5, 'def')");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(5|1|5), 'check if the new rows were applied to subscriber');
+
+check_apply_delay_log("logical replication apply delay");
+
+# Test streamed transaction.
+# Insert, update and delete enough rows to exceed 64kB limit.
+$node_publisher->safe_psql(
+ 'postgres', q{
+BEGIN;
+INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(6, 5000) s(i);
+UPDATE test_tab SET b = md5(b) WHERE mod(a, 2) = 0;
+DELETE FROM test_tab WHERE mod(a, 3) = 0;
+COMMIT;
+});
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(3334|1|5000), 'check if the new rows were applied to subscriber');
+
+check_apply_delay_log("logical replication apply delay");
+
+# Test ALTER SUBSCRIPTION. Delay 86460 seconds (1 day 1 minute).
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86460000)"
+);
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (0, 'foobar')");
+
+# Disable subscription. worker should die immediately.
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub DISABLE;"
+);
+
+# Wait until worker dies.
+my $sub_query =
+ "SELECT count(1) = 0 FROM pg_stat_subscription WHERE subname = 'tap_sub' AND pid IS NOT NULL;";
+$node_subscriber->poll_query_until('postgres', $sub_query)
+ or die "Timed out while waiting for subscriber to die";
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
+
+sub check_apply_delay_log
+{
+ my $message = shift;
+ my $old_log_location = $log_location;
+
+ $log_location = $node_subscriber->wait_for_log(qr/$message/, $log_location);
+
+ cmp_ok($log_location, '>', $old_log_location,
+ "logfile contains triggered logical replication apply delay"
+ );
+}
--
2.30.0
On Wednesday, November 16, 2022 12:58 PM Ian Lawrence Barwick <barwick@gmail.com> wrote:
On Mon, Nov 14, 2022 at 10:09, Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
I've simply rebased the patch to make it applicable on top of HEAD and
make the tests pass. Note there are still open pending comments and
I'm going to start to address those.
Thanks for the updated patch.
While reviewing the patch backlog, we have determined that this patch adds
one or more TAP tests but has not added the test to the "meson.build" file.
To do this, locate the relevant "meson.build" file for each test and add it in the
'tests' dictionary, which will look something like this:
'tap': {
  'tests': [
    't/001_basic.pl',
  ],
},
For some additional details please see this Wiki article:
https://wiki.postgresql.org/wiki/Meson_for_patch_authors
For more information on the meson build system for PostgreSQL see:
Hi, thanks for your notification.
You are right. Modified.
The updated patch can be seen in [1].
[1]: /messages/by-id/TYCPR01MB8373775ECC6972289AF8CB30ED0F9@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
Hi,
On Tuesday, November 22, 2022 6:15 PM vignesh C <vignesh21@gmail.com> wrote:
On Mon, 14 Nov 2022 at 12:14, Amit Kapila <amit.kapila16@gmail.com> wrote:
Hi,
The thread title doesn't really convey the topic under discussion, so
changed it. IIRC, this has been mentioned by others as well in the
thread.
On Sat, Nov 12, 2022 at 7:21 PM vignesh C <vignesh21@gmail.com> wrote:
Few comments:
1) I feel if the user has specified a long delay there is a chance
that replication may not continue if the replication slot falls
behind the current LSN by more than max_slot_wal_keep_size. I feel
we should add this reference in the documentation of min_apply_delay
as the replication will not continue in this case.
This makes sense to me.
Modified accordingly. The updated patch is in [1].
2) I also noticed that if we have to shut down the publisher server
with a long min_apply_delay configuration, the publisher server
cannot be stopped as the walsender waits for the data to be
replicated. Is this behavior ok for the server to wait in this case?
If this behavior is ok, we could add a log message as it is not very
evident from the log files why the server could not be shut down.
I think for this case, the behavior should be the same as for physical
replication. Can you please check what the behavior is for the case you
are worried about in physical replication? Note, we already have a
similar parameter, recovery_min_apply_delay, for physical
replication.
In the case of physical replication by setting recovery_min_apply_delay, I
noticed that both primary and standby nodes were getting stopped successfully
immediately after the stop server command. In case of logical replication, stop
server fails:
pg_ctl -D publisher -l publisher.log stop -c
waiting for server to shut down............................................................... failed
pg_ctl: server does not shut down
In case of logical replication, the server does not get stopped because the
walsender process is not able to exit:
ps ux | grep walsender
vignesh 1950789 75.3 0.0 8695216 22284 ? Rs 11:51 1:08 postgres: walsender vignesh [local] START_REPLICATION
Thanks, I could reproduce this and I'll update this point in a subsequent version.
[1]: /messages/by-id/TYCPR01MB8373775ECC6972289AF8CB30ED0F9@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
Hi,
On Monday, November 14, 2022 7:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Nov 9, 2022 at 12:11 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Wed, 10 Aug 2022 17:33:00 -0300, "Euler Taveira"
<euler@eulerto.com> wrote in
On Wed, Aug 10, 2022, at 9:39 AM, osumi.takamichi@fujitsu.com wrote:
Minor review comments for v6.
Thanks for your review. I'm attaching v7.
Using interval is not standard for this kind of parameter but it seems
convenient. On the other hand, it's not great that the unit month
introduces some subtle ambiguity. This patch translates a month to 30
days but I'm not sure it's the right thing to do. Perhaps we shouldn't
allow units larger than days.
Agreed. Doesn't the same thing already apply to recovery_min_apply_delay, for
which the maximum unit seems to be days? If so, there is no reason to do
something different here?
The corresponding parameter for physical replication has an
upper limit of INT_MAX ms (that is, 24 days is OK, but 25 days isn't).
I added this test in the patch posted in [1].
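For illustration only (this is not code from the patch): a rough Python model of the arithmetic discussed here. It assumes the delay is stored in milliseconds, that a month counts as 30 days as the patch does, and that the upper bound is INT_MAX ms. The names delay_to_ms and check_delay_ms are made up for this sketch.

```python
INT_MAX = 2**31 - 1  # upper bound discussed in the thread, taken here as milliseconds

# Milliseconds per unit. "mon" follows the patch's choice of 30 days per month.
MS_PER_UNIT = {
    "ms": 1,
    "s": 1000,
    "min": 60 * 1000,
    "h": 60 * 60 * 1000,
    "d": 24 * 60 * 60 * 1000,
    "mon": 30 * 24 * 60 * 60 * 1000,
}


def delay_to_ms(**parts):
    """Sum unit/value pairs such as h=4, min=27, s=35 into milliseconds."""
    return sum(MS_PER_UNIT[unit] * value for unit, value in parts.items())


def check_delay_ms(ms):
    """Reject values outside 0..INT_MAX, modelled on the range check in the patch."""
    if ms < 0 or ms > INT_MAX:
        raise ValueError(
            f'{ms} ms is outside the valid range for option "min_apply_delay"'
        )
    return ms
```

Under these assumptions, '4h 27min 35s' comes out as 16055000 ms and '1d' as 86400000 ms, matching the regression output, and 24 days fits under INT_MAX while 25 days does not.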
apply_delay() chokes the message-receiving path so that a not-so-long
delay can cause a replication timeout to fire. I think we should
process walsender pings even while delaying. Needing to make
replication timeout longer than apply delay is not great, I think.
Again, I think for this case also the behavior should be similar to how we handle
recovery_min_apply_delay.
Yes, I agree with you.
This feature makes it easier to trigger the publisher's timeout,
which can't be observed in the physical replication.
I'll do the investigation and modify this point in a subsequent version.
[1]: /messages/by-id/TYCPR01MB8373775ECC6972289AF8CB30ED0F9@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
Hi,
On Thursday, August 11, 2022 7:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Aug 9, 2022 at 3:52 AM Euler Taveira <euler@eulerto.com> wrote:
On Wed, Aug 3, 2022, at 10:27 AM, Amit Kapila wrote:
Your explanation makes sense to me. The other point to consider is
that there can be cases where we may not apply operation for the
transaction because of empty transactions (we don't yet skip empty
xacts for prepared transactions). So, won't it be better to apply the
delay just before we apply the first change for a transaction? Do we
want to apply the delay during table sync as we sometimes do need to
enter apply phase while doing table sync?
I thought about the empty transactions but decided to not complicate
the code mainly because skipping transactions is not a code path that
will slow down this feature. As explained in the documentation, there
is no harm in delaying a transaction for more than min_apply_delay; it
cannot apply earlier. Having said that I decided to do nothing. I'm
also not sure if it deserves a comment or if this email is a possible explanation
for this decision.
I don't know what makes you think it will complicate the code. But anyway
thinking further about the way apply_delay is used at various places in the patch,
as pointed out by Peter Smith it seems it won't work for the parallel apply
feature where we start applying the transaction immediately after start stream.
I was wondering why don't we apply delay after each commit of the transaction
rather than at the begin command. We can remember if the transaction has
made any change and if so then after commit, apply the delay. If we can do that
then it will alleviate the concern of empty and skipped xacts as well.
I agree with this direction. I'll update this point in a subsequent patch.
Another thing I was wondering is how to determine what is a good delay time for
tests, and found that the current tests in replay_delay.pl use 3s, so should we use
the same for apply delay tests in this patch as well?
Fixed in the patch posted in [1].
[1]: /messages/by-id/TYCPR01MB8373775ECC6972289AF8CB30ED0F9@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
On Fri, Nov 25, 2022 at 2:15 AM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
On Wednesday, October 5, 2022 6:42 PM Peter Smith <smithpb2250@gmail.com> wrote:
...
======
5. src/backend/commands/subscriptioncmds.c - SubOpts
@@ -89,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int64 min_apply_delay;
} SubOpts;
I feel it would be better to be explicit about the storage units. So call this
member 'min_apply_delay_ms'. E.g. then other code in
parse_subscription_options will be more natural when you are converting using
and assigning them to this member.
I don't think we use such names including units explicitly.
Could you please tell me a similar example for this?
Regex search "\..*_ms[e\s]" finds some members where the unit is in
the member name.
e.g. delay_ms (see EnableTimeoutParams in timeout.h)
e.g. interval_in_ms (see timeout_params in timeout.c)
Regex search ".*_ms[e\s]" finds many local variables where the unit is
in the variable name
======
16. src/include/catalog/pg_subscription.h
+ int64 subapplydelay; /* Replication apply delay */
+
Consider renaming this as 'subapplydelayms' to make the units perfectly clear.
Similar to the 5th comment, I can't find any examples for this.
I'd like to keep it general, which makes me feel it is more aligned with
existing code.
As above.
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Tue, Nov 15, 2022 at 12:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Nov 14, 2022 at 6:52 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Dear Amit,
It seems to me Kuroda-San has proposed this change [1] to fix the test
but it is not clear to me why such a change is required. Why isn't
CHECK_FOR_INTERRUPTS() after waiting, followed by the existing below
code [2] in LogicalRepApplyLoop(), sufficient to handle parameter
updates?
(I forgot to say, this change was not proposed by me. I said that it should be
modified. I thought workers should wake up after the transaction was committed.)
So, why only honor the 'disable' option of the subscription? For
example, one can change 'min_apply_delay' and it seems
recoveryApplyDelay() honors a similar change in the recovery
parameter. Is there a way to set the latch of the worker process, so
that it can recheck if anything is changed?
I have not considered it, but it seems reasonable. We may be able to
call maybe_reread_subscription() if subscription parameters are changed
and the latch is set.
One more thing I would like you to consider is the point raised by me
related to this patch's interaction with the parallel apply feature as
mentioned in the email [1]. I am not sure the idea proposed in that
email [1] is a good one because delaying after applying commit may not
be good as we want to delay the apply of the transaction(s) on
subscribers by this feature. I feel this needs more thought.
I have thought a bit more about this and we have the following options
to choose the delay point from. (a) apply delay just before committing
a transaction. As mentioned in comments in the patch this can lead to
bloat and locks held for a long time. (b) apply delay before starting
to apply changes for a transaction but here the problem is which time
to consider. In some cases, like for streaming transactions, we don't
receive the commit/prepare xact time in the start message. (c) use (b)
but use the previous transaction's commit time. (d) apply delay after
committing a transaction by using the xact's commit time.
At this stage, among above, I feel any one of (c) or (d) is worth
considering. Now, the difference between (c) and (d) is that if after
commit the next xact's data is already delayed by more than
min_apply_delay time then we don't need to kick the new logic of apply
delay.
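To make the comparison concrete, here is a small Python sketch (an illustration, not the patch's logic) of the wait computation that options (c) and (d) have in common: given a commit time, decide how long the apply worker would still have to wait. The function name and the use of plain millisecond timestamps are assumptions for this example.

```python
def remaining_delay_ms(commit_time_ms, now_ms, min_apply_delay_ms):
    """Return how many milliseconds are left before the change may be applied.

    The change becomes eligible at commit_time_ms + min_apply_delay_ms.
    If that moment has already passed, no further wait is needed and 0 is returned.
    """
    apply_at_ms = commit_time_ms + min_apply_delay_ms
    return max(0, apply_at_ms - now_ms)
```

With a 3000 ms delay, a transaction committed at t=1000 and seen at t=2000 would wait another 2000 ms, while the same transaction seen at t=5000 is already old enough and would be applied without invoking the delay logic.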
The other thing to consider is whether we need to process any keepalive
messages during the delay because otherwise, walsender may think that
the subscriber is not available and time out. This may not be a
problem for synchronous replication but otherwise, it could be a
problem.
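One possible shape for this, sketched in Python purely as an illustration: instead of one long sleep, the wait is cut into slices no longer than some keepalive interval, so that a status message could be sent between slices. The function name wait_slices and the keepalive_interval_ms parameter are hypothetical, and nothing here is taken from the patch.

```python
def wait_slices(total_delay_ms, keepalive_interval_ms):
    """Split one long wait into slices of at most keepalive_interval_ms.

    A worker following this plan would sleep for each slice in turn and
    could send feedback to the walsender after every slice.
    """
    if keepalive_interval_ms <= 0:
        raise ValueError("keepalive_interval_ms must be positive")
    slices = []
    remaining = total_delay_ms
    while remaining > 0:
        step = min(remaining, keepalive_interval_ms)
        slices.append(step)
        remaining -= step
    return slices
```

For example, a 2500 ms delay with a 1000 ms interval gives slices of 1000, 1000 and 500 ms, and the slices always add up to the requested delay.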
Thoughts?
--
With Regards,
Amit Kapila.
Here are some review comments for patch v9-0001:
======
GENERAL
1. min_ prefix?
What's the significance of the "min_" prefix for this parameter? I'm
guessing the background is that at one time it was considered to be a
GUC so took a name similar to GUC recovery_min_apply_delay (??)
But in practice, I think it is meaningless and/or misleading. For
example, suppose the user wants to defer replication by 1hr. IMO it is
more natural to just say "defer replication by 1 hr" (aka
apply_delay='1hr'). Clearly it means replication will take place about
1 hr into the future. OTOH saying "defer replication by a MINIMUM of 1
hr" (aka min_apply_delay='1hr') is quite vague because then it is
equally valid if the replication gets delayed by 1 hr or 2 hrs or 5
days or 3 weeks since all of those satisfy the minimum delay. The
implementation could hardwire a delay of INT_MAX ms but clearly,
that's not really what the user would expect.
~
So, I think this parameter should be renamed just as 'apply_delay'.
But, if you still decide to keep it as 'min_apply_delay' then there is
a lot of other code that ought to be changed to be consistent with
that name.
e.g.
- subapplydelay in catalogs.sgml --> subminapplydelay
- subapplydelay in system_views.sql --> subminapplydelay
- subapplydelay in pg_subscription.h --> subminapplydelay
- subapplydelay in dump.h --> subminapplydelay
- i_subapplydelay in pg_dump.c --> i_subminapplydelay
- applydelay member name of Form_pg_subscription --> minapplydelay
- "Apply Delay" for the column name displayed by describe.c --> "Min
apply delay"
- more...
(IMO the fact that so much code does not currently say 'min' at all is
just evidence that the 'min' prefix didn't really mean much in
the first place)
======
doc/src/sgml/catalogs.sgml
2. Section 31.2 Subscription
+ <para>
+ Time delayed replica of subscription is available by indicating
+ <literal>min_apply_delay</literal>. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
How about saying like:
SUGGESTION
The subscriber replication can be instructed to lag behind the
publisher side changes by specifying the
<literal>min_apply_delay</literal> subscription parameter. See XXX for
details.
======
doc/src/sgml/ref/create_subscription.sgml
3. min_apply_delay
+ <para>
+ By default, subscriber applies changes as soon as possible. As with
+ the physical replication feature
+ (<xref linkend="guc-recovery-min-apply-delay"/>), it can be useful to
+ have a time-delayed logical replica. This parameter allows you to
+ delay the application of changes by a specified amount of time. If
+ this value is specified without units, it is taken as milliseconds.
+ The default is zero, adding no delay.
+ </para>
"subscriber applies" -> "the subscriber applies"
"allows you" -> "lets the user"
"The default is zero, adding no delay." -> "The default is zero (no delay)."
~
4.
+ larger than the time deviations between servers. Note that
+ in the case when this parameter is set to a long value, the
+ replication may not continue if the replication slot falls behind the
+ current LSN by more than <literal>max_slot_wal_keep_size</literal>.
+ See more details in <xref linkend="guc-max-slot-wal-keep-size"/>.
+ </para>
4a.
SUGGESTION
Note that if this parameter is set to a long delay, the replication
will stop if the replication slot falls behind the current LSN by more
than <literal>max_slot_wal_keep_size</literal>.
~
4b.
When it is rendered (like below) it looks a bit repetitive:
... if the replication slot falls behind the current LSN by more than
max_slot_wal_keep_size. See more details in max_slot_wal_keep_size.
~
IMO the previous sentence should include the link.
SUGGESTION
if the replication slot falls behind the current LSN by more than
<link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
~
5.
+ <para>
+ Synchronous replication is affected by this setting when
+ <varname>synchronous_commit</varname> is set to
+ <literal>remote_write</literal>; every <literal>COMMIT</literal>
+ will need to wait to be applied.
+ </para>
Yes, this deserves a big warning -- but I am just not quite sure of
the details. I think this impacts more than just "remote_write" --
e.g. the same problem would happen if "synchronous_standby_names" is
non-empty.
I think this warning needs to be more generic to cover everything.
Maybe something like below
SUGGESTION:
Delaying the replication can mean there is a much longer time between
making a change on the publisher, and that change being committed on
the subscriber. This can have a big impact on synchronous replication.
See https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-SYNCHRONOUS-COMMIT
======
src/backend/commands/subscriptioncmds.c
6. parse_subscription_options
+ ms = interval_to_ms(interval);
+ if (ms < 0 || ms > INT_MAX)
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("%lld ms is outside the valid range for option \"%s\"",
+ (long long) ms, "min_apply_delay"));
"for option" -> "for parameter"
======
src/backend/replication/logical/worker.c
7. apply_delay
+static void
+apply_delay(TimestampTz ts)
IMO having a delay is not the usual case. So, would a better name for
this function be 'maybe_delay'?
~
8.
+ * high value for the delay. This design is different from the physical
+ * replication (that applies the delay at commit time) mainly because write
+ * operations may allow some issues (such as bloat and locks) that can be
+ * minimized if it does not keep the transaction open for such a long time.
Something seems not quite right with this wording -- is there a better
way of describing this?
~
9.
+ /*
+ * Delay apply until all tablesync workers have reached READY state. If we
+ * allow the delay during the catchup phase, once we reach the limit of
+ * tablesync workers, it will impose a delay for each subsequent worker.
+ * It means it will take a long time to finish the initial table
+ * synchronization.
+ */
+ if (!AllTablesyncsReady())
+ return;
"Delay apply until..." -> "The min_apply_delay parameter is ignored until..."
~
10.
+ /*
+ * The worker may be waken because of the ALTER SUBSCRIPTION ...
+ * DISABLE, so the catalog pg_subscription should be read again.
+ */
+ if (!in_remote_transaction && !in_streamed_transaction)
+ {
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+ }
+ }
"waken" -> "woken"
======
src/bin/psql/describe.c
11. describeSubscriptions
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Apply delay"));
IIUC the psql command is supposed to display useful information to the
user, so I wondered if it is worthwhile to put the units in this
column header -- "Apply delay (ms)" instead of just "Apply delay"
because that would make it far easier to understand the meaning
without having to check the documentation to discover the units.
======
src/include/utils/timestamp.h
12.
+extern int64 interval_to_ms(const Interval *interval);
+
For consistency with the other interval conversion functions exposed
here maybe this one should have been called 'interval2ms'
======
src/test/subscription/t/032_apply_delay.pl
13.
IIUC this test is checking if a delay has occurred by inspecting the
debug logs to see if a certain code path including "logical
replication apply delay" is logged. I guess that is OK, but another
way might be to compare the actual timing values of the published and
replicated rows.
The publisher table can have a column with default now() and the
subscriber side table can have an *additional* column also with
default now(). After replication, those two timestamp values can be
compared to check if the difference exceeds the min_apply_delay
parameter specified.
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Tue, Dec 6, 2022 at 1:30 PM Peter Smith <smithpb2250@gmail.com> wrote:
Here are some review comments for patch v9-0001:
======
GENERAL
1. min_ prefix?
What's the significance of the "min_" prefix for this parameter? I'm
guessing the background is that at one time it was considered to be a
GUC so took a name similar to GUC recovery_min_apply_delay (??)
But in practice, I think it is meaningless and/or misleading. For
example, suppose the user wants to defer replication by 1hr. IMO it is
more natural to just say "defer replication by 1 hr" (aka
apply_delay='1hr') Clearly it means replication will take place about
1 hr into the future. OTOH saying "defer replication by a MINIMUM of 1
hr" (aka min_apply_delay='1hr') is quite vague because then it is
equally valid if the replication gets delayed by 1 hr or 2 hrs or 5
days or 3 weeks since all of those satisfy the minimum delay. The
implementation could hardwire a delay of INT_MAX ms but clearly,
that's not really what the user would expect.
There is another way to look at this naming. It is quite possible that the
user has set the value to '1 second' and the transaction is delayed by more
than that, say because the publisher delayed sending it. There could be
various reasons for such a delay: the publisher was busy processing
another workload, the replication connection between publisher and
subscriber was not working, etc. Moreover, the name will be
similar to that of the same parameter for physical replication. So, I think
keeping min in the name is a good idea.
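To make the "minimum" semantics concrete, here is a minimal sketch in standalone C. It assumes the delay is measured from the transaction's commit timestamp on the publisher; remaining_delay_ms() and its parameters are hypothetical names, not part of the patch. If the transaction reaches the subscriber late, the subscriber waits less (or not at all), so the total lag is at least, but not exactly, min_apply_delay.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Milliseconds the subscriber still has to wait before applying a
 * transaction, or 0 if min_apply_delay has already elapsed.
 * Assumption: all three values are in milliseconds on the same clock.
 */
int64_t
remaining_delay_ms(int64_t commit_ts_ms, int64_t now_ms,
                   int64_t min_apply_delay_ms)
{
    int64_t delay_until;

    assert(min_apply_delay_ms >= 0);
    delay_until = commit_ts_ms + min_apply_delay_ms;

    /* A late arrival has already "served" part or all of the delay. */
    return (now_ms < delay_until) ? delay_until - now_ms : 0;
}
```

With a 1000 ms delay, a transaction committed at t=1000 and seen at t=1200 waits another 800 ms, while one seen at t=5000 is applied at once.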
--
With Regards,
Amit Kapila.
On Friday, December 2, 2022 4:05 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Nov 15, 2022 at 12:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
One more thing I would like you to consider is the point raised by me
related to this patch's interaction with the parallel apply feature as
mentioned in the email [1]. I am not sure the idea proposed in that
email [1] is a good one because delaying after applying commit may not
be good as we want to delay the apply of the transaction(s) on
subscribers by this feature. I feel this needs more thought.
I have thought a bit more about this and we have the following options to
choose the delay point from. (a) apply delay just before committing a
transaction. As mentioned in comments in the patch this can lead to bloat and
locks held for a long time. (b) apply delay before starting to apply changes for a
transaction but here the problem is which time to consider. In some cases, like
for streaming transactions, we don't receive the commit/prepare xact time in
the start message. (c) use (b) but use the previous transaction's commit time.
(d) apply delay after committing a transaction by using the xact's commit time.
At this stage, among above, I feel any one of (c) or (d) is worth considering. Now,
the difference between (c) and (d) is that if after commit the next xact's data is
already delayed by more than min_apply_delay time then we don't need to kick
the new logic of apply delay.
The other thing to consider whether we need to process any keepalive
messages during the delay because otherwise, walsender may think that the
subscriber is not available and time out. This may not be a problem for
synchronous replication but otherwise, it could be a problem.
Thoughts?
Hi,
Thank you for your comments !
Below is some analysis of the major points above.
(1) About the timing to apply the delay
A variant of approach (b) would be best. The idea is to delay the application of all types
of transactions based on the time when a transaction arrives at the subscriber node.
One advantage of this approach over (c) and (d) is that it avoids the case
where we might apply a transaction immediately without waiting,
if there are two sequential transactions and the time between them exceeds the min_apply_delay time.
When we receive stream-in-progress transactions, this approach first checks whether the delay
time has passed.
(2) About the timeout issue
Looking at the physical replication internals,
sending feedback and applying the delay are done separately, in different processes.
OTOH, logical replication needs to do both within one process.
When we want to apply the delay and avoid the timeout,
we should not keep all the transaction data in memory.
So, one approach is to serialize the transaction data and apply it
after the delay. However, this means that if users adopt this feature,
all transaction data that should be delayed would be serialized.
We are not sure whether this is a valid approach.
Another approach is to divide the delay time in apply_delay(),
use the divided time for WaitLatch, and send the keepalive messages from there.
But this approach requires some changes at the libpq layer
(like implementing a new function for the wal receiver in order to monitor whether
the data from the publisher is readable there).
Probably, the first idea of serializing the delayed transactions is better on this point.
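The second idea (slicing the delay) might look roughly like the following standalone C sketch. It is only an illustration under stated assumptions: delay_with_feedback(), the callback types and the stub callbacks are hypothetical, and a real apply worker would wait on its latch and talk to the walsender instead of calling stubs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical hooks: wait for one slice, then report liveness upstream. */
typedef void (*wait_fn) (int64_t ms, void *arg);
typedef void (*feedback_fn) (void *arg);

/*
 * Wait delay_ms in slices of at most slice_ms, calling send_feedback after
 * every slice so the sender never sees a silent peer for longer than
 * slice_ms. Returns the number of feedback calls made.
 */
int
delay_with_feedback(int64_t delay_ms, int64_t slice_ms,
                    wait_fn do_wait, feedback_fn send_feedback, void *arg)
{
    int sent = 0;

    assert(slice_ms > 0);
    while (delay_ms > 0)
    {
        int64_t step = (delay_ms < slice_ms) ? delay_ms : slice_ms;

        do_wait(step, arg);
        send_feedback(arg);
        delay_ms -= step;
        sent++;
    }
    return sent;
}

/* Stub for demonstration: record the total time "waited" in *arg. */
void
stub_wait(int64_t ms, void *arg)
{
    *(int64_t *) arg += ms;
}

/* Stub for demonstration: a real worker would send a reply message here. */
void
stub_feedback(void *arg)
{
    (void) arg;
}
```

Choosing slice_ms well below the sender's timeout (for example half of it) is the design point: a 60 s delay with 5 s slices produces twelve feedback messages instead of one long silence.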
Any feedback is welcome.
Best Regards,
Takamichi Osumi
Hi,
The tests fail on cfbot:
https://cirrus-ci.com/task/4533866329800704
They only seem to fail on 32bit linux.
https://api.cirrus-ci.com/v1/artifact/task/4533866329800704/testrun/build-32/testrun/subscription/032_apply_delay/log/regress_log_032_apply_delay
[06:27:10.628](0.138s) ok 2 - check if the new rows were applied to subscriber
timed out waiting for match: (?^:logical replication apply delay) at /tmp/cirrus-ci-build/src/test/subscription/t/032_apply_delay.pl line 124.
Greetings,
Andres Freund
At Tue, 6 Dec 2022 11:08:43 -0800, Andres Freund <andres@anarazel.de> wrote in
Hi,
The tests fail on cfbot:
https://cirrus-ci.com/task/4533866329800704
They only seem to fail on 32bit linux.
https://api.cirrus-ci.com/v1/artifact/task/4533866329800704/testrun/build-32/testrun/subscription/032_apply_delay/log/regress_log_032_apply_delay
[06:27:10.628](0.138s) ok 2 - check if the new rows were applied to subscriber
timed out waiting for match: (?^:logical replication apply delay) at /tmp/cirrus-ci-build/src/test/subscription/t/032_apply_delay.pl line 124.
It fails for me on 64bit Linux.. (Rocky 8.7)
t/032_apply_delay.pl ............... Dubious, test returned 29 (wstat 7424, 0x1d00)
No subtests run
..
t/032_apply_delay.pl (Wstat: 7424 Tests: 0 Failed: 0)
Non-zero exit status: 29
Parse errors: No plan found in TAP output
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Tue, Dec 6, 2022 at 5:44 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
Hi,
Thank you for your comments !
Below are some analysis for the major points above.
(1) About the timing to apply the delay
One approach of (b) would be best. The idea is to delay all types of transaction's application
based on the time when one transaction arrives at the subscriber node.
But I think it will unnecessarily add the delay when there is a delay
in sending the transaction by the publisher (say due to the reason
that publisher was busy handling other workloads or there was a
temporary network communication break between publisher and
subscriber). This could probably be the reason why physical
replication (via recovery_min_apply_delay) uses the commit time of the
sending side.
One advantage of this approach over (c) and (d) is that this can avoid the case
where we might apply a transaction immediately without waiting,
if there are two transactions sequentially and the time in between exceeds the min_apply_delay time.
I am not sure if I understand your point. However, I think even if the
transactions are sequential but if the time between them exceeds (say
because the publisher was down) min_apply_delay, there is no need to
apply additional delay.
When we receive stream-in-progress transactions, we'll check whether the time for delay
has passed or not at first in this approach.
(2) About the timeout issue
When having a look at the physical replication internals,
it conducts sending feedback and application of delay separately on different processes.
OTOH, the logical replication needs to achieve those within one process.
When we want to apply delay and avoid the timeout,
we should not store all the transactions data into memory.
So, one approach for this is to serialize the transaction data and after the delay,
we apply the transactions data.
It is not clear to me how this will avoid a timeout.
However, this means if users adopt this feature,
then all transaction data that should be delayed would be serialized.
We are not sure if this sounds a valid approach or not.
One another approach was to divide the time of delay in apply_delay() and
utilize the divided time for WaitLatch and sends the keepalive messages from there.
Do we ever send keepalive messages from the apply side? I think we
only send feedback reply messages as a response to the publisher's
keep_alive message. So, we need to do something similar for this if
you want to follow this approach.
--
With Regards,
Amit Kapila.
On Wednesday, December 7, 2022 12:00 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
At Tue, 6 Dec 2022 11:08:43 -0800, Andres Freund <andres@anarazel.de> wrote in
Hi,
The tests fail on cfbot:
https://cirrus-ci.com/task/4533866329800704
They only seem to fail on 32bit linux.
https://api.cirrus-ci.com/v1/artifact/task/4533866329800704/testrun/build-32/testrun/subscription/032_apply_delay/log/regress_log_032_apply_delay
[06:27:10.628](0.138s) ok 2 - check if the new rows were applied to subscriber
timed out waiting for match: (?^:logical replication apply delay) at /tmp/cirrus-ci-build/src/test/subscription/t/032_apply_delay.pl line 124.
It fails for me on 64bit Linux.. (Rocky 8.7)
t/032_apply_delay.pl ............... Dubious, test returned 29 (wstat 7424, 0x1d00)
No subtests run
..
t/032_apply_delay.pl (Wstat: 7424 Tests: 0 Failed: 0)
Non-zero exit status: 29
Parse errors: No plan found in TAP output
regards.
Hi, thank you so much for your notifications !
I'll look into the failures.
Best Regards,
Takamichi Osumi
Hi Vignesh,
In the case of physical replication by setting
recovery_min_apply_delay, I noticed that both primary and standby
nodes were getting stopped successfully immediately after the stop
server command. In case of logical replication, stop server fails:
pg_ctl -D publisher -l publisher.log stop -c
waiting for server to shut down............................................................... failed
pg_ctl: server does not shut down
In case of logical replication, the server does not get stopped
because the walsender process is not able to exit:
ps ux | grep walsender
vignesh 1950789 75.3 0.0 8695216 22284 ? Rs 11:51 1:08 postgres: walsender vignesh [local] START_REPLICATION
Thanks for reporting the issue. I have analyzed it.
This issue occurs because the apply worker cannot reply during the delay.
I think we may have to modify the mechanism that delays applying transactions.
When a walsender process is requested to shut down, it can exit only after
all the WAL it has sent has been replicated on the subscriber. This check is done in
WalSndDone(), and the replicated position is updated when the walsender handles
the reply messages from a subscriber, in ProcessStandbyReplyMessage().
In the case of physical replication, the walreceiver can receive WAL and reply
even if the application is delayed. It means that the replicated position is
reported upstream immediately, so the walsender can exit.
In the case of logical replication, however, with this patch the worker currently
cannot reply to the walsender while delaying the transaction. As a result,
the replicated position is never reported upstream and the walsender cannot
exit.
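The shutdown condition described above can be modeled with a small standalone C sketch. The types and functions (SketchSender, sender_handle_reply(), sender_can_exit()) are hypothetical simplifications, not the real WalSndDone() or ProcessStandbyReplyMessage() code; they only show why a subscriber that stops replying keeps the walsender from exiting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t SketchLsn;     /* hypothetical stand-in for a WAL position */

typedef struct SketchSender
{
    SketchLsn sent;             /* last WAL position sent downstream */
    SketchLsn flushed;          /* last position the subscriber confirmed */
} SketchSender;

/* Called whenever a reply message arrives from the subscriber. */
void
sender_handle_reply(SketchSender *s, SketchLsn reported_flush)
{
    assert(s != NULL);
    if (reported_flush > s->flushed)
        s->flushed = reported_flush;
}

/* Shutdown may complete only once everything sent has been confirmed. */
bool
sender_can_exit(const SketchSender *s)
{
    assert(s != NULL);
    return s->flushed >= s->sent;
}
```

In this model, an apply worker that sleeps through its delay never calls the equivalent of sender_handle_reply(), so sender_can_exit() stays false for as long as the delay lasts.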
Based on the above analysis, we can conclude that the worker must update the
flushpos and reply to the walsender while delaying the transaction if we want
to solve the issue. This cannot be done with the current approach; a newer
proposed one [1] may be able to solve this, although it is still under discussion.
Note that a similar issue can be reproduced with physical replication.
When wal_sender_timeout is set to 0 and the network between primary and
secondary breaks after the primary has sent WAL to the secondary, we cannot stop
the primary node.
[1]: /messages/by-id/TYCPR01MB8373FA10EB2DB2BF8E458604ED1B9@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Dear Andres,
Thanks for reporting! I have analyzed the problem and found the root cause.
This feature seemed not to work on 32-bit OSes. This was because the calculation
of delay_time was wrong. The first argument of this should be TimestampTz datatype, not Datum:
```
+ /* Set apply delay */
+ delay_until = TimestampTzPlusMilliseconds(TimestampTzGetDatum(ts),
+ MySubscription->applydelay);
```
In more detail, the Datum representation of an int64 holds the value itself
on 64-bit OSes, but on 32-bit it holds a pointer to the value.
After fixing this, the feature will work on 32-bit environments.
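The failure mode can be reproduced in miniature with standalone C. This sketch assumes a pass-by-reference representation like the one described for 32-bit builds; SketchDatum and the helper functions are hypothetical stand-ins, not PostgreSQL's Datum API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uintptr_t SketchDatum;  /* hypothetical stand-in for Datum */

/* Pass-by-reference packing: the datum holds a pointer to the int64. */
SketchDatum
int64_to_datum_byref(int64_t v)
{
    int64_t *p = malloc(sizeof(int64_t));

    assert(p != NULL);
    *p = v;
    return (SketchDatum) p;
}

/* Unpacking has to dereference the pointer to get the value back. */
int64_t
datum_to_int64_byref(SketchDatum d)
{
    return *(int64_t *) d;
}

/* Correct: add the delay to the timestamp value (microseconds). */
int64_t
delay_until_correct(int64_t ts_us, int64_t delay_ms)
{
    return ts_us + delay_ms * 1000;
}

/* Buggy pattern: the datum (an address here) is used as the timestamp. */
int64_t
delay_until_buggy(SketchDatum ts_datum, int64_t delay_ms)
{
    return (int64_t) ts_datum + delay_ms * 1000;
}
```

When the datum happens to carry the value itself, both functions agree, which would explain why the problem stayed hidden on 64-bit machines.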
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Hi,
On Friday, December 9, 2022 3:38 PM Kuroda, Hayato/黒田 隼人 <kuroda.hayato@fujitsu.com> wrote:
Thanks for reporting! I have analyzed the problem and found the root cause.
This feature seemed not to work on 32-bit OSes. This was because the
calculation of delay_time was wrong. The first argument of this should be
TimestampTz datatype, not Datum:
```
+ /* Set apply delay */
+ delay_until = TimestampTzPlusMilliseconds(TimestampTzGetDatum(ts),
+ MySubscription->applydelay);
```
In more detail, the datum representation of int64 contains the value itself on
64-bit OSes, but it contains the pointer to the value on 32-bit.
After modifying the issue, this will work on 32-bit environments.
Thank you for your analysis.
Yeah, it seems that on 32-bit machines we mistakenly add the delay
to the pointer value returned by TimestampTzGetDatum().
I'll remove the call in my next version.
Best Regards,
Takamichi Osumi
Hello.
I asked about unexpected walsender termination caused by this feature,
but I don't think I received an answer, and the behavior
still exists.
Specifically, if servers have the following settings, the walsender
terminates due to replication timeout. After that, the connection is restored
once the LR delay elapses. Although it can be said to be working in
that sense, the error recurs roughly every min_apply_delay
interval and is hard to distinguish from network trouble. I'm not
sure whether you're deliberately okay with it, but I don't think behavior
that causes replication timeouts is acceptable.
wal_sender_timeout = 10s;
wal_receiver_timeout = 10s;
create subscription ... with (min_apply_delay='60s');
This is somewhat artificial, but timeout=60s and delay=5m is not an
uncommon setup, and that also causes this "issue".
subscriber:
2022-12-12 14:17:18.139 JST LOG: terminating walsender process due to replication timeout
2022-12-12 14:18:11.076 JST LOG: starting logical decoding for slot "s1"
...
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Wednesday, December 7, 2022 2:07 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Dec 6, 2022 at 5:44 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
(1) About the timing to apply the delay
One approach of (b) would be best. The idea is to delay all types of
transaction's application based on the time when one transaction arrives at
the subscriber node.
But I think it will unnecessarily add the delay when there is a delay in sending
the transaction by the publisher (say due to the reason that publisher was busy
handling other workloads or there was a temporary network communication
break between publisher and subscriber). This could probably be the reason
why physical replication (via recovery_min_apply_delay) uses the commit time of
the sending side.
You are right. Approach (b) adds extra (unnecessary) delay
caused by network communication or machine trouble in streaming-in-progress cases.
We agree that approach (b) has this disadvantage.
One advantage of this approach over (c) and (d) is that this can avoid
the case where we might apply a transaction immediately without
waiting, if there are two transactions sequentially and the time in between
exceeds the min_apply_delay time.
I am not sure if I understand your point. However, I think even if the
transactions are sequential but if the time between them exceeds (say because
the publisher was down) min_apply_delay, there is no need to apply additional
delay.
I'm sorry, my description was not accurate.
As for approach (c), kindly imagine that two transactions (txn1, txn2) are executed
on the publisher side and the publisher tries to send both of them to the subscriber.
Here, there is no network trouble and the publisher isn't busy with other workloads.
However, the time difference between txn1 and txn2 exceeds "min_apply_delay"
(which is set on the subscriber).
In this case, when txn2 is a stream-in-progress transaction,
we don't apply any delay to txn2 when it arrives on the subscriber.
That is because, before txn2 reaches the subscriber, "min_apply_delay"
has already passed on the publisher side.
This means there is a case where we don't apply any delay if we choose approach (c).
Approach (d) has a similar disadvantage.
IIUC, in this approach the subscriber applies the delay after committing a transaction,
based on the commit/prepare time on the publisher side. Kindly imagine that two transactions
are executed on the publisher and the 2nd transaction completes after the subscriber's delay
for the 1st transaction. Again, there is no network trouble and no heavy workload on the publisher.
In that case, the delay for txn1 has already finished when the 2nd transaction
arrives on the subscriber, so the 2nd transaction is applied immediately without delay.
Another new discussion point is to use (b) together with the stream commit/stream prepare time,
and apply the delay immediately before applying the spool files of the transactions
in the stream-in-progress transaction cases.
Does anyone have an opinion on those approaches?
Lastly, thanks to Amit-san and Kuroda-san for giving me
so much off-list feedback about those significant points.
Best Regards,
Takamichi Osumi
Dear Amit,
This is a reply to the later part of your e-mail.
(2) About the timeout issue
When having a look at the physical replication internals,
it conducts sending feedback and application of delay separately on different
processes.
OTOH, the logical replication needs to achieve those within one process.
When we want to apply delay and avoid the timeout,
we should not store all the transactions data into memory.
So, one approach for this is to serialize the transaction data and after the delay,
we apply the transactions data.
It is not clear to me how this will avoid a timeout.
First, the reason the timeout occurs is that, while delaying, the apply
worker neither reads messages from the walsender nor replies to it.
The worker's last_recv_timeout is not updated because it does not receive
messages. This leads to wal_receiver_timeout. Similarly, the walsender's
last_processing is not updated, and the walsender exits due to the timeout because the
worker does not reply upstream.
Based on the above, we thought that workers must receive and handle messages
even if they are delaying the application of transactions. In more detail, workers must
keep iterating the outer loop in LogicalRepApplyLoop().
If workers receive transactions but need to delay applying them, they must keep
the messages somewhere. So we came up with the idea that workers serialize changes
first and apply them later. Our basic design is as follows:
* All transactions are serialized to files if min_apply_delay is set to non-zero.
* After receiving the commit message and waiting for the delay, workers read and
apply the spooled messages
However, this means if users adopt this feature,
then all transaction data that should be delayed would be serialized.
We are not sure if this sounds a valid approach or not.
One another approach was to divide the time of delay in apply_delay() and
utilize the divided time for WaitLatch and sends the keepalive messages from
there.
Do we anytime send keepalive messages from the apply side? I think we
only send feedback reply messages as a response to the publisher's
keep_alive message. So, we need to do something similar for this if
you want to follow this approach.
Right, and the above mechanism is needed for workers to understand messages
and send feedback replies as a response to the publisher's keepalive message.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
On Monday, December 12, 2022 2:54 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
I asked about unexpected walsender termination caused by this feature but I
think I didn't receive an answer for it and the behavior still exists.
Specifically, if servers have the following settings, walsender terminates for
replication timeout. After that, connection is restored after the LR delay elapses.
Although it can be said to be working in that sense, the error happens
repeatedly every about min_apply_delay intervals but is hard to distinguish
from network troubles. I'm not sure you're deliberately okay with it but, I don't
think the behavior causing replication timeouts is acceptable.
wal_sender_timeout = 10s;
wal_receiver_timeout = 10s;
create subscription ... with (min_apply_delay='60s');
This is a kind of artificial but timeout=60s and delay=5m is not an uncommon
setup and that also causes this "issue".
subscriber:
2022-12-12 14:17:18.139 JST LOG: terminating walsender process due to replication timeout
2022-12-12 14:18:11.076 JST LOG: starting logical decoding for slot "s1"
...
Hi, Horiguchi-san
Thank you so much for your report!
Yes. Currently, how to deal with the timeout issue is under discussion.
Some analysis of the root cause is also available.
Kindly have a look at [1].
[1]: /messages/by-id/TYAPR01MB58669394A67F2340B82E42D1F5E29@TYAPR01MB5866.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
On Tuesday, December 6, 2022 5:00 PM Peter Smith <smithpb2250@gmail.com> wrote:
Here are some review comments for patch v9-0001:
Hi, thank you for your reviews !
======
GENERAL
1. min_ prefix?
What's the significance of the "min_" prefix for this parameter? I'm guessing the
background is that at one time it was considered to be a GUC so took a name
similar to GUC recovery_min_apply_delay (??)
But in practice, I think it is meaningless and/or misleading. For example,
suppose the user wants to defer replication by 1hr. IMO it is more natural to
just say "defer replication by 1 hr" (aka
apply_delay='1hr') Clearly it means replication will take place about
1 hr into the future. OTOH saying "defer replication by a MINIMUM of 1 hr" (aka
min_apply_delay='1hr') is quite vague because then it is equally valid if the
replication gets delayed by 1 hr or 2 hrs or 5 days or 3 weeks since all of those
satisfy the minimum delay. The implementation could hardwire a delay of
INT_MAX ms but clearly, that's not really what the user would expect.
~
So, I think this parameter should be renamed just as 'apply_delay'.
But, if you still decide to keep it as 'min_apply_delay' then there is a lot of other
code that ought to be changed to be consistent with that name.
e.g.
- subapplydelay in catalogs.sgml --> subminapplydelay
- subapplydelay in system_views.sql --> subminapplydelay
- subapplydelay in pg_subscription.h --> subminapplydelay
- subapplydelay in dump.h --> subminapplydelay
- i_subapplydelay in pg_dump.c --> i_subminapplydelay
- applydelay member name of Form_pg_subscription --> minapplydelay
- "Apply Delay" for the column name displayed by describe.c --> "Min apply
delay"
I followed the suggestion to keep the "min_" prefix in [1]/messages/by-id/CAA4eK1J9HEL-U32FwkHXLOGXPV_Fu+nb+1KpV7hTbnqbBNnDUQ@mail.gmail.com.
Fixed.
- more...
(IMO the fact that so much code does not currently say 'min' at all is just
evidence that the 'min' prefix really didn't really mean much in the first place)
======
doc/src/sgml/catalogs.sgml
2. Section 31.2 Subscription
+ <para>
+ Time delayed replica of subscription is available by indicating
+ <literal>min_apply_delay</literal>. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
How about saying like:
SUGGESTION
The subscriber replication can be instructed to lag behind the publisher side
changes by specifying the <literal>min_apply_delay</literal> subscription
parameter. See XXX for details.
Fixed.
======
doc/src/sgml/ref/create_subscription.sgml
3. min_apply_delay
+ <para>
+ By default, subscriber applies changes as soon as possible. As with
+ the physical replication feature
+ (<xref linkend="guc-recovery-min-apply-delay"/>), it can be useful to
+ have a time-delayed logical replica. This parameter allows you to
+ delay the application of changes by a specified amount of time. If
+ this value is specified without units, it is taken as milliseconds.
+ The default is zero, adding no delay.
+ </para>
"subscriber applies" -> "the subscriber applies"
"allows you" -> "lets the user"
"The default is zero, adding no delay." -> "The default is zero (no delay)."
Fixed.
~
4.
+ larger than the time deviations between servers. Note that
+ in the case when this parameter is set to a long value, the
+ replication may not continue if the replication slot falls behind the
+ current LSN by more than <literal>max_slot_wal_keep_size</literal>.
+ See more details in <xref linkend="guc-max-slot-wal-keep-size"/>.
+ </para>
4a.
SUGGESTION
Note that if this parameter is set to a long delay, the replication will stop if the
replication slot falls behind the current LSN by more than
<literal>max_slot_wal_keep_size</literal>.
Fixed.
~
4b.
When it is rendered (like below) it looks a bit repetitive:
... if the replication slot falls behind the current LSN by more than
max_slot_wal_keep_size. See more details in max_slot_wal_keep_size.
Thanks! Fixed the redundancy.
~
IMO the previous sentence should include the link.
SUGGESTION
if the replication slot falls behind the current LSN by more than
<link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
Fixed.
~
5.
+ <para> + Synchronous replication is affected by this setting when + <varname>synchronous_commit</varname> is set to + <literal>remote_write</literal>; every <literal>COMMIT</literal> + will need to wait to be applied. + </para>Yes, this deserves a big warning -- but I am just not quite sure of the details. I
think this impacts more than just "remote_rewrite" -- e.g. the same problem
would happen if "synchronous_standby_names" is non-empty.I think this warning needs to be more generic to cover everything.
Maybe something like belowSUGGESTION:
Delaying the replication can mean there is a much longer time between making
a change on the publisher, and that change being committed on the subscriber.
This can have a big impact on synchronous replication.
See
https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-SYNCHRONOUS-COMMIT
Fixed.
======
src/backend/commands/subscriptioncmds.c
6. parse_subscription_options
+ ms = interval_to_ms(interval);
+ if (ms < 0 || ms > INT_MAX)
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("%lld ms is outside the valid range for option \"%s\"",
+ (long long) ms, "min_apply_delay"));
"for option" -> "for parameter"
Fixed.
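For readers following item 6, the unit defaulting and the range check can be sketched outside the server as below. This is only an illustration under stated assumptions: normalize_delay_string and delay_ms_in_range are hypothetical names, and plain snprintf stands in for the server's psprintf. It mirrors the digits-only test and the "ms < 0 || ms > INT_MAX" condition quoted from the patch, nothing more.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): if the value consists only of
 * digits, append "ms" so that a later interval parse does not assume seconds;
 * otherwise copy the value unchanged.
 * Returns 0 on success, -1 if the output buffer is too small.
 */
static int
normalize_delay_string(const char *in, char *buf, size_t buflen)
{
	int			n;

	if (strspn(in, "0123456789") == strlen(in))
		n = snprintf(buf, buflen, "%sms", in);
	else
		n = snprintf(buf, buflen, "%s", in);

	return (n < 0 || (size_t) n >= buflen) ? -1 : 0;
}

/*
 * Hypothetical helper (not part of the patch): same condition as the quoted
 * range check, returning nonzero when the value is acceptable.
 */
static int
delay_ms_in_range(long long ms)
{
	return ms >= 0 && ms <= INT_MAX;
}
```

With this sketch, "100" becomes "100ms" while "4h" is passed through untouched. Note that a value such as "1.5" also passes through unchanged because it is not digits-only.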
======
src/backend/replication/logical/worker.c
7. apply_delay
+static void
+apply_delay(TimestampTz ts)
IMO having a delay is not the usual case. So, would a better name for this
function be 'maybe_delay'?
Makes sense. I followed the naming of some other functions such as
maybe_reread_subscription and maybe_start_skipping_changes.
~
8.
+ * high value for the delay. This design is different from the physical
+ * replication (that applies the delay at commit time) mainly because write
+ * operations may allow some issues (such as bloat and locks) that can be
+ * minimized if it does not keep the transaction open for such a long time.
Something seems not quite right with this wording -- is there a better way of
describing this?
I reworded the entire paragraph. Could you please check?
~
9.
+ /*
+ * Delay apply until all tablesync workers have reached READY state. If we
+ * allow the delay during the catchup phase, once we reach the limit of
+ * tablesync workers, it will impose a delay for each subsequent worker.
+ * It means it will take a long time to finish the initial table
+ * synchronization.
+ */
+ if (!AllTablesyncsReady())
+ return;
"Delay apply until..." -> "The min_apply_delay parameter is ignored until..."
Fixed.
~
10.
+ /*
+ * The worker may be waken because of the ALTER SUBSCRIPTION ...
+ * DISABLE, so the catalog pg_subscription should be read again.
+ */
+ if (!in_remote_transaction && !in_streamed_transaction)
+ {
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+ }
+ }
"waken" -> "woken"
I removed this sentence as part of a new change that recalculates
diffms whenever the "min_apply_delay" parameter is updated.
Please have a look at maybe_delay_apply().
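The recalculation mentioned above can be pictured with a small stand-alone sketch. It assumes timestamps are plain millisecond counters and uses a hypothetical helper name, remaining_delay_ms; the real code works with TimestampTz values and the subscription catalog, so treat this only as an illustration of the arithmetic performed on each wakeup.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): milliseconds still to wait
 * before a transaction whose commit time is commit_ms (publisher clock) may
 * be applied, given the current subscriber time now_ms and the currently
 * configured delay. Returns 0 when the transaction is already due.
 *
 * Assumes the inputs are small enough that the addition cannot overflow.
 */
static int64_t
remaining_delay_ms(int64_t commit_ms, int64_t now_ms,
				   int64_t min_apply_delay_ms)
{
	int64_t		target = commit_ms + min_apply_delay_ms;

	return (target > now_ms) ? target - now_ms : 0;
}
```

Because the remaining time is derived from the current setting on every iteration, a worker that is woken after ALTER SUBSCRIPTION lowers the delay sees a smaller (possibly zero) remainder and stops waiting, instead of sleeping out the old value.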
======
src/bin/psql/describe.c
11. describeSubscriptions
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Apply delay"));
IIUC the psql command is supposed to display useful information to the user, so
I wondered if it is worthwhile to put the units in this column header -- "Apply
delay (ms)" instead of just "Apply delay"
because that would make it far easier to understand the meaning without
having to check the documentation to discover the units.
Fixed.
======
src/include/utils/timestamp.h
12.
+extern int64 interval_to_ms(const Interval *interval);
+
For consistency with the other interval conversion functions exposed here
maybe this one should have been called 'interval2ms'.
Fixed.
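As background for item 12, the conversion itself can be sketched in stand-alone C. The assumptions are loud ones: SketchInterval is a simplified stand-in for the server's Interval (sub-day time in microseconds, plus day and month counts), a month is counted as 30 days as in the patch, and the GCC/Clang overflow builtins replace the server's pg_mul_s64_overflow and pg_add_s64_overflow helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for PostgreSQL's Interval (an assumption of this sketch). */
typedef struct
{
	int64_t		time_us;		/* sub-day part, in microseconds */
	int32_t		day;
	int32_t		month;
} SketchInterval;

#define SKETCH_MSECS_PER_DAY INT64_C(86400000)

/*
 * Hypothetical helper: convert an interval to milliseconds, counting each
 * month as 30 days. Returns false on overflow; otherwise stores the result
 * in *ms and returns true.
 */
static bool
sketch_interval_to_ms(const SketchInterval *iv, int64_t *ms)
{
	int64_t		days = (int64_t) iv->month * 30 + iv->day;
	int64_t		result;

	/* Milliseconds contributed by whole days (GCC/Clang builtin). */
	if (__builtin_mul_overflow(days, SKETCH_MSECS_PER_DAY, &result))
		return false;

	/* Add the sub-day part, truncated from microseconds to milliseconds. */
	if (__builtin_add_overflow(result, iv->time_us / 1000, &result))
		return false;

	*ms = result;
	return true;
}
```

A four-hour interval (time_us = 14400000000) yields 14400000 ms, which is well inside the INT_MAX bound that parse_subscription_options checks afterwards.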
======
src/test/subscription/t/032_apply_delay.pl
13.
IIUC this test is checking if a delay has occurred by inspecting the debug logs to
see if a certain code path including "logical replication apply delay" is logged. I
guess that is OK, but another way might be to compare the actual timing values
of the published and replicated rows.
The publisher table can have a column with default now() and the subscriber
side table can have an *additional* column also with default now(). After
replication, those two timestamp values can be compared to check if the
difference exceeds the min_apply_delay parameter specified.
Added this check.
This patch now depends on a patch posted in another thread [2]
for the TAP test of the "min_apply_delay" feature. Without that patch,
if one backend process executes ALTER SUBSCRIPTION SET min_apply_delay
while the apply worker gets another message for apply_dispatch,
the apply worker doesn't notice the reset and uses the old value for
that incoming transaction. To fix this, I posted the two patches together.
(While creating the patch set, I did not change any of the code or the commit
log of the wakeup patch; I only adjusted the line feeds for my environment.)
Kindly have a look at the updated patch.
[1]: /messages/by-id/CAA4eK1J9HEL-U32FwkHXLOGXPV_Fu+nb+1KpV7hTbnqbBNnDUQ@mail.gmail.com
[2]: /messages/by-id/20221122004119.GA132961@nathanxps13
Best Regards,
Takamichi Osumi
Attachments:
v10-0001-wake-up-logical-workers-as-needed-instead-of-rel.patch
From ebdfb623ab9c1d19b564af45cdc73555d757625a Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Mon, 21 Nov 2022 16:01:01 -0800
Subject: [PATCH v10 1/2] wake up logical workers as needed instead of relying
on periodic wakeups
---
src/backend/access/transam/xact.c | 3 ++
src/backend/commands/alter.c | 7 ++++
src/backend/commands/subscriptioncmds.c | 4 ++
src/backend/replication/logical/tablesync.c | 10 +++++
src/backend/replication/logical/worker.c | 46 +++++++++++++++++++++
src/include/replication/logicalworker.h | 3 ++
6 files changed, 73 insertions(+)
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 8086b857b9..dc00e66cfb 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -47,6 +47,7 @@
#include "pgstat.h"
#include "replication/logical.h"
#include "replication/logicallauncher.h"
+#include "replication/logicalworker.h"
#include "replication/origin.h"
#include "replication/snapbuild.h"
#include "replication/syncrep.h"
@@ -2360,6 +2361,7 @@ CommitTransaction(void)
AtEOXact_PgStat(true, is_parallel_worker);
AtEOXact_Snapshot(true, false);
AtEOXact_ApplyLauncher(true);
+ AtEOXact_LogicalRepWorkers(true);
pgstat_report_xact_timestamp(0);
CurrentResourceOwner = NULL;
@@ -2860,6 +2862,7 @@ AbortTransaction(void)
AtEOXact_HashTables(false);
AtEOXact_PgStat(false, is_parallel_worker);
AtEOXact_ApplyLauncher(false);
+ AtEOXact_LogicalRepWorkers(false);
pgstat_report_xact_timestamp(0);
}
diff --git a/src/backend/commands/alter.c b/src/backend/commands/alter.c
index 10b6fe19a2..d095cd3ced 100644
--- a/src/backend/commands/alter.c
+++ b/src/backend/commands/alter.c
@@ -59,6 +59,7 @@
#include "commands/user.h"
#include "miscadmin.h"
#include "parser/parse_func.h"
+#include "replication/logicalworker.h"
#include "rewrite/rewriteDefine.h"
#include "tcop/utility.h"
#include "utils/builtins.h"
@@ -279,6 +280,12 @@ AlterObjectRename_internal(Relation rel, Oid objectId, const char *new_name)
if (strncmp(new_name, "regress_", 8) != 0)
elog(WARNING, "subscriptions created by regression test cases should have names starting with \"regress_\"");
#endif
+
+ /*
+ * Wake up the logical replication workers to handle this change
+ * quickly.
+ */
+ LogicalRepWorkersWakeupAtCommit(objectId);
}
else if (nameCacheId >= 0)
{
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index d673557ea4..d6993c26e5 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -34,6 +34,7 @@
#include "nodes/makefuncs.h"
#include "pgstat.h"
#include "replication/logicallauncher.h"
+#include "replication/logicalworker.h"
#include "replication/origin.h"
#include "replication/slot.h"
#include "replication/walreceiver.h"
@@ -1362,6 +1363,9 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
InvokeObjectPostAlterHook(SubscriptionRelationId, subid, 0);
+ /* Wake up the logical replication workers to handle this change quickly. */
+ LogicalRepWorkersWakeupAtCommit(subid);
+
return myself;
}
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 94e813ac53..509fe2eb19 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -105,6 +105,7 @@
#include "pgstat.h"
#include "replication/logicallauncher.h"
#include "replication/logicalrelation.h"
+#include "replication/logicalworker.h"
#include "replication/walreceiver.h"
#include "replication/worker_internal.h"
#include "replication/slot.h"
@@ -619,6 +620,15 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
if (started_tx)
{
+ /*
+ * If we are ready to enable two_phase mode, wake up the logical
+ * replication workers to handle this change quickly.
+ */
+ CommandCounterIncrement();
+ if (MySubscription->twophasestate == LOGICALREP_TWOPHASE_STATE_PENDING &&
+ AllTablesyncsReady())
+ LogicalRepWorkersWakeupAtCommit(MyLogicalRepWorker->subid);
+
CommitTransactionCommand();
pgstat_report_stat(true);
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 96772e4d73..722f796c7a 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -254,6 +254,8 @@ WalReceiverConn *LogRepWorkerWalRcvConn = NULL;
Subscription *MySubscription = NULL;
static bool MySubscriptionValid = false;
+static List *on_commit_wakeup_workers_subids = NIL;
+
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
@@ -4097,3 +4099,47 @@ reset_apply_error_context_info(void)
apply_error_callback_arg.remote_attnum = -1;
set_apply_error_context_xact(InvalidTransactionId, InvalidXLogRecPtr);
}
+
+/*
+ * Wakeup the stored subscriptions' workers on commit if requested.
+ */
+void
+AtEOXact_LogicalRepWorkers(bool isCommit)
+{
+ if (isCommit && on_commit_wakeup_workers_subids != NIL)
+ {
+ ListCell *subid;
+
+ LWLockAcquire(LogicalRepWorkerLock, LW_SHARED);
+ foreach(subid, on_commit_wakeup_workers_subids)
+ {
+ List *workers;
+ ListCell *worker;
+
+ workers = logicalrep_workers_find(lfirst_oid(subid), true);
+ foreach(worker, workers)
+ logicalrep_worker_wakeup_ptr((LogicalRepWorker *) lfirst(worker));
+ }
+ LWLockRelease(LogicalRepWorkerLock);
+ }
+
+ on_commit_wakeup_workers_subids = NIL;
+}
+
+/*
+ * Request wakeup of the workers for the given subscription ID on commit of the
+ * transaction.
+ *
+ * This is used to ensure that the workers process assorted changes as soon as
+ * possible.
+ */
+void
+LogicalRepWorkersWakeupAtCommit(Oid subid)
+{
+ MemoryContext oldcxt;
+
+ oldcxt = MemoryContextSwitchTo(TopTransactionContext);
+ on_commit_wakeup_workers_subids = list_append_unique_oid(on_commit_wakeup_workers_subids,
+ subid);
+ MemoryContextSwitchTo(oldcxt);
+}
diff --git a/src/include/replication/logicalworker.h b/src/include/replication/logicalworker.h
index cd1b6e8afc..2c2340d758 100644
--- a/src/include/replication/logicalworker.h
+++ b/src/include/replication/logicalworker.h
@@ -16,4 +16,7 @@ extern void ApplyWorkerMain(Datum main_arg);
extern bool IsLogicalWorker(void);
+extern void LogicalRepWorkersWakeupAtCommit(Oid subid);
+extern void AtEOXact_LogicalRepWorkers(bool isCommit);
+
#endif /* LOGICALWORKER_H */
--
2.30.0
v10-0002-Time-delayed-logical-replication-subscriber.patch
From 89de10e78bc49a4a874fec8553ba190afa953139 Mon Sep 17 00:00:00 2001
From: Takamichi Osumi <osumi.takamichi@fujitsu.com>
Date: Mon, 12 Dec 2022 09:52:16 +0000
Subject: [PATCH v10 2/2] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction commit for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs only on WAL records for transaction begins. The main
reason is to avoid keeping a transaction open for a long time. Regular
and prepared transactions are covered. Streamed transactions are also
covered.
Author: Euler Taveira
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 9 ++
doc/src/sgml/logical-replication.sgml | 7 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 54 ++++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 55 ++++++-
src/backend/replication/logical/worker.c | 100 ++++++++++++
src/backend/utils/adt/timestamp.c | 32 ++++
src/bin/pg_dump/pg_dump.c | 16 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/datatype/timestamp.h | 2 +
src/include/utils/timestamp.h | 2 +
src/test/regress/expected/subscription.out | 176 +++++++++++++--------
src/test/regress/sql/subscription.sql | 25 +++
src/test/subscription/meson.build | 1 +
src/test/subscription/t/032_apply_delay.pl | 151 ++++++++++++++++++
20 files changed, 577 insertions(+), 83 deletions(-)
create mode 100644 src/test/subscription/t/032_apply_delay.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 9316b811ac..579b12f357 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7873,6 +7873,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int8</type>
+ </para>
+ <para>
+ The length of time (ms) to delay the application of changes.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index f8756389a3..1c3c26d7f7 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,13 @@
target table.
</para>
+ <para>
+ The subscriber replication can be instructed to lag behind the publisher
+ side changes by specifying the <literal>min_apply_delay</literal>
+ subscription parameter. See <xref linkend="sql-createsubscription"/> for
+ details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 1e8d72062b..d63aff1b90 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index f9a1776380..5b8df483c7 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -333,7 +333,47 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. As
+ with the physical replication feature
+ (<xref linkend="guc-recovery-min-apply-delay"/>), it can be useful to
+ have a time-delayed logical replica. This parameter lets the user
+ delay the application of changes by a specified amount of time. If this
+ value is specified without units, it is taken as milliseconds. The
+ default is zero (no delay).
+ </para>
+ <para>
+ The delay occurs only on WAL records for transaction begins and after
+ the initial table synchronization. It is possible that the
+ replication delay between publisher and subscriber exceeds the value
+ of this parameter, in which case no delay is added. Note that the
+ delay is calculated between the WAL time stamp as written on
+ publisher and the current time on the subscriber. Time spent in logical
+ decoding and in transferring the transaction may reduce the actual wait
+ time. If the system clocks on publisher and subscriber are not
+ synchronized, this may lead to applying changes earlier than expected,
+ but this is not a major issue because this parameter is typically much
+ larger than the time deviations between servers. Note that if this
+ parameter is set to a long delay, the replication will stop if the
+ replication slot falls behind the current LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication can mean there is a much longer time between making
+ a change on the publisher, and that change being committed on the subscriber.
+ This can have a big impact on synchronous replication.
+ See <xref linkend="guc-synchronous-commit"/>.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -456,6 +496,18 @@ CREATE SUBSCRIPTION mysub
PUBLICATION insert_only
WITH (enabled = false);
</programlisting></para>
+
+ <para>
+ Create a subscription to a remote server that replicates tables in
+ the <literal>mypub</literal> publication and starts replicating immediately
+ on commit. Pre-existing data is not copied. The application of changes is
+ delayed by 4 hours.
+<programlisting>
+CREATE SUBSCRIPTION mysub
+ CONNECTION 'host=192.168.1.50 port=5432 user=foo dbname=foodb'
+ PUBLICATION mypub
+ WITH (copy_data = false, min_apply_delay = '4h');
+</programlisting></para>
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a506fc3ec8..b29ed67715 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -64,6 +64,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->oid = subid;
sub->dbid = subform->subdbid;
sub->skiplsn = subform->subskiplsn;
+ sub->minapplydelay = subform->subminapplydelay;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
sub->enabled = subform->subenabled;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 2d8104b090..85aa1bd6f5 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1299,9 +1299,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index d6993c26e5..fdbb9dc12c 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -48,6 +48,7 @@
#include "utils/memutils.h"
#include "utils/pg_lsn.h"
#include "utils/syscache.h"
+#include "utils/timestamp.h"
/*
* Options that can be specified by the user in CREATE/ALTER SUBSCRIPTION
@@ -66,6 +67,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +92,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int64 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -146,6 +149,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +329,43 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ char *val,
+ *tmp;
+ Interval *interval;
+ int64 ms;
+
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ tmp = defGetString(defel);
+
+ /*
+ * If no unit was specified, then explicitly add 'ms' otherwise
+ * the interval_in function would assume 'seconds'.
+ */
+ if (strspn(tmp, "0123456789") == strlen(tmp))
+ val = psprintf("%sms", tmp);
+ else
+ val = tmp;
+
+ interval = DatumGetIntervalP(DirectFunctionCall3(interval_in,
+ CStringGetDatum(val),
+ ObjectIdGetDatum(InvalidOid),
+ Int32GetDatum(-1)));
+
+ ms = interval2ms(interval);
+ if (ms < 0 || ms > INT_MAX)
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("%lld ms is outside the valid range for parameter \"%s\"",
+ (long long) ms, "min_apply_delay"));
+
+ opts->min_apply_delay = ms;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -560,7 +602,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -625,6 +668,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int64GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1054,7 +1098,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1111,6 +1155,13 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int64GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 722f796c7a..364d23e1c2 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -328,6 +328,8 @@ static void maybe_reread_subscription(void);
static void DisableSubscriptionAndExit(void);
+static void maybe_delay_apply(TimestampTz ts);
+
/* prototype needed because of stream_commit */
static void apply_dispatch(StringInfo s);
@@ -833,6 +835,72 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When min_apply_delay parameter is set on the subscriber, we wait long enough
+ * to make sure a transaction is applied at least that interval behind the
+ * publisher.
+ *
+ * While the physical replication applies the delay at commit time, this
+ * feature applies the delay for the next transaction but before starting the
+ * transaction. This is mainly because keeping a transaction that conducted
+ * write operations open for a long time results in some issues such as bloat
+ * and locks.
+ *
+ * The min_apply_delay parameter will take effect only after all tables are in
+ * READY state.
+ */
+static void
+maybe_delay_apply(TimestampTz ts)
+{
+ /* Nothing to do if no delay set */
+ if (MySubscription->minapplydelay <= 0)
+ return;
+
+ /*
+ * The min_apply_delay parameter is ignored until all tablesync workers
+ * have reached READY state. If we allow the delay during the catchup
+ * phase, once we reach the limit of tablesync workers, it will impose a
+ * delay for each subsequent worker. It means it will take a long time to
+ * finish the initial table synchronization.
+ */
+ if (!AllTablesyncsReady())
+ return;
+
+ while (true)
+ {
+ long diffms;
+
+ CHECK_FOR_INTERRUPTS();
+
+ /*
+ * Before calculating the time duration, reload the catalog if needed.
+ */
+ if (!in_remote_transaction && !in_streamed_transaction)
+ {
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+ }
+
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(),
+ TimestampTzPlusMilliseconds(ts, MySubscription->minapplydelay));
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "logical replication apply delay: %ld ms", diffms);
+
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ ResetLatch(MyLatch);
+ }
+}
+
/*
* Handle BEGIN message.
*/
@@ -844,6 +912,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ maybe_delay_apply(begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -898,6 +969,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ maybe_delay_apply(begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1120,6 +1194,19 @@ apply_handle_stream_prepare(StringInfo s)
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
+ /*
+ * Should we delay the current prepared transaction?
+ *
+ * Although the delay is applied in BEGIN PREPARE messages, streamed
+ * prepared transactions apply the delay in a STREAM PREPARE message.
+ * That's ok because no changes have been applied yet
+ * (apply_spooled_messages() will do it). The STREAM START message does
+ * not contain a prepare time (it will be available when the in-progress
+ * prepared transaction finishes), hence, it was not possible to apply a
+ * delay at that time.
+ */
+ maybe_delay_apply(prepare_data.prepare_time);
+
/* Replay all the spooled operations. */
apply_spooled_messages(prepare_data.xid, prepare_data.prepare_lsn);
@@ -1511,6 +1598,19 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "received commit for streamed transaction %u", xid);
+ /*
+ * Should we delay the current transaction?
+ *
+ * Although the delay is applied in BEGIN messages, streamed transactions
+ * apply the delay in a STREAM COMMIT message. That's ok because no
+ * changes have been applied yet (apply_spooled_messages() will do it).
+ * The STREAM START message would be a natural choice for this delay but
+ * there is no commit time yet (it will be available when the in-progress
+ * transaction finishes), hence, it was not possible to apply a delay at
+ * that time.
+ */
+ maybe_delay_apply(commit_data.committime);
+
apply_spooled_messages(xid, commit_data.commit_lsn);
apply_handle_commit_internal(&commit_data);
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index 3f2508c0c4..4b61b15821 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -2437,6 +2437,38 @@ interval_cmp_internal(const Interval *interval1, const Interval *interval2)
return int128_compare(span1, span2);
}
+/*
+ * Returns the number of milliseconds in the specified Interval.
+ */
+int64
+interval2ms(const Interval *interval)
+{
+ int64 days;
+ int64 ms;
+ int64 result;
+
+ days = interval->month * INT64CONST(30);
+ days += interval->day;
+
+ /*
+ * The following operations use these special functions to detect
+ * overflow. Number of ms per informed days.
+ */
+ if (pg_mul_s64_overflow(days, MSECS_PER_DAY, &result))
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+
+ /* Adds portion time (in ms) to the previous result. */
+ ms = interval->time / INT64CONST(1000);
+ if (pg_add_s64_overflow(result, ms, &result))
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+
+ return result;
+}
+
Datum
interval_eq(PG_FUNCTION_ARGS)
{
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 44d957c038..31c4d57764 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4546,9 +4547,14 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ {
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBufferStr(query, " 0 AS subminapplydelay\n");
+ }
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4582,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4613,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminapplydelay =
+ strtoi64(PQgetvalue(res, i, i_subminapplydelay), NULL, 10);
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4685,6 +4694,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '" INT64_FORMAT " ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 436ac5bb98..175b4a72a4 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int64 subminapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index df166365e8..6512d6059a 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6474,7 +6474,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6516,10 +6516,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min apply delay (ms)"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 7d222680f5..446a12ac47 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1880,7 +1880,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3202,7 +3202,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 7b98714f30..3ff890f897 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -70,6 +70,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
XLogRecPtr subskiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 subminapplydelay; /* Minimum apply delay, in milliseconds */
+
NameData subname; /* Name of the subscription */
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
@@ -119,6 +121,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 minapplydelay; /* Minimum apply delay, in milliseconds */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
bool enabled; /* Indicates if the subscription is enabled */
diff --git a/src/include/datatype/timestamp.h b/src/include/datatype/timestamp.h
index d155f1b03b..d5bbfad1c4 100644
--- a/src/include/datatype/timestamp.h
+++ b/src/include/datatype/timestamp.h
@@ -127,6 +127,8 @@ struct pg_itm_in
#define SECS_PER_MINUTE 60
#define MINS_PER_HOUR 60
+#define MSECS_PER_DAY INT64CONST(86400000)
+
#define USECS_PER_DAY INT64CONST(86400000000)
#define USECS_PER_HOUR INT64CONST(3600000000)
#define USECS_PER_MINUTE INT64CONST(60000000)
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index 7fd0b58825..1d6079057e 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -102,6 +102,8 @@ extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern int64 interval2ms(const Interval *interval);
+
extern TimestampTz time_t_to_timestamptz(pg_time_t tm);
extern pg_time_t timestamptz_to_time_t(TimestampTz t);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index c13d218dcf..69a5193aa5 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -135,10 +135,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -155,10 +155,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -167,10 +167,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -202,10 +202,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -239,19 +239,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -263,19 +263,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -290,10 +290,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more then once
@@ -308,10 +308,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -347,10 +347,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -359,10 +359,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -372,10 +372,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,18 +388,58 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid input syntax for type interval: "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1000 ms is outside the valid range for parameter "min_apply_delay"
+-- success -- 123 ms
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- success -- 86400000 ms
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 86400000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- success -- interval is converted into ms and stored as integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = '4h 27min 35s');
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 16055000 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
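The millisecond values in the expected output above can be checked by hand: '4h 27min 35s' is 4 * 3600000 + 27 * 60000 + 35 * 1000 = 16055000 ms, '1d' is 86400000 ms, and the rejected -1 is reported as -1000 ms. A small sketch of that arithmetic follows. It is not the patch's interval2ms(), which takes an Interval; delay_to_ms and the per-unit macros other than MSECS_PER_DAY are hypothetical names used only here.

```c
#include <assert.h>
#include <stdint.h>

#define MSECS_PER_SEC    INT64_C(1000)
#define MSECS_PER_MINUTE INT64_C(60000)
#define MSECS_PER_HOUR   INT64_C(3600000)
#define MSECS_PER_DAY    INT64_C(86400000)  /* same value as the macro the patch adds */

/*
 * Hypothetical helper: convert a delay given as days, hours, minutes
 * and seconds into milliseconds. No overflow checking is done, so it
 * is only meant for small, hand-checkable inputs.
 */
int64_t
delay_to_ms(int64_t days, int64_t hours, int64_t minutes, int64_t seconds)
{
    return days * MSECS_PER_DAY
        + hours * MSECS_PER_HOUR
        + minutes * MSECS_PER_MINUTE
        + seconds * MSECS_PER_SEC;
}
```

Calling delay_to_ms(0, 4, 27, 35) gives the 16055000 shown in the last \dRs+ output, and delay_to_ms(1, 0, 0, 0) gives the 86400000 shown after the ALTER SUBSCRIPTION.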
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index eaeade8cce..7a4e818857 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -275,6 +275,31 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- success -- 123 ms
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+
+\dRs+
+
+-- success -- 86400000 ms
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1d');
+
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
+-- success -- interval is converted into ms and stored as integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = '4h 27min 35s');
+
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/meson.build b/src/test/subscription/meson.build
index 85d1dd9295..bb15d062b8 100644
--- a/src/test/subscription/meson.build
+++ b/src/test/subscription/meson.build
@@ -36,6 +36,7 @@ tests += {
't/029_on_error.pl',
't/030_origin.pl',
't/031_column_list.pl',
+ 't/032_apply_delay.pl',
't/100_bugs.pl',
],
},
diff --git a/src/test/subscription/t/032_apply_delay.pl b/src/test/subscription/t/032_apply_delay.pl
new file mode 100644
index 0000000000..8f8ce23f1b
--- /dev/null
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,151 @@
+
+# Copyright (c) 2022, PostgreSQL Global Development Group
+
+# Test replication apply delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf('postgresql.conf',
+ "log_min_messages = debug2");
+$node_subscriber->start;
+
+# Create some preexisting content on publisher.
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b varchar, c timestamptz DEFAULT now())");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (1, 'foo'), (2, 'bar')");
+
+# Setup structure on subscriber.
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b text, c timestamptz DEFAULT now(), d bigint DEFAULT 999)"
+);
+
+# Setup logical replication.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+
+# column c must not be published because we want to compare the time difference.
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE test_tab (a, b)");
+
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = on, min_apply_delay = '3s')"
+);
+
+# Wait for initial table sync to finish.
+$node_subscriber->wait_for_subscription_sync($node_publisher, $appname);
+
+# Check log starting now for logical replication apply delay.
+my $log_location = -s $node_subscriber->logfile;
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(2|1|2), 'check initial data was copied to subscriber');
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (3, 'baz')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (4, 'abc')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (5, 'def')");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(5|1|5), 'check if the new rows were applied to subscriber');
+
+check_apply_delay_log("logical replication apply delay");
+
+check_apply_delay_time('5', '3');
+
+# Test streamed transaction.
+# Insert, update and delete enough rows to exceed 64kB limit.
+$node_publisher->safe_psql(
+ 'postgres', q{
+BEGIN;
+INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(6, 5000) s(i);
+UPDATE test_tab SET b = md5(b) WHERE mod(a, 2) = 0;
+DELETE FROM test_tab WHERE mod(a, 3) = 0;
+COMMIT;
+});
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(3334|1|5000), 'check if the new rows were applied to subscriber');
+
+check_apply_delay_log("logical replication apply delay");
+
+check_apply_delay_time('5000', '3');
+
+# Test ALTER SUBSCRIPTION. Delay 86460 seconds (1 day 1 minute).
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86460000)"
+);
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (0, 'foobar')");
+
+# Disable subscription. worker should die immediately.
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub DISABLE;"
+);
+
+# Wait until worker dies.
+my $sub_query =
+ "SELECT count(1) = 0 FROM pg_stat_subscription WHERE subname = 'tap_sub' AND pid IS NOT NULL;";
+$node_subscriber->poll_query_until('postgres', $sub_query)
+ or die "Timed out while waiting for subscriber to die";
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
+
+sub check_apply_delay_log
+{
+ my $message = shift;
+ my $old_log_location = $log_location;
+
+ $log_location = $node_subscriber->wait_for_log(qr/$message/, $log_location);
+
+ cmp_ok($log_location, '>', $old_log_location,
+ "logfile contains triggered logical replication apply delay"
+ );
+}
+
+sub check_apply_delay_time
+{
+ my ($primary_key, $expected_diffs) = @_;
+
+ my $inserted_time_on_pub = $node_publisher->safe_psql('postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ my $inserted_time_on_sub = $node_subscriber->safe_psql('postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ cmp_ok($inserted_time_on_sub - $inserted_time_on_pub, '>', $expected_diffs,
+ "The tuple on the subscriber was modified later than the publisher");
+}
--
2.30.0
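As a cross-check on the regression tests in the patch above, the stored values follow from plain unit conversion: '4h 27min 35s' becomes 16055000 ms and '1d' becomes 86400000 ms. A minimal sketch of that arithmetic follows; hms_to_ms is a hypothetical helper for illustration only, not the parsing code used by the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): convert an
 * hours/minutes/seconds triple into milliseconds, the unit in which
 * min_apply_delay is stored according to the regression test comments.
 */
int64_t
hms_to_ms(int64_t hours, int64_t minutes, int64_t seconds)
{
    /* fold hours into minutes, minutes into seconds, then scale to ms */
    return ((hours * 60 + minutes) * 60 + seconds) * 1000;
}
```

With these inputs, hms_to_ms(4, 27, 35) gives 16055000 and hms_to_ms(24, 0, 0) gives 86400000, matching the test comments.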
On Friday, November 25, 2022 5:43 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Fri, Nov 25, 2022 at 2:15 AM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
On Wednesday, October 5, 2022 6:42 PM Peter Smith
<smithpb2250@gmail.com> wrote:
...
======
5. src/backend/commands/subscriptioncmds.c - SubOpts
@@ -89,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int64 min_apply_delay;
} SubOpts;
I feel it would be better to be explicit about the storage units. So
call this member ‘min_apply_delay_ms’. E.g. then other code in
parse_subscription_options will be more natural when you are
converting using and assigning them to this member.
I don't think we use such names including units explicitly.
Could you please tell me a similar example for this ?
Regex search "\..*_ms[e\s]" finds some members where the unit is in the
member name.
e.g. delay_ms (see EnableTimeoutParams in timeout.h)
e.g. interval_in_ms (see timeout_params in timeout.c)
Regex search ".*_ms[e\s]" finds many local variables where the unit is in the
variable name
======
16. src/include/catalog/pg_subscription.h
+ int64 subapplydelay; /* Replication apply delay */
+
Consider renaming this as 'subapplydelayms' to make the units perfectly
clear.
Similar to the 5th comments, I can't find any examples for this.
I'd like to keep it general, which makes me feel it is more aligned
with existing codes.
Hi, thank you for sharing this info.
I searched the code for places where I could see the merit of adding "ms"
at the end of the variable names.
Adding the units would help when calculating or converting time-related values.
In this patch there are only a couple of such functions, like maybe_delay_apply(),
or parse_subscription_options for the conversion of time.
I feel changing just a couple of structures might be awkward,
while changing all internal structures is too much. So, I kept the names
as they were after some modifications shared in [1].
If you have any better idea, please let me know.
[1]: /messages/by-id/TYCPR01MB83730C23CB7D29E57368BECDEDE29@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
Hi,
On Saturday, December 10, 2022 12:08 AM Takamichi Osumi (Fujitsu) <osumi.takamichi@fujitsu.com> wrote:
On Friday, December 9, 2022 3:38 PM Kuroda, Hayato/黒田 隼人
<kuroda.hayato@fujitsu.com> wrote:
Thanks for reporting! I have analyzed the problem and found the root cause.
This feature seemed not to work on 32-bit OSes. This was because the
calculation of delay_time was wrong. The first argument of this should
be TimestampTz datatype, not Datum:
```
+ /* Set apply delay */
+ delay_until = TimestampTzPlusMilliseconds(TimestampTzGetDatum(ts),
+                                           MySubscription->applydelay);
```
In more detail, the datum representation of int64 contains the value
itself on 64-bit OSes, but it contains the pointer to the value on 32-bit.
After modifying the issue, this will work on 32-bit environments.
Thank you for your analysis.
Yeah, it seems that on 32-bit machines we mistakenly add the value to the
pointer returned from the call of TimestampTzGetDatum().
I'll remove the call in my next version.
Applied this fix in the latest version, shared in [1].
[1]: /messages/by-id/TYCPR01MB83730C23CB7D29E57368BECDEDE29@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
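To illustrate the 32-bit failure analyzed above, here is a small self-contained model. It is a sketch under stated assumptions, not PostgreSQL source: Datum is simplified to a pointer-sized integer, the byref_* helpers are hypothetical stand-ins for the pass-by-reference case, and timestamptz_plus_ms mirrors the arithmetic of the TimestampTzPlusMilliseconds macro (timestamps in microseconds).

```c
#include <stdint.h>
#include <stdlib.h>

typedef uintptr_t Datum;        /* simplified stand-in for PostgreSQL's Datum */
typedef int64_t TimestampTz;    /* microseconds */

/*
 * Hypothetical model of a pass-by-reference int64 Datum, as on a 32-bit
 * build: the Datum carries a pointer to the value, not the value itself.
 */
Datum
byref_timestamptz_get_datum(TimestampTz ts)
{
    TimestampTz *p = malloc(sizeof(TimestampTz));

    if (p == NULL)
        abort();
    *p = ts;
    return (Datum) p;
}

/* Recover the timestamp by dereferencing the pointer held in the Datum. */
TimestampTz
byref_datum_get_timestamptz(Datum d)
{
    return *(TimestampTz *) d;
}

/* Same arithmetic as TimestampTzPlusMilliseconds: ms are scaled to usec. */
TimestampTz
timestamptz_plus_ms(TimestampTz ts, int64_t ms)
{
    return ts + ms * 1000;
}
```

In this model, passing the Datum itself to timestamptz_plus_ms() would add the delay to an address, which is the bug described; passing the plain TimestampTz (or the dereferenced value) gives the intended result.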
On Mon, Dec 12, 2022 at 1:04 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
This is a reply for later part of your e-mail.
(2) About the timeout issue
When having a look at the physical replication internals,
it conducts sending feedback and application of delay separately on different processes.
OTOH, the logical replication needs to achieve those within one process.
When we want to apply delay and avoid the timeout,
we should not store all the transactions data into memory.
So, one approach for this is to serialize the transaction data and after the delay,
we apply the transactions data.
It is not clear to me how this will avoid a timeout.
At first, the reason why the timeout occurs is that while delaying the apply
worker neither reads messages from the walsender nor replies to it.
The worker's last_recv_timeout will be not updated because it does not receive
messages. This leads to wal_receiver_timeout. Similarly, the walsender's
last_processing will be not updated and exit due to the timeout because the
worker does not reply to upstream.
Based on the above, we thought that workers must receive and handle messages
even if they are delaying applying transactions. In more detail, workers must
iterate the outer loop in LogicalRepApplyLoop().
If workers receive transactions but they need to delay applying, they must keep
messages somewhere. So we came up with the idea that workers serialize changes
once and apply later. Our basic design is as follows:
* All transactions are serialized to files if min_apply_delay is set to non-zero.
* After receiving the commit message and spending time, workers reads and
applies spooled messages
I think this may be more work than required because in some cases
doing I/O just to delay xacts will later lead to more work. Can't we
send some ping to walsender to communicate that walreceiver is alive?
We already seem to be sending a ping in LogicalRepApplyLoop if we
haven't heard anything from the server for more than
wal_receiver_timeout / 2. Now, it is possible that the walsender is
terminated due to some other reason and we need to see if we can
detect that or if it will only be detected once the walreceiver's
delay time is over.
--
With Regards,
Amit Kapila.
Hello.
At Mon, 12 Dec 2022 07:42:30 +0000, "Takamichi Osumi (Fujitsu)" <osumi.takamichi@fujitsu.com> wrote in
On Monday, December 12, 2022 2:54 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
I asked about unexpected walsender termination caused by this feature but I
think I didn't receive an answer for it and the behavior still exists.
..
Thank you so much for your report!
Yes. Currently, how to deal with the timeout issue is under discussion.
Some analysis of the root cause is also there.
Kindly have a look at [1].
[1] - /messages/by-id/TYAPR01MB58669394A67F2340B82E42D1F5E29@TYAPR01MB5866.jpnprd01.prod.outlook.com
Oops. Thank you for the pointer. Will visit there.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
At Mon, 12 Dec 2022 18:10:00 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
On Mon, Dec 12, 2022 at 1:04 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
once and apply later. Our basic design is as follows:
* All transactions are serialized to files if min_apply_delay is set to non-zero.
* After receiving the commit message and spending time, workers reads and
applies spooled messages
I think this may be more work than required because in some cases
doing I/O just to delay xacts will later lead to more work. Can't we
send some ping to walsender to communicate that walreceiver is alive?
We already seem to be sending a ping in LogicalRepApplyLoop if we
haven't heard anything from the server for more than
wal_receiver_timeout / 2. Now, it is possible that the walsender is
terminated due to some other reason and we need to see if we can
detect that or if it will only be detected once the walreceiver's
delay time is over.
FWIW, I thought the same thing as Amit.
What we should do here is have logrep workers notify the walsender that
they are alive and that the communication in between is fine, and maybe
report the worker's status. Spontaneous send_feedback() calls while delaying will
be sufficient for this purpose. We might need to suppress extra forced
feedbacks instead. In contrast, the worker doesn't need to bother to
know whether the peer is alive until it receives the next data. But
we might need to adjust the wait_time in LogicalRepApplyLoop().
But, I'm not sure what will happen when walsender is blocked by
buffer-full for a long time.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Wednesday, December 7, 2022 12:00 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
At Tue, 6 Dec 2022 11:08:43 -0800, Andres Freund <andres@anarazel.de> wrote in
Hi,
The tests fail on cfbot:
https://cirrus-ci.com/task/4533866329800704
They only seem to fail on 32bit linux.
https://api.cirrus-ci.com/v1/artifact/task/4533866329800704/testrun/build-32/testrun/subscription/032_apply_delay/log/regress_log_032_apply_delay
[06:27:10.628](0.138s) ok 2 - check if the new rows were applied to subscriber
timed out waiting for match: (?^:logical replication apply delay) at /tmp/cirrus-ci-build/src/test/subscription/t/032_apply_delay.pl line 124.
It fails for me on 64bit Linux.. (Rocky 8.7)
t/032_apply_delay.pl ............... Dubious, test returned 29 (wstat
7424, 0x1d00) No subtests run..
t/032_apply_delay.pl (Wstat: 7424 Tests: 0 Failed: 0)
Non-zero exit status: 29
Parse errors: No plan found in TAP output
Hi, Horiguchi-san
Sorry for being late.
We couldn't reproduce this failure, nor could we
find the same type of failure among the past failures on the cfbot.
It seems no subtests ran in your environment.
Could you please share the log files, if you have them
or when you can reproduce this?
FYI, the latest patch is attached in [1].
[1]: /messages/by-id/TYCPR01MB83730C23CB7D29E57368BECDEDE29@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
At Tue, 13 Dec 2022 02:28:49 +0000, "Takamichi Osumi (Fujitsu)" <osumi.takamichi@fujitsu.com> wrote in
On Wednesday, December 7, 2022 12:00 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
We couldn't reproduce this failure and
find the same type of failure on the cfbot from the past failures.
It seems no subtests run in your environment.
Very sorry for that. The test script is found to be a left-over file
in a git-reset'ed working tree. Please forget about it.
FWIW, the latest patch passed make-world for me on Rocky8/x86_64.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Tuesday, December 13, 2022 1:27 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
At Tue, 13 Dec 2022 02:28:49 +0000, "Takamichi Osumi (Fujitsu)"
<osumi.takamichi@fujitsu.com> wrote in
On Wednesday, December 7, 2022 12:00 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
We couldn't reproduce this failure and find the same type of failure
on the cfbot from the past failures.
It seems no subtests run in your environment.
Very sorry for that. The test script is found to be a left-over file in a git-reset'ed
working tree. Please forget about it.
FWIW, the latest patch passed make-world for me on Rocky8/x86_64.
Hi,
No problem at all.
Also, thank you for your testing and confirming the latest one!
Best Regards,
Takamichi Osumi
On Tue, Dec 13, 2022 at 7:35 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Mon, 12 Dec 2022 18:10:00 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
On Mon, Dec 12, 2022 at 1:04 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
once and apply later. Our basic design is as follows:
* All transactions are serialized to files if min_apply_delay is set to non-zero.
* After receiving the commit message and spending time, workers reads and
applies spooled messages
I think this may be more work than required because in some cases
doing I/O just to delay xacts will later lead to more work. Can't we
send some ping to walsender to communicate that walreceiver is alive?
We already seem to be sending a ping in LogicalRepApplyLoop if we
haven't heard anything from the server for more than
wal_receiver_timeout / 2. Now, it is possible that the walsender is
terminated due to some other reason and we need to see if we can
detect that or if it will only be detected once the walreceiver's
delay time is over.
FWIW, I thought the same thing with Amit.
What we should do here is logrep workers notifying to walsender that
it's living and the communication in-between is fine, and maybe the
worker's status. Spontaneous send_feedback() calls while delaying will
be sufficient for this purpose. We might need to suppress extra forced
feedbacks instead. In contrast the worker doesn't need to bother to
know whether the peer is living until it receives the next data. But
we might need to adjust the wait_time in LogicalRepApplyLoop().
But, I'm not sure what will happen when walsender is blocked by
buffer-full for a long time.
Yeah, I think ideally it will time out, but if we have a solution where,
during the delay, we keep sending ping messages from time to time, it should
work fine. However, that needs to be verified. Do you see any reasons
why that won't work?
--
With Regards,
Amit Kapila.
At Tue, 13 Dec 2022 17:05:35 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
On Tue, Dec 13, 2022 at 7:35 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Mon, 12 Dec 2022 18:10:00 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
Yeah, I think ideally it will timeout but if we have a solution like
during delay, we keep sending ping messages time-to-time, it should
work fine. However, that needs to be verified. Do you see any reasons
why that won't work?
Ah. By "I'm not sure" I meant "I have no clear idea of whether".
I looked there a bit further. In the end, ProcessPendingWrites() waits for
the streaming socket to become writable, so I found no critical problem here.
That being said, it might be better for ProcessPendingWrites() to refrain
from sending consecutive keepalives while waiting; a 30s ping timeout
and a 1h delay may result in 120 successive pings. It might not be a big
deal, but..
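For the record, the arithmetic behind the figure of 120 can be written down as a minimal sketch. keepalives_during_delay is a made-up helper, and it assumes exactly one keepalive per ping interval for the whole delay, which is a simplification of what the walsender actually does.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of keepalives that fit into one apply
 * delay if one is sent every interval_ms. A non-positive interval or
 * delay means none are sent.
 */
int64_t
keepalives_during_delay(int64_t delay_ms, int64_t interval_ms)
{
    if (delay_ms <= 0 || interval_ms <= 0)
        return 0;
    return delay_ms / interval_ms;
}
```

A 1h delay (3600000 ms) with a 30s interval (30000 ms) yields 120 under this assumption.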
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Dear Horiguchi-san, Amit,
On Tue, Dec 13, 2022 at 7:35 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Mon, 12 Dec 2022 18:10:00 +0530, Amit Kapila
<amit.kapila16@gmail.com> wrote in
Yeah, I think ideally it will timeout but if we have a solution like
during delay, we keep sending ping messages time-to-time, it should
work fine. However, that needs to be verified. Do you see any reasons
why that won't work?
I have implemented and tested that workers wake up every wal_receiver_timeout/2
and send a keepalive. Basically it works well, but I found two problems.
Do you have any good suggestions about them?
1)
With this PoC at present, workers calculate the sending interval based on their own
wal_receiver_timeout, and sending is suppressed when the parameter is set to zero.
This means that the walsender may time out when wal_sender_timeout
on the publisher and wal_receiver_timeout on the subscriber are different.
Suppose that wal_sender_timeout is 2min, wal_receiver_timeout is 5min,
and min_apply_delay is 10min. The worker on the subscriber will wake up every 2.5min and
send keepalives, but the walsender exits before the message arrives at the publisher.
One idea to avoid that is to send the min_apply_delay subscriber option to the publisher
and compare them, but it may not be sufficient, because the XXX_timeout GUC parameters
could be modified later.
2)
The issue reported by Vignesh-san [1] still remains. I have already analyzed it in [2]:
the root cause is that the flushed WAL position is not updated and sent to the publisher. Even
if workers send keepalive messages to the publisher during the delay, the flushed position
cannot be modified.
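The scenario in problem 1) can be checked with a small model. This is only a sketch under simplifying assumptions, not code from the PoC: walsender_would_time_out is a hypothetical helper, the worker is assumed to send its first feedback after wal_receiver_timeout/2 (or only when the delay ends if that parameter is zero), and the walsender is assumed to give up once wal_sender_timeout passes without any reply.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model: does the walsender hit wal_sender_timeout before
 * the delaying apply worker sends its first feedback? All values in ms.
 */
bool
walsender_would_time_out(int64_t wal_sender_timeout_ms,
                         int64_t wal_receiver_timeout_ms,
                         int64_t min_apply_delay_ms)
{
    /* without periodic pings, the first reply comes when the delay ends */
    int64_t first_feedback_ms = min_apply_delay_ms;

    /* a zero wal_sender_timeout disables the timeout on the publisher */
    if (wal_sender_timeout_ms <= 0)
        return false;

    /* pings every wal_receiver_timeout/2, unless disabled by zero */
    if (wal_receiver_timeout_ms > 0 &&
        wal_receiver_timeout_ms / 2 < min_apply_delay_ms)
        first_feedback_ms = wal_receiver_timeout_ms / 2;

    return first_feedback_ms > wal_sender_timeout_ms;
}
```

With the values above (2min, 5min, 10min) the first feedback would be sent at 2.5min, after the 2min sender timeout, so the model reports a timeout.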
[1]: /messages/by-id/CALDaNm1vT8qNBqHivtAgYur-5-YkwF026VHtw9srd4fsdeaufA@mail.gmail.com
[2]: /messages/by-id/TYAPR01MB5866F6BE7399E6343A96E016F51C9@TYAPR01MB5866.jpnprd01.prod.outlook.com
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
On Fri, Dec 9, 2022 at 10:49 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Hi Vignesh,
In the case of physical replication by setting
recovery_min_apply_delay, I noticed that both primary and standby
nodes were getting stopped successfully immediately after the stop
server command. In case of logical replication, stop server fails:
pg_ctl -D publisher -l publisher.log stop -c
waiting for server to shut down............................................................... failed
pg_ctl: server does not shut down
In case of logical replication, the server does not get stopped
because the walsender process is not able to exit:
ps ux | grep walsender
vignesh 1950789 75.3 0.0 8695216 22284 ? Rs 11:51 1:08
postgres: walsender vignesh [local] START_REPLICATION
Thanks for reporting the issue. I analyzed about it.
This issue has occurred because the apply worker cannot reply during the delay.
I think we may have to modify the mechanism that delays applying transactions.
When walsender processes are requested to shut down, it can shut down only after
that all the sent WALs are replicated on the subscriber. This check is done in
WalSndDone(), and the replicated position will be updated when processes handle
the reply messages from a subscriber, in ProcessStandbyReplyMessage().
In the case of physical replication, the walreceiver can receive WALs and reply
even if the application is delayed. It means that the replicated position will
be transported to the publisher side immediately. So the walsender can exit.
I think it is not only the replicated positions but it also checks if
there is any pending send in WalSndDone(). Why is it a must to send
all pending WAL and confirm that it is flushed on standby before the
shutdown for physical standby? Is it because otherwise, we may lose
the required WAL? I am asking because it is better to see if those
conditions apply to logical replication as well.
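A rough model of the exit condition being discussed may help. This is a simplified sketch based on one reading of WalSndDone(), not the actual source: walsender_can_exit and its parameters are hypothetical names, and 0 stands in for an invalid flush position.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;    /* simplified stand-in for a WAL position */

/*
 * Hypothetical model of the shutdown check: the walsender may exit only
 * when it has caught up, everything it sent is confirmed by the peer,
 * and no output is still pending on the socket.
 */
bool
walsender_can_exit(bool caught_up, XLogRecPtr sent,
                   XLogRecPtr write, XLogRecPtr flush,
                   bool send_pending)
{
    /* use the flush position if the peer reports one, else write */
    XLogRecPtr replicated = (flush == 0) ? write : flush;

    return caught_up && sent == replicated && !send_pending;
}
```

In this model a delaying apply worker keeps the reported flush position behind the sent position, so the function keeps returning false until the delay is over, which matches the stuck shutdown reported above.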
--
With Regards,
Amit Kapila.
On Wed, Dec 14, 2022 at 4:16 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Dear Horiguchi-san, Amit,
On Tue, Dec 13, 2022 at 7:35 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Mon, 12 Dec 2022 18:10:00 +0530, Amit Kapila
<amit.kapila16@gmail.com> wrote in
Yeah, I think ideally it will timeout but if we have a solution like
during delay, we keep sending ping messages time-to-time, it should
work fine. However, that needs to be verified. Do you see any reasons
why that won't work?
I have implemented and tested that workers wake up per wal_receiver_timeout/2
and send keepalive. Basically it works well, but I found two problems.
Do you have any good suggestions about them?
1)
With this PoC at present, workers calculate sending intervals based on its
wal_receiver_timeout, and it is suppressed when the parameter is set to zero.
This means that there is a possibility that walsender is timeout when wal_sender_timeout
in publisher and wal_receiver_timeout in subscriber is different.
Supposing that wal_sender_timeout is 2min, wal_receiver_timeout is 5min,
and min_apply_delay is 10min. The worker on subscriber will wake up per 2.5min and
send keepalives, but walsender exits before the message arrives to publisher.
One idea to avoid that is to send the min_apply_delay subscriber option to publisher
and compare them, but it may be not sufficient. Because XXX_timeout GUC parameters
could be modified later.
How about restarting the apply worker if min_apply_delay changes? Will
that be sufficient?
--
With Regards,
Amit Kapila.
At Wed, 14 Dec 2022 10:46:17 +0000, "Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com> wrote in
I have implemented and tested that workers wake up per wal_receiver_timeout/2
and send keepalive. Basically it works well, but I found two problems.
Do you have any good suggestions about them?
1)
With this PoC at present, workers calculate sending intervals based on its
wal_receiver_timeout, and it is suppressed when the parameter is set to zero.
This means that there is a possibility that walsender is timeout when wal_sender_timeout
in publisher and wal_receiver_timeout in subscriber is different.
Supposing that wal_sender_timeout is 2min, wal_receiver_timeout is 5min,
It seems to me wal_receiver_status_interval is better for this use.
It's enough for us to document that "wal_receiver_status_interval should be
shorter than wal_sender_timeout/2, especially when a logical replication
connection is using min_apply_delay. Otherwise you will suffer
repeated termination of walsender".
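The rule proposed for the documentation can be expressed as a small check. This is a sketch only; status_interval_is_safe is a hypothetical helper, and it assumes wal_receiver_status_interval is given in seconds and wal_sender_timeout in milliseconds, with zero meaning "disabled" for both.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check of the suggested rule: the subscriber's status
 * interval should be shorter than half of the publisher's sender timeout.
 */
bool
status_interval_is_safe(int wal_receiver_status_interval_s,
                        int wal_sender_timeout_ms)
{
    /* with the sender timeout disabled, the walsender never times out */
    if (wal_sender_timeout_ms <= 0)
        return true;

    /* with periodic status updates disabled, nothing keeps it alive */
    if (wal_receiver_status_interval_s <= 0)
        return false;

    return (int64_t) wal_receiver_status_interval_s * 1000
        < wal_sender_timeout_ms / 2;
}
```

For example, a 10s status interval against a 60s sender timeout passes the check, while a 150s interval against a 2min sender timeout does not.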
and min_apply_delay is 10min. The worker on subscriber will wake up per 2.5min and
send keepalives, but walsender exits before the message arrives to publisher.
One idea to avoid that is to send the min_apply_delay subscriber option to publisher
and compare them, but it may be not sufficient. Because XXX_timeout GUC parameters
could be modified later.
# Anyway, I don't think such asymmetric setup is preferable.
2)
The issue reported by Vignesh-san[1] has still remained. I have already analyzed that [2],
the root cause is that flushed WAL is not updated and sent to the publisher. Even
if workers send keepalive messages to pub during the delay, the flushed position
cannot be modified.
I didn't look closer but the cause I guess is walsender doesn't die
until all WAL has been sent, while logical delay chokes replication
stream. Allowing walsender to finish ignoring replication status
wouldn't be great. One idea is to let logical workers send delaying
status.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
At Wed, 14 Dec 2022 16:30:28 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
On Wed, Dec 14, 2022 at 4:16 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
One idea to avoid that is to send the min_apply_delay subscriber option to publisher
and compare them, but it may be not sufficient. Because XXX_timeout GUC parameters
could be modified later.
How about restarting the apply worker if min_apply_delay changes? Will
that be sufficient?
Mmm. If the publisher knows that value, isn't it able to delay *sending*
data in the first place? This will resolve many known issues, including
walsender's un-terminatability, possible buffer-full and status packet
exchanging.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Thu, Dec 15, 2022 at 7:22 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Wed, 14 Dec 2022 16:30:28 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
On Wed, Dec 14, 2022 at 4:16 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
One idea to avoid that is to send the min_apply_delay subscriber option to publisher
and compare them, but it may be not sufficient. Because XXX_timeout GUC parameters
could be modified later.
How about restarting the apply worker if min_apply_delay changes? Will
that be sufficient?
Mmm. If publisher knows that value, isn't it able to delay *sending*
data in the first place? This will resolve many known issues including
walsender's un-terminatability, possible buffer-full and status packet
exchanging.
Yeah, but won't it change the meaning of this parameter? Say the
subscriber was busy enough that it doesn't need to add an additional
delay before applying a particular transaction(s) but adding a delay
to such a transaction on the publisher will actually make it take much
longer to reflect than expected. We probably need to name this
parameter as min_send_delay if we want to do what you are saying and
then I don't know if it serves the actual need and also it will be
different from what we do in physical standby.
--
With Regards,
Amit Kapila.
On Thu, Dec 15, 2022 at 7:16 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Wed, 14 Dec 2022 10:46:17 +0000, "Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com> wrote in
I have implemented and tested that workers wake up per wal_receiver_timeout/2
and send keepalive. Basically it works well, but I found two problems.
Do you have any good suggestions about them?
1)
With this PoC at present, workers calculate sending intervals based on its
wal_receiver_timeout, and it is suppressed when the parameter is set to zero.
This means that there is a possibility that walsender is timeout when wal_sender_timeout
in publisher and wal_receiver_timeout in subscriber is different.
Supposing that wal_sender_timeout is 2min, wal_receiver_timeout is 5min,
It seems to me wal_receiver_status_interval is better for this use.
It's enough for us to document that "wal_receiver_status_interval should be
shorter than wal_sender_timeout/2 especially when logical replication
connection is using min_apply_delay. Otherwise you will suffer
repeated termination of walsender".
This sounds reasonable to me.
and min_apply_delay is 10min. The worker on subscriber will wake up per 2.5min and
send keepalives, but walsender exits before the message arrives to publisher.
One idea to avoid that is to send the min_apply_delay subscriber option to publisher
and compare them, but it may be not sufficient. Because XXX_timeout GUC parameters
could be modified later.
# Anyway, I don't think such asymmetric setup is preferable.
2)
The issue reported by Vignesh-san[1] has still remained. I have already analyzed that [2],
the root cause is that flushed WAL is not updated and sent to the publisher. Even
if workers send keepalive messages to pub during the delay, the flushed position
cannot be modified.
I didn't look closer but the cause I guess is walsender doesn't die
until all WAL has been sent, while logical delay chokes replication
stream.
Right, I also think so.
Allowing walsender to finish ignoring replication status
wouldn't be great.
Yes, that would be ideal. But do you know why that is a must?
One idea is to let logical workers send delaying
status.
How can that help?
--
With Regards,
Amit Kapila.
At Thu, 15 Dec 2022 09:23:12 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
On Thu, Dec 15, 2022 at 7:16 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
Allowing walsender to finish ignoring replication status
wouldn't be great.
Yes, that would be ideal. But do you know why that is a must?
I believe a graceful shutdown (fast and smart) of a replication set is
expected to be in sync. Of course we can change the policy to allow the
walsender to stop before confirming all WAL have been applied. However,
the walsender has no idea whether the peer is intentionally delaying or not.
One idea is to let logical workers send delaying
status.
How can that help?
If the logical worker notifies "I'm intentionally pausing replication for
now, so if you want to shut down, please go ahead ignoring me",
the publisher can legally run a (kind of) dirty shutdown.
# It looks a bit too much, though..
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
At Thu, 15 Dec 2022 09:18:55 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
On Thu, Dec 15, 2022 at 7:22 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Wed, 14 Dec 2022 16:30:28 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
On Wed, Dec 14, 2022 at 4:16 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
One idea to avoid that is to send the min_apply_delay subscriber option to publisher
and compare them, but it may be not sufficient. Because XXX_timeout GUC parameters
could be modified later.
How about restarting the apply worker if min_apply_delay changes? Will
that be sufficient?
Mmm. If publisher knows that value, isn't it able to delay *sending*
data in the first place? This will resolve many known issues including
walsender's un-terminatability, possible buffer-full and status packet
exchanging.
Yeah, but won't it change the meaning of this parameter? Say the
It changes internally, but not on its face. The difference is
only in where the choking point exists. If ".._apply_delay" should
work literally, we should go the way Kuroda-san proposed. Namely,
"apply worker has received the data, but will delay applying it". If
we technically name it correctly for the current behavior, it would be
"min_receive_delay" or "min_choking_interval".
subscriber was busy enough that it doesn't need to add an additional
delay before applying a particular transaction(s) but adding a delay
to such a transaction on the publisher will actually make it take much
longer to reflect than expected. We probably need to name this
Isn't the name min_apply_delay implying the same behavior? Even though
the delay time will be a bit prolonged.
parameter as min_send_delay if we want to do what you are saying and
then I don't know if it serves the actual need and also it will be
different from what we do in physical standby.
In the first place, physical and logical replication work differently,
and the mechanism for delaying "apply" differs even in the current
state, in that the logical replication delay chokes the stream.
I guess they cannot be different in terms of normal operation. But I'm
not sure.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Thu, Dec 15, 2022 at 10:11 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Thu, 15 Dec 2022 09:18:55 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
On Thu, Dec 15, 2022 at 7:22 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Wed, 14 Dec 2022 16:30:28 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
On Wed, Dec 14, 2022 at 4:16 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
One idea to avoid that is to send the min_apply_delay subscriber option to publisher
and compare them, but it may be not sufficient. Because XXX_timeout GUC parameters
could be modified later.
How about restarting the apply worker if min_apply_delay changes? Will
that be sufficient?
Mmm. If the publisher knows that value, isn't it able to delay *sending*
data in the first place? This will resolve many known issues including
walsender's un-terminatability, possible buffer-full and status packet
exchanging.
Yeah, but won't it change the meaning of this parameter? Say the
It changes internally, but not on its face. The difference is
only in where the choking point exists. If ".._apply_delay" should
work literally, we should go the way Kuroda-san proposed. Namely,
"apply worker has received the data, but will delay applying it". If
we technically name it correctly for the current behavior, it would be
"min_receive_delay" or "min_choking_interval".
subscriber was busy enough that it doesn't need to add an additional
delay before applying a particular transaction(s) but adding a delay
to such a transaction on the publisher will actually make it take much
longer to reflect than expected. We probably need to name this
Isn't the name min_apply_delay implying the same behavior? Even though
the delay time will be a bit prolonged.
Sorry, I don't understand what you intend to say in this point. In
above, I mean that the currently proposed patch won't have such a
problem but if we apply delay on publisher the problem can happen.
parameter as min_send_delay if we want to do what you are saying and
then I don't know if it serves the actual need and also it will be
different from what we do in physical standby.
In the first place, physical and logical replication work differently,
and the mechanism for delaying "apply" differs even in the current
state, in that the logical replication delay chokes the stream.
I think the first preference is to make it work in a similar way (as
much as possible) to how this parameter works in physical standby and
if that is not at all possible then we may consider other approaches.
--
With Regards,
Amit Kapila.
At Thu, 15 Dec 2022 10:29:17 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
On Thu, Dec 15, 2022 at 10:11 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Thu, 15 Dec 2022 09:18:55 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
On Thu, Dec 15, 2022 at 7:22 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
subscriber was busy enough that it doesn't need to add an additional
delay before applying a particular transaction(s) but adding a delay
to such a transaction on the publisher will actually make it take much
longer to reflect than expected. We probably need to name this
Isn't the name min_apply_delay implying the same behavior? Even though
the delay time will be a bit prolonged.
Sorry, I don't understand what you intend to say in this point. In
above, I mean that the currently proposed patch won't have such a
problem but if we apply delay on publisher the problem can happen.
Are you saying that the sender-side delay makes the whole transaction
(if it has not been streamed out) stay on the sender side? If so... yeah,
I agree that it is undesirable.
parameter as min_send_delay if we want to do what you are saying and
then I don't know if it serves the actual need and also it will be
different from what we do in physical standby.
In the first place, physical and logical replication work differently,
and the mechanism for delaying "apply" differs even in the current
state, in that the logical replication delay chokes the stream.
I think the first preference is to make it work in a similar way (as
much as possible) to how this parameter works in physical standby and
if that is not at all possible then we may consider other approaches.
I understood that. However, I still think choking the stream on the
receiver side alone is kind of ugly, since it breaks the protocol
assumption that the in-band maintenance packets are processed in
a timely manner on the peer under normal operation (even though
some delays occur for natural reasons). In this regard, I am
inclined to favor Kuroda-san's proposal.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Dear Horiguchi-san, Amit,
Yes, that would be ideal. But do you know why that is a must?
I believe a graceful shutdown (fast and smart) of a replication set is expected to
be in sync. Of course we can change the policy to allow walsender to stop before
confirming all WAL have been applied. However walsender doesn't have an idea
of whether the peer is intentionally delaying or not.
This mechanism was introduced by 985bd7[1], which was needed to support a
"clean" switchover. I think it is needed for physical replication, but it is not
clear for the logical case.
When the postmaster is stopped in fast or smart mode, we expect that all
modifications have been received by the secondary. This requirement does not seem to have changed
since the initial commit.
Before 985bd7, the walsender exited just after sending the final WAL, which meant
that sometimes the last packet could not reach the secondary. So there was a possibility
of failing to reboot the primary as a new secondary because the new primary does
not have the last WAL record. To avoid this, walsender started waiting for
flush before exiting.
But in the case of logical replication, I'm not sure whether this limitation is
really needed or not. I think it may be OK that walsender exits without waiting,
in case of delaying applies. Because we don't have to consider the above issue
for logical replication.
[1]: https://github.com/postgres/postgres/commit/985bd7d49726c9f178558491d31a570d47340459
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
On Thu, Dec 15, 2022 at 1:42 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Dear Horiguchi-san, Amit,
Yes, that would be ideal. But do you know why that is a must?
I believe a graceful shutdown (fast and smart) of a replication set is expected to
be in sync. Of course we can change the policy to allow walsender to stop before
confirming all WAL have been applied. However walsender doesn't have an idea
of whether the peer is intentionally delaying or not.
This mechanism was introduced by 985bd7[1], which was needed to support a
"clean" switchover. I think it is needed for physical replication, but it is not
clear for the logical case.
When the postmaster is stopped in fast or smart mode, we expected that all
modifications were received by secondary. This requirement seems to be not changed
from the initial commit.
Before 985bd7, the walsender exited just after sending the final WAL, which meant
that sometimes the last packet could not reach to secondary. So there was a possibility
of failing to reboot the primary as a new secondary because the new primary does
not have the last WAL record. To avoid the above walsender started waiting for
flush before exiting.
But in the case of logical replication, I'm not sure whether this limitation is
really needed or not. I think it may be OK that walsender exits without waiting,
in case of delaying applies. Because we don't have to consider the above issue
for logical replication.
I also don't see the need for this mechanism for logical replication,
and in fact, why do we need to even wait for sending the existing WAL?
I think the reason why we don't need to wait for logical replication
is that after the restart, we always start sending WAL from the
location requested by the subscriber, or till the point where the
publisher knows the confirmed flush location of the subscriber.
Consider another case where, after restart, the publisher (node-1) wants to
act as a subscriber for the previous subscriber (node-2). Now, the new
subscriber (node-1) won't have a way to tell the new publisher
(node-2) to start from the location where node-1 went down, as
WAL locations between publisher and subscriber need not be the same.
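The restart rule described in the first half of this paragraph can be sketched as below. This is a minimal sketch, assuming the publisher forwards a stale or missing start request to the slot's confirmed flush position; it is not code from the patch, and `lsn_t`, `INVALID_LSN`, and `decoding_start_lsn()` are names made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for XLogRecPtr: a plain 64-bit WAL position on the publisher. */
typedef uint64_t lsn_t;

/* Assumed marker for "the subscriber did not request a position". */
#define INVALID_LSN ((lsn_t) 0)

/*
 * Pick the position from which the publisher resumes sending changes.
 * requested is what the subscriber asked for; confirmed_flush is what the
 * publisher recorded as already confirmed by that subscriber.
 */
lsn_t
decoding_start_lsn(lsn_t requested, lsn_t confirmed_flush)
{
    /* Nothing requested, or the request is older than what was confirmed. */
    if (requested == INVALID_LSN || requested < confirmed_flush)
        return confirmed_flush;

    return requested;
}
```

Both inputs are positions in the publisher's own WAL, which is why the same rule gives node-1 nothing to ask for once the roles are swapped: its old positions mean nothing in node-2's WAL.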
This brings us to the question of whether users can use logical
replication for the scenario where they want the old master to follow
the new master after the restart which we typically do in physical
replication, if so how?
Another related point to consider is what is the behavior of
synchronous replication when shutdown has been performed both in the
case of physical and logical replication especially when the
time-delayed replication feature is enabled?
--
With Regards,
Amit Kapila.
On Thu, Dec 15, 2022 at 11:22 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Thu, 15 Dec 2022 10:29:17 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
On Thu, Dec 15, 2022 at 10:11 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Thu, 15 Dec 2022 09:18:55 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
On Thu, Dec 15, 2022 at 7:22 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
subscriber was busy enough that it doesn't need to add an additional
delay before applying a particular transaction(s) but adding a delay
to such a transaction on the publisher will actually make it take much
longer to reflect than expected. We probably need to name this
Isn't the name min_apply_delay implying the same behavior? Even though
the delay time will be a bit prolonged.
Sorry, I don't understand what you intend to say in this point. In
above, I mean that the currently proposed patch won't have such a
problem but if we apply delay on publisher the problem can happen.
Are you saing about the sender-side delay lets the whole transaction
(if it have not streamed out) stay on the sender side?
It will not stay on the sender side forever but rather will be sent
after the min_apply_delay. The point I wanted to raise is that maybe
the delay won't need to be applied where we will end up delaying it.
Because when we apply the delay on apply side, it will take into
account the other load of apply side. I don't know how much it matters
but it appears logical to add the delay on applying side.
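The difference between the two placements can be made concrete with a small model. This is a minimal sketch under simplifying assumptions, not the patch's logic: all three function names are hypothetical, timestamps are plain milliseconds, and `lag_ms` lumps together the decoding, transfer, and apply backlog the subscriber would incur anyway.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Time the apply worker still has to wait for a transaction committed on
 * the publisher at commit_ts_ms, given the subscriber's current time.
 */
int64_t
remaining_apply_delay_ms(int64_t commit_ts_ms, int64_t now_ms,
                         int64_t min_apply_delay_ms)
{
    int64_t due_ms;

    assert(min_apply_delay_ms >= 0);

    due_ms = commit_ts_ms + min_apply_delay_ms;

    /* Already later than requested: no extra delay is added. */
    return (due_ms > now_ms) ? due_ms - now_ms : 0;
}

/* Delay on the apply side: natural lag counts toward the requested delay. */
int64_t
apply_side_latency_ms(int64_t delay_ms, int64_t lag_ms)
{
    return (lag_ms > delay_ms) ? lag_ms : delay_ms;
}

/* Delay on the sender side: the full delay is paid first, then the lag. */
int64_t
sender_side_latency_ms(int64_t delay_ms, int64_t lag_ms)
{
    return delay_ms + lag_ms;
}
```

In this model a busy subscriber with 3 s of lag and an 8 s delay shows the change after 8 s when the delay is applied on the subscriber, but after 11 s when it is applied before sending.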
--
With Regards,
Amit Kapila.
Dear Amit,
I also don't see the need for this mechanism for logical replication,
and in fact, why do we need to even wait for sending the existing WAL?
Do you mean that logical replication walsenders do not have to track WalSndCaughtUp and
any pending data in the output buffer?
I think the reason why we don't need to wait for logical replication
is that after the restart, we always start sending WAL from the
location requested by the subscriber, or till the point where the
publisher knows the confirmed flush location of the subscriber.
Consider another case where after restart publisher (node-1) wants to
act as a subscriber for the previous subscriber (node-2). Now, the new
subscriber (node-1) won't have a way to tell the new publisher
(node-2) that starts from the location where the node-1 went down as
WAL locations between publisher and subscriber need not be same.
You mean that this mechanism was made to support switchover, but logical
replication cannot do that because the new subscriber cannot definitively request
the changes that are unknown to it, right? That seems reasonable to me.
This brings us to the question of whether users can use logical
replication for the scenario where they want the old master to follow
the new master after the restart which we typically do in physical
replication, if so how?
Maybe to support such use-case, 2-way replication is needed
(but this is out-of-scope of this thread).
Another related point to consider is what is the behavior of
synchronous replication when shutdown has been performed both in the
case of physical and logical replication especially when the
time-delayed replication feature is enabled?
In physical replication without any failures, it seems that users can stop the primary
server even if applying is delayed on the secondary. This is because sent WAL
is immediately flushed on the secondary and walreceiver replies with its position. The
transaction has already been committed at that time, and the transported changes
will be applied on the secondary after the delay has elapsed.
IIUC we can achieve that when logical walsenders do not consider the remote status
while shutting down, but I want to hear other opinions, and we must confirm it by testing...
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
On Fri, Dec 16, 2022 at 12:11 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Dear Amit,
I also don't see the need for this mechanism for logical replication,
and in fact, why do we need to even wait for sending the existing WAL?
Do you mean that logical replication walsenders do not have to track WalSndCaughtUp and
any pending data in the output buffer?
I haven't checked the details but I think what you are saying is correct.
Another related point to consider is what is the behavior of
synchronous replication when shutdown has been performed both in the
case of physical and logical replication especially when the
time-delayed replication feature is enabled?
In physical replication without any failures, it seems that users can stop primary
server even if the applications are delaying on secondary. This is because sent WALs
are immediately flushed on secondary and walreceiver replies its position.
What happens when synchronous_commit's value is remote_apply and the
user has also set synchronous_standby_names to corresponding standby?
--
With Regards,
Amit Kapila.
Dear Amit,
Another related point to consider is what is the behavior of
synchronous replication when shutdown has been performed both in the
case of physical and logical replication especially when the
time-delayed replication feature is enabled?
In physical replication without any failures, it seems that users can stop primary
server even if the applications are delaying on secondary. This is because sent WALs
are immediately flushed on secondary and walreceiver replies its position.
What happens when synchronous_commit's value is remote_apply and the
user has also set synchronous_standby_names to corresponding standby?
Even if synchronous_commit is set to remote_apply, the primary server can be
shut down. The reason why walsender can exit is that it does not care about the
status whether WALs are "applied" or not. It just checks the "flushed" WAL
position, not applied one.
I think we should start another thread about changing the shutdown condition,
so I forked one[1].
[1]: /messages/by-id/TYAPR01MB586668E50FC2447AD7F92491F5E89@TYAPR01MB5866.jpnprd01.prod.outlook.com
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Hi,
On Thursday, December 15, 2022 12:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Dec 15, 2022 at 7:16 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com>
wrote:
At Wed, 14 Dec 2022 10:46:17 +0000, "Hayato Kuroda (Fujitsu)"
<kuroda.hayato@fujitsu.com> wrote in
I have implemented and tested that workers wake up per
wal_receiver_timeout/2 and send keepalive. Basically it works well, but I
found two problems.
Do you have any good suggestions about them?
1)
With this PoC at present, workers calculate sending intervals based
on its wal_receiver_timeout, and it is suppressed when the parameter is set
to zero.
This means that there is a possibility that walsender is timeout
when wal_sender_timeout in publisher and wal_receiver_timeout in
subscriber is different.
Supposing that wal_sender_timeout is 2min, wal_receiver_timeout is
5min,
It seems to me wal_receiver_status_interval is better for this use.
It's enough for us to document that "wal_r_s_interval should be
shorter than wal_sender_timeout/2 especially when logical replication
connection is using min_apply_delay. Otherwise you will suffer
repeated termination of walsender".
This sounds reasonable to me.
Okay, I changed the time interval to wal_receiver_status_interval
and added this description about timeout.
FYI, wal_receiver_status_interval by definition specifies
the minimum frequency for the WAL receiver process to send information
to the upstream. So I utilized the value for WaitLatch directly.
My descriptions of the documentation change follow it.
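The timeout interaction in the 2min/5min example quoted in this message can be checked with a small model. This is a rough sketch under simplifying assumptions, not the patch's implementation: both function names are hypothetical, all values are in milliseconds, zero means "disabled" for either setting, and the walsender is assumed to give up once a full wal_sender_timeout passes without a status message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Number of status messages a delayed worker sends while it waits for
 * delay_ms, waking up every interval_ms. An interval of zero is taken to
 * mean that status updates are disabled.
 */
int64_t
status_updates_during_delay(int64_t delay_ms, int64_t interval_ms)
{
    if (interval_ms <= 0)
        return 0;
    return delay_ms / interval_ms;
}

/* Does the walsender stay connected while the subscriber is delaying? */
bool
walsender_survives_delay(int64_t wal_sender_timeout_ms, int64_t interval_ms,
                         int64_t delay_ms)
{
    /* A timeout of zero is taken to mean the timeout is disabled. */
    if (wal_sender_timeout_ms <= 0)
        return true;

    /* The wait ends before the walsender would give up. */
    if (delay_ms < wal_sender_timeout_ms)
        return true;

    /* No status messages at all during a long wait. */
    if (interval_ms <= 0)
        return false;

    /* Status messages must arrive more often than the timeout. */
    return interval_ms < wal_sender_timeout_ms;
}
```

With wal_sender_timeout at 2min and a wake-up every 2.5min, the model reports a timeout during a 10min delay, while a 10s status interval keeps the connection alive.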
and min_apply_delay is 10min. The worker on subscriber will wake up
per 2.5min and send keepalives, but walsender exits before the message
arrives to publisher.
One idea to avoid that is to send the min_apply_delay subscriber
option to publisher and compare them, but it may be not sufficient.
Because XXX_timeout GUC parameters could be modified later.
# Anyway, I don't think such asymmetric setup is preferable.
2)
The issue reported by Vignesh-san[1] has still remained. I have
already analyzed that [2], the root cause is that flushed WAL is not
updated and sent to the publisher. Even if workers send keepalive
messages to pub during the delay, the flushed position cannot be modified.
I didn't look closer but the cause I guess is walsender doesn't die
until all WAL has been sent, while logical delay chokes replication
stream.
For issue (2), a new thread has been created independently of this thread in [1].
I'll leave any new changes on this point to that thread.
Attached is the updated patch.
Again, for the TAP tests I used a basic patch from another thread, shared in [2],
that wakes up logical replication workers.
[1]: /messages/by-id/TYAPR01MB586668E50FC2447AD7F92491F5E89@TYAPR01MB5866.jpnprd01.prod.outlook.com
[2]: /messages/by-id/20221122004119.GA132961@nathanxps13
Best Regards,
Takamichi Osumi
Attachments:
v11-0001-wake-up-logical-workers-as-needed-instead-of-rel.patch
From 4297dd4979a32ed4524986739d5b1653dcdc6568 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Mon, 21 Nov 2022 16:01:01 -0800
Subject: [PATCH v11 1/2] wake up logical workers as needed instead of relying
on periodic wakeups
---
src/backend/access/transam/xact.c | 3 ++
src/backend/commands/alter.c | 7 ++++
src/backend/commands/subscriptioncmds.c | 4 ++
src/backend/replication/logical/tablesync.c | 10 +++++
src/backend/replication/logical/worker.c | 46 +++++++++++++++++++++
src/include/replication/logicalworker.h | 3 ++
6 files changed, 73 insertions(+)
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index b7c7fd9f00..70ad51c591 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -47,6 +47,7 @@
#include "pgstat.h"
#include "replication/logical.h"
#include "replication/logicallauncher.h"
+#include "replication/logicalworker.h"
#include "replication/origin.h"
#include "replication/snapbuild.h"
#include "replication/syncrep.h"
@@ -2360,6 +2361,7 @@ CommitTransaction(void)
AtEOXact_PgStat(true, is_parallel_worker);
AtEOXact_Snapshot(true, false);
AtEOXact_ApplyLauncher(true);
+ AtEOXact_LogicalRepWorkers(true);
pgstat_report_xact_timestamp(0);
CurrentResourceOwner = NULL;
@@ -2860,6 +2862,7 @@ AbortTransaction(void)
AtEOXact_HashTables(false);
AtEOXact_PgStat(false, is_parallel_worker);
AtEOXact_ApplyLauncher(false);
+ AtEOXact_LogicalRepWorkers(false);
pgstat_report_xact_timestamp(0);
}
diff --git a/src/backend/commands/alter.c b/src/backend/commands/alter.c
index 10b6fe19a2..d095cd3ced 100644
--- a/src/backend/commands/alter.c
+++ b/src/backend/commands/alter.c
@@ -59,6 +59,7 @@
#include "commands/user.h"
#include "miscadmin.h"
#include "parser/parse_func.h"
+#include "replication/logicalworker.h"
#include "rewrite/rewriteDefine.h"
#include "tcop/utility.h"
#include "utils/builtins.h"
@@ -279,6 +280,12 @@ AlterObjectRename_internal(Relation rel, Oid objectId, const char *new_name)
if (strncmp(new_name, "regress_", 8) != 0)
elog(WARNING, "subscriptions created by regression test cases should have names starting with \"regress_\"");
#endif
+
+ /*
+ * Wake up the logical replication workers to handle this change
+ * quickly.
+ */
+ LogicalRepWorkersWakeupAtCommit(objectId);
}
else if (nameCacheId >= 0)
{
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index d673557ea4..d6993c26e5 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -34,6 +34,7 @@
#include "nodes/makefuncs.h"
#include "pgstat.h"
#include "replication/logicallauncher.h"
+#include "replication/logicalworker.h"
#include "replication/origin.h"
#include "replication/slot.h"
#include "replication/walreceiver.h"
@@ -1362,6 +1363,9 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
InvokeObjectPostAlterHook(SubscriptionRelationId, subid, 0);
+ /* Wake up the logical replication workers to handle this change quickly. */
+ LogicalRepWorkersWakeupAtCommit(subid);
+
return myself;
}
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 94e813ac53..509fe2eb19 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -105,6 +105,7 @@
#include "pgstat.h"
#include "replication/logicallauncher.h"
#include "replication/logicalrelation.h"
+#include "replication/logicalworker.h"
#include "replication/walreceiver.h"
#include "replication/worker_internal.h"
#include "replication/slot.h"
@@ -619,6 +620,15 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
if (started_tx)
{
+ /*
+ * If we are ready to enable two_phase mode, wake up the logical
+ * replication workers to handle this change quickly.
+ */
+ CommandCounterIncrement();
+ if (MySubscription->twophasestate == LOGICALREP_TWOPHASE_STATE_PENDING &&
+ AllTablesyncsReady())
+ LogicalRepWorkersWakeupAtCommit(MyLogicalRepWorker->subid);
+
CommitTransactionCommand();
pgstat_report_stat(true);
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 96772e4d73..722f796c7a 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -254,6 +254,8 @@ WalReceiverConn *LogRepWorkerWalRcvConn = NULL;
Subscription *MySubscription = NULL;
static bool MySubscriptionValid = false;
+static List *on_commit_wakeup_workers_subids = NIL;
+
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
@@ -4097,3 +4099,47 @@ reset_apply_error_context_info(void)
apply_error_callback_arg.remote_attnum = -1;
set_apply_error_context_xact(InvalidTransactionId, InvalidXLogRecPtr);
}
+
+/*
+ * Wakeup the stored subscriptions' workers on commit if requested.
+ */
+void
+AtEOXact_LogicalRepWorkers(bool isCommit)
+{
+ if (isCommit && on_commit_wakeup_workers_subids != NIL)
+ {
+ ListCell *subid;
+
+ LWLockAcquire(LogicalRepWorkerLock, LW_SHARED);
+ foreach(subid, on_commit_wakeup_workers_subids)
+ {
+ List *workers;
+ ListCell *worker;
+
+ workers = logicalrep_workers_find(lfirst_oid(subid), true);
+ foreach(worker, workers)
+ logicalrep_worker_wakeup_ptr((LogicalRepWorker *) lfirst(worker));
+ }
+ LWLockRelease(LogicalRepWorkerLock);
+ }
+
+ on_commit_wakeup_workers_subids = NIL;
+}
+
+/*
+ * Request wakeup of the workers for the given subscription ID on commit of the
+ * transaction.
+ *
+ * This is used to ensure that the workers process assorted changes as soon as
+ * possible.
+ */
+void
+LogicalRepWorkersWakeupAtCommit(Oid subid)
+{
+ MemoryContext oldcxt;
+
+ oldcxt = MemoryContextSwitchTo(TopTransactionContext);
+ on_commit_wakeup_workers_subids = list_append_unique_oid(on_commit_wakeup_workers_subids,
+ subid);
+ MemoryContextSwitchTo(oldcxt);
+}
diff --git a/src/include/replication/logicalworker.h b/src/include/replication/logicalworker.h
index cd1b6e8afc..2c2340d758 100644
--- a/src/include/replication/logicalworker.h
+++ b/src/include/replication/logicalworker.h
@@ -16,4 +16,7 @@ extern void ApplyWorkerMain(Datum main_arg);
extern bool IsLogicalWorker(void);
+extern void LogicalRepWorkersWakeupAtCommit(Oid subid);
+extern void AtEOXact_LogicalRepWorkers(bool isCommit);
+
#endif /* LOGICALWORKER_H */
--
2.30.0
v11-0002-Time-delayed-logical-replication-subscriber.patch
From 43efe7cbd6e4e7cb35be4aec46ba196d8580c3f9 Mon Sep 17 00:00:00 2001
From: Takamichi Osumi <osumi.takamichi@fujitsu.com>
Date: Thu, 22 Dec 2022 03:08:21 +0000
Subject: [PATCH v11 2/2] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction commit for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs only on WAL records for transaction begins. The main
reason is to avoid keeping a transaction open for a long time. Regular
and prepared transactions are covered. Streamed transactions are also
covered.
Author: Euler Taveira
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 9 ++
doc/src/sgml/config.sgml | 9 ++
doc/src/sgml/logical-replication.sgml | 7 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 54 ++++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 55 ++++++-
src/backend/replication/logical/worker.c | 151 +++++++++++++++++-
src/backend/utils/adt/timestamp.c | 32 ++++
src/bin/pg_dump/pg_dump.c | 16 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/datatype/timestamp.h | 2 +
src/include/utils/timestamp.h | 2 +
src/test/regress/expected/subscription.out | 176 +++++++++++++--------
src/test/regress/sql/subscription.sql | 25 +++
src/test/subscription/meson.build | 1 +
src/test/subscription/t/032_apply_delay.pl | 151 ++++++++++++++++++
21 files changed, 632 insertions(+), 88 deletions(-)
create mode 100644 src/test/subscription/t/032_apply_delay.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 9316b811ac..579b12f357 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7873,6 +7873,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int8</type>
+ </para>
+ <para>
+ The length of time (ms) to delay the application of changes.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 9eedab652d..8bb3a9ee17 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4753,6 +4753,15 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
the <filename>postgresql.conf</filename> file or on the server
command line.
</para>
+ <para>
+ For logical replication, the apply worker sends a Standby Status Update
+ message to the corresponding publisher per specified length of time by
+ this parameter, when <literal>min_apply_delay</literal> is defined.
+ Therefore, this parameter should be shorter than
+ <literal>wal_sender_timeout</literal> on the publisher. Otherwise, the
+ walsender repeatedly terminates due to timeout during the delay of
+ the subscriber.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 7fdf08b59d..5471e1aca9 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,13 @@
target table.
</para>
+ <para>
+ The subscriber replication can be instructed to lag behind the publisher
+ side changes by specifying the <literal>min_apply_delay</literal>
+ subscription parameter. See <xref linkend="sql-createsubscription"/> for
+ details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 1e8d72062b..d63aff1b90 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index f9a1776380..5b8df483c7 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -333,7 +333,47 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. As
+ with the physical replication feature
+ (<xref linkend="guc-recovery-min-apply-delay"/>), it can be useful to
+ have a time-delayed logical replica. This parameter lets the user
+ delay the application of changes by a specified amount of time. If this
+ value is specified without units, it is taken as milliseconds. The
+ default is zero (no delay).
+ </para>
+ <para>
+ The delay is applied before a transaction starts being applied on the
+ subscriber, and only after the initial table synchronization has finished. If the
+ replication lag between publisher and subscriber already exceeds the value
+ of this parameter, no further delay is added. Note that the
+ delay is calculated between the WAL time stamp as written on the
+ publisher and the current time on the subscriber. Time spent in logical
+ decoding and in transferring the transaction may reduce the actual wait
+ time. If the system clocks on publisher and subscriber are not
+ synchronized, changes may be applied earlier than expected,
+ but this is not a major issue because this parameter is typically much
+ larger than the time deviations between servers. Note that if this
+ parameter is set to a long delay, replication will stop if the
+ replication slot falls behind the current LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
+ <warning>
+ <para>
+ Delaying replication means there can be a much longer time between making
+ a change on the publisher and that change being committed on the subscriber.
+ This can significantly affect synchronous replication.
+ See <xref linkend="guc-synchronous-commit"/>.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -456,6 +496,18 @@ CREATE SUBSCRIPTION mysub
PUBLICATION insert_only
WITH (enabled = false);
</programlisting></para>
+
+ <para>
+ Create a subscription to a remote server that replicates tables in
+ the <literal>mypub</literal> publication and starts replicating immediately
+ on commit. Pre-existing data is not copied. The application of changes is
+ delayed by 4 hours.
+<programlisting>
+CREATE SUBSCRIPTION mysub
+ CONNECTION 'host=192.168.1.50 port=5432 user=foo dbname=foodb'
+ PUBLICATION mypub
+ WITH (copy_data = false, min_apply_delay = '4h');
+</programlisting></para>
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a506fc3ec8..b29ed67715 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -64,6 +64,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->oid = subid;
sub->dbid = subform->subdbid;
sub->skiplsn = subform->subskiplsn;
+ sub->minapplydelay = subform->subminapplydelay;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
sub->enabled = subform->subenabled;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 2d8104b090..85aa1bd6f5 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1299,9 +1299,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index d6993c26e5..fdbb9dc12c 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -48,6 +48,7 @@
#include "utils/memutils.h"
#include "utils/pg_lsn.h"
#include "utils/syscache.h"
+#include "utils/timestamp.h"
/*
* Options that can be specified by the user in CREATE/ALTER SUBSCRIPTION
@@ -66,6 +67,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +92,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int64 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -146,6 +149,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +329,43 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ char *val,
+ *tmp;
+ Interval *interval;
+ int64 ms;
+
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ tmp = defGetString(defel);
+
+ /*
+ * If no unit was specified, explicitly add 'ms'; otherwise, the
+ * interval_in function would assume 'seconds'.
+ */
+ if (strspn(tmp, "0123456789") == strlen(tmp))
+ val = psprintf("%sms", tmp);
+ else
+ val = tmp;
+
+ interval = DatumGetIntervalP(DirectFunctionCall3(interval_in,
+ CStringGetDatum(val),
+ ObjectIdGetDatum(InvalidOid),
+ Int32GetDatum(-1)));
+
+ ms = interval2ms(interval);
+ if (ms < 0 || ms > INT_MAX)
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("%lld ms is outside the valid range for parameter \"%s\"",
+ (long long) ms, "min_apply_delay"));
+
+ opts->min_apply_delay = ms;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -560,7 +602,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -625,6 +668,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int64GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1054,7 +1098,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1111,6 +1155,13 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int64GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 722f796c7a..001de480bb 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -259,6 +259,18 @@ static List *on_commit_wakeup_workers_subids = NIL;
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
+/*
+ * To avoid a walsender timeout during time-delayed replication, the worker
+ * process must keep sending feedback messages during the delay. Meanwhile,
+ * the feature delays the apply before starting the transaction, so we do
+ * not write WAL for the suspended changes during the wait. Hence, when the
+ * worker process sends feedback during the delay, it must avoid overwriting
+ * the flushed and written LSN positions with the latest received
+ * LSN.
+ */
+static bool in_delaying_apply = false;
+static XLogRecPtr last_received = InvalidXLogRecPtr;
+
/* fields valid only when processing streamed transaction */
static bool in_streamed_transaction = false;
@@ -328,6 +340,8 @@ static void maybe_reread_subscription(void);
static void DisableSubscriptionAndExit(void);
+static void maybe_delay_apply(TimestampTz ts);
+
/* prototype needed because of stream_commit */
static void apply_dispatch(StringInfo s);
@@ -833,6 +847,97 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When min_apply_delay parameter is set on the subscriber, we wait long enough
+ * to make sure a transaction is applied at least that interval behind the
+ * publisher.
+ *
+ * While physical replication applies the delay at commit time, this
+ * feature applies the delay before starting to apply the next
+ * transaction. This is mainly because keeping a transaction that has
+ * performed write operations open for a long time causes problems such as
+ * bloat and locks.
+ *
+ * The min_apply_delay parameter will take effect only after all tables are in
+ * READY state.
+ */
+static void
+maybe_delay_apply(TimestampTz ts)
+{
+ /* Nothing to do if no delay set */
+ if (MySubscription->minapplydelay <= 0)
+ return;
+
+ /*
+ * The min_apply_delay parameter is ignored until all tablesync workers
+ * have reached READY state. If we allow the delay during the catchup
+ * phase, once we reach the limit of tablesync workers, it will impose a
+ * delay for each subsequent worker. It means it will take a long time to
+ * finish the initial table synchronization.
+ */
+ if (!AllTablesyncsReady())
+ return;
+
+ /*
+ * Suppress overwrites of flushed and written positions by the latest
+ * LSN in send_feedback().
+ */
+ in_delaying_apply = true;
+
+ while (true)
+ {
+ long diffms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /*
+ * Before calculating the time duration, reload the catalog if needed.
+ */
+ if (!in_remote_transaction && !in_streamed_transaction)
+ {
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+ }
+
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(),
+ TimestampTzPlusMilliseconds(ts, MySubscription->minapplydelay));
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "logical replication apply delay: %ld ms", diffms);
+
+ if (wal_receiver_status_interval > 0
+ && diffms > wal_receiver_status_interval * 1000L) /* GUC is in seconds */
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ wal_receiver_status_interval * 1000L,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(last_received, true, false);
+ }
+ else
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ }
+
+ in_delaying_apply = false;
+}
+
/*
* Handle BEGIN message.
*/
@@ -844,6 +949,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ maybe_delay_apply(begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -898,6 +1006,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ maybe_delay_apply(begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1120,6 +1231,19 @@ apply_handle_stream_prepare(StringInfo s)
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
+ /*
+ * Should we delay the current prepared transaction?
+ *
+ * Although the delay is applied in BEGIN PREPARE messages, streamed
+ * prepared transactions apply the delay in a STREAM PREPARE message.
+ * That's ok because no changes have been applied yet
+ * (apply_spooled_messages() will do it). The STREAM START message does
+ * not contain a prepare time (it will be available when the in-progress
+ * prepared transaction finishes), hence, it was not possible to apply a
+ * delay at that time.
+ */
+ maybe_delay_apply(prepare_data.prepare_time);
+
/* Replay all the spooled operations. */
apply_spooled_messages(prepare_data.xid, prepare_data.prepare_lsn);
@@ -1511,6 +1635,19 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "received commit for streamed transaction %u", xid);
+ /*
+ * Should we delay the current transaction?
+ *
+ * Although the delay is applied in BEGIN messages, streamed transactions
+ * apply the delay in a STREAM COMMIT message. That's ok because no
+ * changes have been applied yet (apply_spooled_messages() will do it).
+ * The STREAM START message would be a natural choice for this delay but
+ * there is no commit time yet (it will be available when the in-progress
+ * transaction finishes), hence, it was not possible to apply a delay at
+ * that time.
+ */
+ maybe_delay_apply(commit_data.committime);
+
apply_spooled_messages(xid, commit_data.commit_lsn);
apply_handle_commit_internal(&commit_data);
@@ -2713,7 +2850,7 @@ UpdateWorkerStats(XLogRecPtr last_lsn, TimestampTz send_time, bool reply)
* Apply main loop.
*/
static void
-LogicalRepApplyLoop(XLogRecPtr last_received)
+LogicalRepApplyLoop(void)
{
TimestampTz last_recv_timestamp = GetCurrentTimestamp();
bool ping_sent = false;
@@ -3002,8 +3139,11 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * During time-delayed replication, avoid reporting to the publisher
+ * that the suspended latest LSN has already been flushed and written.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && !in_delaying_apply)
flushpos = writepos = recvpos;
if (writepos < last_writepos)
@@ -3589,11 +3729,11 @@ start_table_sync(XLogRecPtr *origin_startpos, char **myslotname)
* of system resource error and are not repeatable.
*/
static void
-start_apply(XLogRecPtr origin_startpos)
+start_apply(void)
{
PG_TRY();
{
- LogicalRepApplyLoop(origin_startpos);
+ LogicalRepApplyLoop();
}
PG_CATCH();
{
@@ -3851,7 +3991,8 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- start_apply(origin_startpos);
+ last_received = origin_startpos;
+ start_apply();
proc_exit(0);
}
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index 3f2508c0c4..4b61b15821 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -2437,6 +2437,38 @@ interval_cmp_internal(const Interval *interval1, const Interval *interval2)
return int128_compare(span1, span2);
}
+/*
+ * Returns the number of milliseconds in the specified Interval.
+ */
+int64
+interval2ms(const Interval *interval)
+{
+ int64 days;
+ int64 ms;
+ int64 result;
+
+ days = interval->month * INT64CONST(30);
+ days += interval->day;
+
+ /*
+ * Convert the day count to milliseconds, using the overflow-checking
+ * helpers so that out-of-range values raise an error.
+ */
+ if (pg_mul_s64_overflow(days, MSECS_PER_DAY, &result))
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+
+ /* Add the time portion (converted from microseconds to ms) to the result. */
+ ms = interval->time / INT64CONST(1000);
+ if (pg_add_s64_overflow(result, ms, &result))
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+
+ return result;
+}
+
Datum
interval_eq(PG_FUNCTION_ARGS)
{
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 44d957c038..31c4d57764 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4546,9 +4547,14 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ {
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBufferStr(query, " 0 AS subminapplydelay\n");
+ }
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4582,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4613,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminapplydelay =
+ strtoi64(PQgetvalue(res, i, i_subminapplydelay), NULL, 10);
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4685,6 +4694,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '" INT64_FORMAT " ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 436ac5bb98..175b4a72a4 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int64 subminapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index df166365e8..6512d6059a 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6474,7 +6474,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6516,10 +6516,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min apply delay (ms)"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 2a3921937c..df2879cbf3 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1880,7 +1880,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3202,7 +3202,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 7b98714f30..3ff890f897 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -70,6 +70,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
XLogRecPtr subskiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 subminapplydelay; /* Replication apply delay (ms) */
+
NameData subname; /* Name of the subscription */
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
@@ -119,6 +121,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 minapplydelay; /* Replication apply delay (ms) */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
bool enabled; /* Indicates if the subscription is enabled */
diff --git a/src/include/datatype/timestamp.h b/src/include/datatype/timestamp.h
index d155f1b03b..d5bbfad1c4 100644
--- a/src/include/datatype/timestamp.h
+++ b/src/include/datatype/timestamp.h
@@ -127,6 +127,8 @@ struct pg_itm_in
#define SECS_PER_MINUTE 60
#define MINS_PER_HOUR 60
+#define MSECS_PER_DAY INT64CONST(86400000)
+
#define USECS_PER_DAY INT64CONST(86400000000)
#define USECS_PER_HOUR INT64CONST(3600000000)
#define USECS_PER_MINUTE INT64CONST(60000000)
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index 7fd0b58825..1d6079057e 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -102,6 +102,8 @@ extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern int64 interval2ms(const Interval *interval);
+
extern TimestampTz time_t_to_timestamptz(pg_time_t tm);
extern pg_time_t timestamptz_to_time_t(TimestampTz t);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index c13d218dcf..69a5193aa5 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -135,10 +135,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -155,10 +155,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -167,10 +167,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -202,10 +202,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -239,19 +239,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -263,19 +263,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -290,10 +290,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more then once
@@ -308,10 +308,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -347,10 +347,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -359,10 +359,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -372,10 +372,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,18 +388,58 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid input syntax for type interval: "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1000 ms is outside the valid range for parameter "min_apply_delay"
+-- success -- 123 ms
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- success -- 86400000 ms
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 86400000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- success -- interval is converted into ms and stored as integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = '4h 27min 35s');
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 16055000 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index eaeade8cce..7a4e818857 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -275,6 +275,31 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- success -- 123 ms
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+
+\dRs+
+
+-- success -- 86400000 ms
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1d');
+
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
+-- success -- interval is converted into ms and stored as integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = '4h 27min 35s');
+
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/meson.build b/src/test/subscription/meson.build
index c28121f26e..f136f87537 100644
--- a/src/test/subscription/meson.build
+++ b/src/test/subscription/meson.build
@@ -38,6 +38,7 @@ tests += {
't/029_on_error.pl',
't/030_origin.pl',
't/031_column_list.pl',
+ 't/032_apply_delay.pl',
't/100_bugs.pl',
],
},
diff --git a/src/test/subscription/t/032_apply_delay.pl b/src/test/subscription/t/032_apply_delay.pl
new file mode 100644
index 0000000000..8f8ce23f1b
--- /dev/null
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,151 @@
+
+# Copyright (c) 2022, PostgreSQL Global Development Group
+
+# Test replication apply delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf('postgresql.conf',
+ "log_min_messages = debug2");
+$node_subscriber->start;
+
+# Create some preexisting content on publisher.
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b varchar, c timestamptz DEFAULT now())");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (1, 'foo'), (2, 'bar')");
+
+# Setup structure on subscriber.
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b text, c timestamptz DEFAULT now(), d bigint DEFAULT 999)"
+);
+
+# Setup logical replication.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+
+# column c must not be published because we want to compare the time difference.
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE test_tab (a, b)");
+
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = on, min_apply_delay = '3s')"
+);
+
+# Wait for initial table sync to finish.
+$node_subscriber->wait_for_subscription_sync($node_publisher, $appname);
+
+# Check log starting now for logical replication apply delay.
+my $log_location = -s $node_subscriber->logfile;
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(2|1|2), 'check initial data was copied to subscriber');
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (3, 'baz')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (4, 'abc')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (5, 'def')");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(5|1|5), 'check if the new rows were applied to subscriber');
+
+check_apply_delay_log("logical replication apply delay");
+
+check_apply_delay_time('5', '3');
+
+# Test streamed transaction.
+# Insert, update and delete enough rows to exceed 64kB limit.
+$node_publisher->safe_psql(
+ 'postgres', q{
+BEGIN;
+INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(6, 5000) s(i);
+UPDATE test_tab SET b = md5(b) WHERE mod(a, 2) = 0;
+DELETE FROM test_tab WHERE mod(a, 3) = 0;
+COMMIT;
+});
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(3334|1|5000), 'check if the new rows were applied to subscriber');
+
+check_apply_delay_log("logical replication apply delay");
+
+check_apply_delay_time('5000', '3');
+
+# Test ALTER SUBSCRIPTION. Delay 86460 seconds (1 day 1 minute).
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86460000)"
+);
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (0, 'foobar')");
+
+# Disable subscription. worker should die immediately.
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub DISABLE;"
+);
+
+# Wait until worker dies.
+my $sub_query =
+ "SELECT count(1) = 0 FROM pg_stat_subscription WHERE subname = 'tap_sub' AND pid IS NOT NULL;";
+$node_subscriber->poll_query_until('postgres', $sub_query)
+ or die "Timed out while waiting for subscriber to die";
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
+
+sub check_apply_delay_log
+{
+ my $message = shift;
+ my $old_log_location = $log_location;
+
+ $log_location = $node_subscriber->wait_for_log(qr/$message/, $log_location);
+
+ cmp_ok($log_location, '>', $old_log_location,
+ "logfile contains triggered logical replication apply delay"
+ );
+}
+
+sub check_apply_delay_time
+{
+ my ($primary_key, $expected_diffs) = @_;
+
+ my $inserted_time_on_pub = $node_publisher->safe_psql('postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ my $inserted_time_on_sub = $node_subscriber->safe_psql('postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ cmp_ok($inserted_time_on_sub - $inserted_time_on_pub, '>', $expected_diffs,
+ "The tuple on the subscriber was modified later than the publisher");
+}
--
2.30.0
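The regression tests in the patch above expect an interval such as '4h 27min 35s' to be stored as 16055000 ms. A minimal sketch of that arithmetic, with a range check, could look like the following. The helper name delay_to_ms, the split into day/hour/minute/second components, and the assumption that the stored value must fit in a non-negative 32-bit integer are illustrative and are not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define MS_PER_SEC  INT64_C(1000)
#define MS_PER_MIN  (INT64_C(60) * MS_PER_SEC)
#define MS_PER_HOUR (INT64_C(60) * MS_PER_MIN)
#define MS_PER_DAY  (INT64_C(24) * MS_PER_HOUR)

/*
 * Hypothetical helper (not part of the patch): convert a delay given as
 * days/hours/minutes/seconds into milliseconds.
 *
 * The sum is computed in 64 bits, which cannot overflow for 32-bit inputs,
 * and is then checked against the assumed storage range [0, INT32_MAX].
 * Returns false, leaving *result_ms untouched, when the value is out of range.
 */
static bool
delay_to_ms(int32_t days, int32_t hours, int32_t mins, int32_t secs,
            int32_t *result_ms)
{
    int64_t total = days * MS_PER_DAY
                  + hours * MS_PER_HOUR
                  + mins * MS_PER_MIN
                  + secs * MS_PER_SEC;

    if (total < 0 || total > INT32_MAX)
        return false;

    *result_ms = (int32_t) total;
    return true;
}
```

With this sketch, delay_to_ms(0, 4, 27, 35, &ms) yields 16055000 and delay_to_ms(1, 0, 0, 0, &ms) yields 86400000, matching the expected output above. A delay of 25 days (2160000000 ms) does not fit in 32 bits and is rejected, as is any negative total.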
On Thursday, December 22, 2022 3:02 PM Takamichi Osumi (Fujitsu) <osumi.takamichi@fujitsu.com> wrote:
Attached the updated patch.
Again, I used one basic patch in another thread to wake up logical replication
worker shared in [2] for TAP tests.
The v11 caused a cfbot failure in [1]. But, failed tests looked irrelevant
to the feature to me at present.
While waiting for another test execution of cfbot, I'd like to check the detailed reason
and update the patch if necessary.
[1]: https://cirrus-ci.com/task/4580705867399168
Best Regards,
Takamichi Osumi
On Fri, Dec 23, 2022 at 9:16 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
On Thursday, December 22, 2022 3:02 PM Takamichi Osumi (Fujitsu) <osumi.takamichi@fujitsu.com> wrote:
Attached the updated patch.
Again, I used one basic patch in another thread to wake up logical replication
worker shared in [2] for TAP tests.
The v11 caused a cfbot failure in [1]. But, failed tests looked irrelevant
to the feature to me at present.
I have done some review for the patch and I have a few comments.
1.
A.
+ <literal>wal_sender_timeout</literal> on the publisher. Otherwise, the
+ walsender repeatedly terminates due to timeout during the delay of
+ the subscriber.
B.
+/*
+ * In order to avoid walsender's timeout during time delayed replication,
+ * it's necessaary to keep sending feedbacks during the delay from the worker
+ * process. Meanwhile, the feature delays the apply before starting the
+ * transaction and thus we don't write WALs for the suspended changes during
+ * the wait. Hence, in the case the worker process sends a feedback during the
+ * delay, avoid having positions of the flushed and apply LSN overwritten by
+ * the latest LSN.
+ */
- These two statements seem to conflict: if we are
sending feedback, why would the walsender time out?
- Typo /necessaary/necessary
2.
+ *
+ * During the time delayed replication, avoid reporting the suspeended
+ * latest LSN are already flushed and written, to the publisher.
*/
Typo /suspeended/suspended
3.
+ if (wal_receiver_status_interval > 0
+ && diffms > wal_receiver_status_interval)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ (long) wal_receiver_status_interval,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(last_received, true, false);
+ }
+ else
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
I think here we should add some comments to explain about sending
feedback, something like what we have explained at the time of
defining the "in_delaying_apply" variable.
4.
+ * Although the delay is applied in BEGIN messages, streamed transactions
+ * apply the delay in a STREAM COMMIT message. That's ok because no
+ * changes have been applied yet (apply_spooled_messages() will do it).
+ * The STREAM START message would be a natural choice for this delay but
+ * there is no commit time yet (it will be available when the in-progress
+ * transaction finishes), hence, it was not possible to apply a delay at
+ * that time.
+ */
+ maybe_delay_apply(commit_data.committime);
I am wondering how this will interact with the parallel apply worker
where we do not spool the data in file? How are we going to get the
commit time of the transaction without applying the changes?
5.
+ /*
+ * The following operations use these special functions to detect
+ * overflow. Number of ms per informed days.
+ */
This comment doesn't make much sense, I think this needs to be rephrased.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
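The wait logic quoted in comment 3 can be read as: sleep in slices no longer than the status interval and send feedback after each full slice, so that the walsender keeps hearing from the worker during the delay. A minimal sketch of that decision, detached from PostgreSQL internals, is shown below. The names WaitStep, next_wait_step and count_feedback_messages are hypothetical, and the sketch assumes both arguments are already expressed in milliseconds; any unit conversion for the real setting is left out.

```c
#include <stdbool.h>

/* Hypothetical result type: how long to sleep and whether to report after. */
typedef struct WaitStep
{
    long sleep_ms;       /* length of the next sleep */
    bool send_feedback;  /* send a status message once the sleep ends? */
} WaitStep;

/*
 * Decide the next sleep slice. If a status interval is configured and the
 * remaining delay is longer than it, sleep for one interval and then send
 * feedback; otherwise sleep for whatever delay is left.
 */
static WaitStep
next_wait_step(long remaining_ms, int status_interval_ms)
{
    WaitStep step;

    if (status_interval_ms > 0 && remaining_ms > status_interval_ms)
    {
        step.sleep_ms = status_interval_ms;
        step.send_feedback = true;
    }
    else
    {
        step.sleep_ms = remaining_ms;
        step.send_feedback = false;
    }
    return step;
}

/* Simulate a whole delay and count how many feedback messages go out. */
static int
count_feedback_messages(long delay_ms, int status_interval_ms)
{
    int  sent = 0;
    long remaining = delay_ms;

    while (remaining > 0)
    {
        WaitStep step = next_wait_step(remaining, status_interval_ms);

        remaining -= step.sleep_ms;
        if (step.send_feedback)
            sent++;
    }
    return sent;
}
```

Under these assumptions, a 3000 ms delay with a 1000 ms interval produces two feedback messages (after the first and second slices) and a final silent slice; with the interval set to 0 no feedback is sent at all, which is the case where a walsender timeout could still occur.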
On Mon, Dec 26, 2022 at 2:12 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Fri, Dec 23, 2022 at 9:16 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
4.
+ * Although the delay is applied in BEGIN messages, streamed transactions
+ * apply the delay in a STREAM COMMIT message. That's ok because no
+ * changes have been applied yet (apply_spooled_messages() will do it).
+ * The STREAM START message would be a natural choice for this delay but
+ * there is no commit time yet (it will be available when the in-progress
+ * transaction finishes), hence, it was not possible to apply a delay at
+ * that time.
+ */
+ maybe_delay_apply(commit_data.committime);
I am wondering how this will interact with the parallel apply worker
where we do not spool the data in file? How are we going to get the
commit time of the transaction without applying the changes?
There is no sane way to do this. So, I think these features won't work
together, we can disable parallelism when this is active. Considering
that parallel apply is to speed up the transactions apply and this
feature is to slow down the apply, so even if they don't work together
that should be okay. Does that make sense?
--
With Regards,
Amit Kapila.
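One way to read "disable parallelism when this is active" is to reject the option combination up front. The sketch below only illustrates that idea: the StreamMode enum, the function check_delay_options and the error text are all hypothetical, and the thread does not say whether the patch would raise an error or silently fall back to non-parallel apply.

```c
#include <stddef.h>

/* Hypothetical stand-in for the subscription's streaming setting. */
typedef enum StreamMode
{
    STREAM_OFF,
    STREAM_ON,
    STREAM_PARALLEL
} StreamMode;

/*
 * Return NULL when the options are compatible, or a message describing the
 * conflict when a non-zero apply delay is combined with parallel apply.
 */
static const char *
check_delay_options(int min_apply_delay_ms, StreamMode streaming)
{
    if (min_apply_delay_ms > 0 && streaming == STREAM_PARALLEL)
        return "min_apply_delay > 0 cannot be combined with parallel apply";

    return NULL;
}
```

A delay of 0 leaves parallel apply available, and a non-zero delay remains compatible with spooled streaming, where the commit time is known before any change is applied.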
On Mon, Dec 26, 2022 at 2:44 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Dec 26, 2022 at 2:12 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Fri, Dec 23, 2022 at 9:16 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
4.
+ * Although the delay is applied in BEGIN messages, streamed transactions
+ * apply the delay in a STREAM COMMIT message. That's ok because no
+ * changes have been applied yet (apply_spooled_messages() will do it).
+ * The STREAM START message would be a natural choice for this delay but
+ * there is no commit time yet (it will be available when the in-progress
+ * transaction finishes), hence, it was not possible to apply a delay at
+ * that time.
+ */
+ maybe_delay_apply(commit_data.committime);
I am wondering how this will interact with the parallel apply worker
where we do not spool the data in file? How are we going to get the
commit time of the transaction without applying the changes?
There is no sane way to do this.
Yeah, there is no sane way to do it.
So, I think these features won't work
together, we can disable parallelism when this is active. Considering
that parallel apply is to speed up the transactions apply and this
feature is to slow down the apply, so even if they don't work together
that should be okay. Does that make sense?
Yes, this makes sense.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Mon, Dec 26, 2022 at 7:37 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Mon, Dec 26, 2022 at 2:44 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Dec 26, 2022 at 2:12 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Fri, Dec 23, 2022 at 9:16 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
4.
+ * Although the delay is applied in BEGIN messages, streamed transactions
+ * apply the delay in a STREAM COMMIT message. That's ok because no
+ * changes have been applied yet (apply_spooled_messages() will do it).
+ * The STREAM START message would be a natural choice for this delay but
+ * there is no commit time yet (it will be available when the in-progress
+ * transaction finishes), hence, it was not possible to apply a delay at
+ * that time.
+ */
+ maybe_delay_apply(commit_data.committime);
I am wondering how this will interact with the parallel apply worker
where we do not spool the data in file? How are we going to get the
commit time of the transaction without applying the changes?
There is no sane way to do this.
Yeah, there is no sane way to do it.
So, I think these features won't work
together, we can disable parallelism when this is active. Considering
that parallel apply is to speed up the transactions apply and this
feature is to slow down the apply, so even if they don't work together
that should be okay. Does that make sense?
Yes, this makes sense.
BTW, the blocking problem with this patch is to deal with shutdown as
discussed in the thread [1]. In short, the problem is that at
shutdown, we wait for walsender to send all pending data and ensure
all data is flushed in the remote node. But, if the other node is
waiting due to a time-delayed apply then shutdown won't be successful.
It would be really great if you can let us know your thoughts in the
thread [1] as that can help to move this work forward.
[1]: /messages/by-id/TYAPR01MB586668E50FC2447AD7F92491F5E89@TYAPR01MB5866.jpnprd01.prod.outlook.com
--
With Regards,
Amit Kapila.
On Tue, Dec 27, 2022 at 9:33 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
BTW, the blocking problem with this patch is to deal with shutdown as
discussed in the thread [1].
I will have a look.
In short, the problem is that at
shutdown, we wait for walsender to send all pending data and ensure
all data is flushed in the remote node. But, if the other node is
waiting due to a time-delayed apply then shutdown won't be successful.
It would be really great if you can let us know your thoughts in the
thread [1] as that can help to move this work forward.
Okay, so you mean to say that with logical the shutdown will be
delayed until all the changes are applied on the subscriber but the
same is not true for physical standby? Is it because on physical
standby we flush the WAL before applying?
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Tue, Dec 27, 2022 at 11:42 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Tue, Dec 27, 2022 at 9:33 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
BTW, the blocking problem with this patch is to deal with shutdown as
discussed in the thread [1].
I will have a look.
Thanks!
In short, the problem is that at
shutdown, we wait for walsender to send all pending data and ensure
all data is flushed in the remote node. But, if the other node is
waiting due to a time-delayed apply then shutdown won't be successful.
It would be really great if you can let us know your thoughts in the
thread [1] as that can help to move this work forward.
Okay, so you mean to say that with logical the shutdown will be
delayed until all the changes are applied on the subscriber but the
same is not true for physical standby?
Right.
Is it because on physical
standby we flush the WAL before applying?
Yes, the walreceiver first flushes the WAL before applying.
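The difference can be sketched with a simplified model, assuming the sender only finishes shutdown once everything it sent is reported as flushed. All type and function names below are hypothetical stand-ins, not PostgreSQL APIs; `Lsn` stands in for `XLogRecPtr`.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t Lsn;           /* stand-in for XLogRecPtr */

/* Positions a receiver reports back to the sender (simplified). */
typedef struct
{
    Lsn received;               /* last position received from the sender */
    Lsn flushed;                /* last position reported as flushed */
} ReceiverFeedback;

/* Physical standby: WAL is flushed on receipt, whatever the replay delay. */
ReceiverFeedback
physical_feedback(Lsn received, Lsn replayed)
{
    ReceiverFeedback fb = {received, received};

    (void) replayed;            /* replay progress does not hold back flush */
    return fb;
}

/* Delayed logical subscriber: only applied changes count as flushed. */
ReceiverFeedback
delayed_logical_feedback(Lsn received, Lsn applied)
{
    ReceiverFeedback fb = {received, applied};

    return fb;
}

/* Simplified shutdown test: everything sent must be reported as flushed. */
bool
sender_can_finish_shutdown(Lsn sent, ReceiverFeedback fb)
{
    return fb.flushed >= sent;
}
```

In this model a physical standby that has received up to position 100 but replayed only up to 40 still lets the sender exit, while a delayed logical subscriber in the same state does not until its apply catches up.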
--
With Regards,
Amit Kapila.
Hi hackers,
On Thursday, December 22, 2022 3:02 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
Attached the updated patch.
Again, I used one basic patch in another thread to wake up logical replication
worker shared in [2] for TAP tests.
The v11 caused a cfbot failure in [1]. But, failed tests looked irrelevant
to the feature to me at present.
While waiting for another test execution of cfbot, I'd like to check the detailed
reason and update the patch if necessary.
I have investigated the failure, and it appears to have been caused by VACUUM FREEZE.
The following lines were copied from the server log.
```
2022-12-23 08:50:20.175 UTC [34653][postmaster] LOG: server process (PID 37171) was terminated by signal 6: Abort trap
2022-12-23 08:50:20.175 UTC [34653][postmaster] DETAIL: Failed process was running: VACUUM FREEZE tab_freeze;
2022-12-23 08:50:20.175 UTC [34653][postmaster] LOG: terminating any other active server processes
```
The same error has been raised in other threads [1], so we have concluded that it is not related to the patch.
The report was raised in another thread [2].
[1]: https://cirrus-ci.com/task/5630405437554688
[2]: /messages/by-id/TYAPR01MB5866B24104FD80B5D7E65C3EF5ED9@TYAPR01MB5866.jpnprd01.prod.outlook.com
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Dear Dilip,
Thanks for reviewing our patch! PSA new version patch set.
Again, 0001 was not written by us; it is taken from [1].
I have done some review for the patch and I have a few comments.
1.
A.
+ <literal>wal_sender_timeout</literal> on the publisher. Otherwise, the
+ walsender repeatedly terminates due to timeout during the delay of
+ the subscriber.
B.
+/*
+ * In order to avoid walsender's timeout during time delayed replication,
+ * it's necessaary to keep sending feedbacks during the delay from the worker
+ * process. Meanwhile, the feature delays the apply before starting the
+ * transaction and thus we don't write WALs for the suspended changes during
+ * the wait. Hence, in the case the worker process sends a feedback during the
+ * delay, avoid having positions of the flushed and apply LSN overwritten by
+ * the latest LSN.
+ */
- Seems like these two statements are conflicting, I mean if we are
sending feedback then why the walsender will timeout?
A timeout can still occur because the interval between feedback
messages may become longer than wal_sender_timeout. Reworded and added descriptions.
- Typo /necessaary/necessary
Fixed.
2.
+ *
+ * During the time delayed replication, avoid reporting the suspeended
+ * latest LSN are already flushed and written, to the publisher.
 */
Typo /suspeended/suspended
Fixed.
3.
+ if (wal_receiver_status_interval > 0
+     && diffms > wal_receiver_status_interval)
+ {
+     WaitLatch(MyLatch,
+               WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+               (long) wal_receiver_status_interval,
+               WAIT_EVENT_RECOVERY_APPLY_DELAY);
+     send_feedback(last_received, true, false);
+ }
+ else
+     WaitLatch(MyLatch,
+               WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+               diffms,
+               WAIT_EVENT_RECOVERY_APPLY_DELAY);
I think here we should add some comments to explain about sending
feedback, something like what we have explained at the time of
defining the "in_delaying_apply" variable.
Added.
4.
+ * Although the delay is applied in BEGIN messages, streamed transactions
+ * apply the delay in a STREAM COMMIT message. That's ok because no
+ * changes have been applied yet (apply_spooled_messages() will do it).
+ * The STREAM START message would be a natural choice for this delay but
+ * there is no commit time yet (it will be available when the in-progress
+ * transaction finishes), hence, it was not possible to apply a delay at
+ * that time.
+ */
+ maybe_delay_apply(commit_data.committime);
I am wondering how this will interact with the parallel apply worker
where we do not spool the data in file? How are we going to get the
commit time of the transaction without applying the changes?
We think that parallel apply workers should not delay the application of changes, because if
they delay transactions before committing, they may hold locks for a very long time.
5.
+ /*
+ * The following operations use these special functions to detect
+ * overflow. Number of ms per informed days.
+ */
This comment doesn't make much sense, I think this needs to be rephrased.
Changed to simpler expression.
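As an illustration of the answer to comment 1, the condition under which feedback during the delay keeps the walsender alive can be sketched as below. This is a standalone sketch, not code from the patch; the function name is hypothetical. It relies on wal_receiver_status_interval being expressed in seconds and wal_sender_timeout in milliseconds, with zero disabling each mechanism.

```c
#include <stdbool.h>

/*
 * status_interval_s: wal_receiver_status_interval on the subscriber (seconds)
 * sender_timeout_ms: wal_sender_timeout on the publisher (milliseconds)
 *
 * Returns true when feedback sent once per status interval arrives more
 * often than the publisher's timeout, or when the timeout is disabled.
 */
bool
feedback_outpaces_sender_timeout(int status_interval_s, int sender_timeout_ms)
{
    if (sender_timeout_ms <= 0)
        return true;            /* timeout disabled on the publisher */
    if (status_interval_s <= 0)
        return false;           /* no feedback is sent during the delay */
    return (long long) status_interval_s * 1000 < sender_timeout_ms;
}
```

For example, a 10 second status interval against a 60000 ms timeout is fine, while a 120 second interval against the same timeout would let the walsender time out during a long delay.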
We have also fixed the wrong usage of wal_receiver_status_interval: the value
must be converted from seconds to milliseconds before it is passed to WaitLatch().
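The unit fix can be sketched as a small helper that picks the next wait slice in milliseconds. This is a simplified standalone version of the logic in maybe_delay_apply(); the helper name `next_wait_ms` is hypothetical and does not appear in the patch.

```c
/*
 * remaining_ms:      time left before the delayed transaction may be applied
 * status_interval_s: wal_receiver_status_interval, which is in seconds
 *
 * Returns how long to sleep, in milliseconds, before either sending
 * feedback or rechecking the remaining delay.
 */
long
next_wait_ms(long remaining_ms, int status_interval_s)
{
    long        interval_ms = (long) status_interval_s * 1000L;

    if (remaining_ms <= 0)
        return 0;               /* already past the apply time */
    if (status_interval_s > 0 && remaining_ms > interval_ms)
        return interval_ms;     /* wake up early to send feedback */
    return remaining_ms;
}
```

Without the multiplication by 1000, a 10 second status interval would be treated as 10 ms, so the worker would wake up and send feedback far more often than intended.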
Note that more than half of the modifications are done by Osumi-san.
[1]: /messages/by-id/20221215224721.GA694065@nathanxps13
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Attachments:
v12-0001-wake-up-logical-workers-as-needed-instead-of-rel.patch
From dff3606ccef1b57982ace383b7df427a2d75af65 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Mon, 21 Nov 2022 16:01:01 -0800
Subject: [PATCH v12 1/2] wake up logical workers as needed instead of relying
on periodic wakeups
---
src/backend/access/transam/xact.c | 3 ++
src/backend/commands/alter.c | 7 ++++
src/backend/commands/subscriptioncmds.c | 4 ++
src/backend/replication/logical/tablesync.c | 10 +++++
src/backend/replication/logical/worker.c | 46 +++++++++++++++++++++
src/include/replication/logicalworker.h | 3 ++
6 files changed, 73 insertions(+)
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index b7c7fd9f00..70ad51c591 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -47,6 +47,7 @@
#include "pgstat.h"
#include "replication/logical.h"
#include "replication/logicallauncher.h"
+#include "replication/logicalworker.h"
#include "replication/origin.h"
#include "replication/snapbuild.h"
#include "replication/syncrep.h"
@@ -2360,6 +2361,7 @@ CommitTransaction(void)
AtEOXact_PgStat(true, is_parallel_worker);
AtEOXact_Snapshot(true, false);
AtEOXact_ApplyLauncher(true);
+ AtEOXact_LogicalRepWorkers(true);
pgstat_report_xact_timestamp(0);
CurrentResourceOwner = NULL;
@@ -2860,6 +2862,7 @@ AbortTransaction(void)
AtEOXact_HashTables(false);
AtEOXact_PgStat(false, is_parallel_worker);
AtEOXact_ApplyLauncher(false);
+ AtEOXact_LogicalRepWorkers(false);
pgstat_report_xact_timestamp(0);
}
diff --git a/src/backend/commands/alter.c b/src/backend/commands/alter.c
index 10b6fe19a2..d095cd3ced 100644
--- a/src/backend/commands/alter.c
+++ b/src/backend/commands/alter.c
@@ -59,6 +59,7 @@
#include "commands/user.h"
#include "miscadmin.h"
#include "parser/parse_func.h"
+#include "replication/logicalworker.h"
#include "rewrite/rewriteDefine.h"
#include "tcop/utility.h"
#include "utils/builtins.h"
@@ -279,6 +280,12 @@ AlterObjectRename_internal(Relation rel, Oid objectId, const char *new_name)
if (strncmp(new_name, "regress_", 8) != 0)
elog(WARNING, "subscriptions created by regression test cases should have names starting with \"regress_\"");
#endif
+
+ /*
+ * Wake up the logical replication workers to handle this change
+ * quickly.
+ */
+ LogicalRepWorkersWakeupAtCommit(objectId);
}
else if (nameCacheId >= 0)
{
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index d673557ea4..d6993c26e5 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -34,6 +34,7 @@
#include "nodes/makefuncs.h"
#include "pgstat.h"
#include "replication/logicallauncher.h"
+#include "replication/logicalworker.h"
#include "replication/origin.h"
#include "replication/slot.h"
#include "replication/walreceiver.h"
@@ -1362,6 +1363,9 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
InvokeObjectPostAlterHook(SubscriptionRelationId, subid, 0);
+ /* Wake up the logical replication workers to handle this change quickly. */
+ LogicalRepWorkersWakeupAtCommit(subid);
+
return myself;
}
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 94e813ac53..509fe2eb19 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -105,6 +105,7 @@
#include "pgstat.h"
#include "replication/logicallauncher.h"
#include "replication/logicalrelation.h"
+#include "replication/logicalworker.h"
#include "replication/walreceiver.h"
#include "replication/worker_internal.h"
#include "replication/slot.h"
@@ -619,6 +620,15 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
if (started_tx)
{
+ /*
+ * If we are ready to enable two_phase mode, wake up the logical
+ * replication workers to handle this change quickly.
+ */
+ CommandCounterIncrement();
+ if (MySubscription->twophasestate == LOGICALREP_TWOPHASE_STATE_PENDING &&
+ AllTablesyncsReady())
+ LogicalRepWorkersWakeupAtCommit(MyLogicalRepWorker->subid);
+
CommitTransactionCommand();
pgstat_report_stat(true);
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 96772e4d73..722f796c7a 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -254,6 +254,8 @@ WalReceiverConn *LogRepWorkerWalRcvConn = NULL;
Subscription *MySubscription = NULL;
static bool MySubscriptionValid = false;
+static List *on_commit_wakeup_workers_subids = NIL;
+
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
@@ -4097,3 +4099,47 @@ reset_apply_error_context_info(void)
apply_error_callback_arg.remote_attnum = -1;
set_apply_error_context_xact(InvalidTransactionId, InvalidXLogRecPtr);
}
+
+/*
+ * Wakeup the stored subscriptions' workers on commit if requested.
+ */
+void
+AtEOXact_LogicalRepWorkers(bool isCommit)
+{
+ if (isCommit && on_commit_wakeup_workers_subids != NIL)
+ {
+ ListCell *subid;
+
+ LWLockAcquire(LogicalRepWorkerLock, LW_SHARED);
+ foreach(subid, on_commit_wakeup_workers_subids)
+ {
+ List *workers;
+ ListCell *worker;
+
+ workers = logicalrep_workers_find(lfirst_oid(subid), true);
+ foreach(worker, workers)
+ logicalrep_worker_wakeup_ptr((LogicalRepWorker *) lfirst(worker));
+ }
+ LWLockRelease(LogicalRepWorkerLock);
+ }
+
+ on_commit_wakeup_workers_subids = NIL;
+}
+
+/*
+ * Request wakeup of the workers for the given subscription ID on commit of the
+ * transaction.
+ *
+ * This is used to ensure that the workers process assorted changes as soon as
+ * possible.
+ */
+void
+LogicalRepWorkersWakeupAtCommit(Oid subid)
+{
+ MemoryContext oldcxt;
+
+ oldcxt = MemoryContextSwitchTo(TopTransactionContext);
+ on_commit_wakeup_workers_subids = list_append_unique_oid(on_commit_wakeup_workers_subids,
+ subid);
+ MemoryContextSwitchTo(oldcxt);
+}
diff --git a/src/include/replication/logicalworker.h b/src/include/replication/logicalworker.h
index cd1b6e8afc..2c2340d758 100644
--- a/src/include/replication/logicalworker.h
+++ b/src/include/replication/logicalworker.h
@@ -16,4 +16,7 @@ extern void ApplyWorkerMain(Datum main_arg);
extern bool IsLogicalWorker(void);
+extern void LogicalRepWorkersWakeupAtCommit(Oid subid);
+extern void AtEOXact_LogicalRepWorkers(bool isCommit);
+
#endif /* LOGICALWORKER_H */
--
2.27.0
v12-0002-Time-delayed-logical-replication-subscriber.patch
From ff69380afcd674511e8f636ffce58ce714a6bb4c Mon Sep 17 00:00:00 2001
From: Takamichi Osumi <osumi.takamichi@fujitsu.com>
Date: Thu, 22 Dec 2022 03:08:21 +0000
Subject: [PATCH v12 2/2] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction commit for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs only on WAL records for transaction begins. The main
reason is to avoid keeping a transaction open for a long time. Regular
and prepared transactions are covered. Streamed transactions are also
covered.
Author: Euler Taveira
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 9 ++
doc/src/sgml/config.sgml | 12 ++
doc/src/sgml/logical-replication.sgml | 7 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 54 ++++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 55 ++++++-
src/backend/replication/logical/worker.c | 160 ++++++++++++++++++-
src/backend/utils/adt/timestamp.c | 29 ++++
src/bin/pg_dump/pg_dump.c | 16 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/datatype/timestamp.h | 2 +
src/include/utils/timestamp.h | 2 +
src/test/regress/expected/subscription.out | 176 +++++++++++++--------
src/test/regress/sql/subscription.sql | 25 +++
src/test/subscription/meson.build | 1 +
src/test/subscription/t/032_apply_delay.pl | 151 ++++++++++++++++++
21 files changed, 641 insertions(+), 88 deletions(-)
create mode 100644 src/test/subscription/t/032_apply_delay.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 9316b811ac..579b12f357 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7873,6 +7873,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int8</type>
+ </para>
+ <para>
+ The length of time (ms) to delay the application of changes.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 3071c8eace..91ba288d46 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4753,6 +4753,18 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
the <filename>postgresql.conf</filename> file or on the server
command line.
</para>
+ <para>
+ For time delayed logical replication, the apply worker sends a Standby
+ Status Update message to the corresponding publisher per the indicated
+ time of this parameter. Therefore, if this parameter is longer than
+ <literal>wal_sender_timeout</literal> on the publisher, then the
+ walsender doesn't get any update message during the delay and repeatedly
+ terminates due to the timeout errors. Hence, make sure this parameter
+ shorter than the <literal>wal_sender_timeout</literal> of the publisher.
+ If this parameter is set to zero with time delayed replication, the
+ apply worker doesn't send any feedback messages during the
+ <literal>min_apply_delay</literal>.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 7b9bb00e5a..ae4c6b2661 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,13 @@
target table.
</para>
+ <para>
+ The subscriber replication can be instructed to lag behind the publisher
+ side changes by specifying the <literal>min_apply_delay</literal>
+ subscription parameter. See <xref linkend="sql-createsubscription"/> for
+ details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 1e8d72062b..d63aff1b90 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index f9a1776380..5b8df483c7 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -333,7 +333,47 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. As
+ with the physical replication feature
+ (<xref linkend="guc-recovery-min-apply-delay"/>), it can be useful to
+ have a time-delayed logical replica. This parameter lets the user
+ delay the application of changes by a specified amount of time. If this
+ value is specified without units, it is taken as milliseconds. The
+ default is zero (no delay).
+ </para>
+ <para>
+ The delay occurs only on WAL records for transaction begins and after
+ the initial table synchronization. It is possible that the
+ replication delay between publisher and subscriber exceeds the value
+ of this parameter, in which case no delay is added. Note that the
+ delay is calculated between the WAL time stamp as written on
+ publisher and the current time on the subscriber. Time spent in logical
+ decoding and in transferring the transaction may reduce the actual wait
+ time. If the system clocks on publisher and subscriber are not
+ synchronized, this may lead to apply changes earlier than expected,
+ but this is not a major issue because this parameter is typically much
+ larger than the time deviations between servers. Note that if this
+ parameter is set to a long delay, the replication will stop if the
+ replication slot falls behind the current LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication can mean there is a much longer time between making
+ a change on the publisher, and that change being committed on the subscriber.
+ This can have a big impact on synchronous replication.
+ See <xref linkend="guc-synchronous-commit"/>.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -456,6 +496,18 @@ CREATE SUBSCRIPTION mysub
PUBLICATION insert_only
WITH (enabled = false);
</programlisting></para>
+
+ <para>
+ Create a subscription to a remote server that replicates tables in
+ the <literal>mypub</literal> publication and starts replicating immediately
+ on commit. Pre-existing data is not copied. The application of changes is
+ delayed by 4 hours.
+<programlisting>
+CREATE SUBSCRIPTION mysub
+ CONNECTION 'host=192.168.1.50 port=5432 user=foo dbname=foodb'
+ PUBLICATION mypub
+ WITH (copy_data = false, min_apply_delay = '4h');
+</programlisting></para>
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a506fc3ec8..b29ed67715 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -64,6 +64,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->oid = subid;
sub->dbid = subform->subdbid;
sub->skiplsn = subform->subskiplsn;
+ sub->minapplydelay = subform->subminapplydelay;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
sub->enabled = subform->subenabled;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 2d8104b090..85aa1bd6f5 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1299,9 +1299,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index d6993c26e5..fdbb9dc12c 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -48,6 +48,7 @@
#include "utils/memutils.h"
#include "utils/pg_lsn.h"
#include "utils/syscache.h"
+#include "utils/timestamp.h"
/*
* Options that can be specified by the user in CREATE/ALTER SUBSCRIPTION
@@ -66,6 +67,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +92,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int64 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -146,6 +149,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +329,43 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ char *val,
+ *tmp;
+ Interval *interval;
+ int64 ms;
+
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ tmp = defGetString(defel);
+
+ /*
+ * If no unit was specified, then explicitly add 'ms' otherwise
+ * the interval_in function would assume 'seconds'.
+ */
+ if (strspn(tmp, "0123456789") == strlen(tmp))
+ val = psprintf("%sms", tmp);
+ else
+ val = tmp;
+
+ interval = DatumGetIntervalP(DirectFunctionCall3(interval_in,
+ CStringGetDatum(val),
+ ObjectIdGetDatum(InvalidOid),
+ Int32GetDatum(-1)));
+
+ ms = interval2ms(interval);
+ if (ms < 0 || ms > INT_MAX)
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("%lld ms is outside the valid range for parameter \"%s\"",
+ (long long) ms, "min_apply_delay"));
+
+ opts->min_apply_delay = ms;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -560,7 +602,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -625,6 +668,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int64GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1054,7 +1098,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1111,6 +1155,13 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int64GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 722f796c7a..c04e3d90a1 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -259,6 +259,18 @@ static List *on_commit_wakeup_workers_subids = NIL;
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
+/*
+ * In order to avoid walsender's timeout during time delayed replication,
+ * it's necessary to keep sending feedbacks during the delay from the worker
+ * process. Meanwhile, the feature delays the apply before starting the
+ * transaction and thus we don't write WALs for the suspended changes during
+ * the wait. Hence, in the case the worker process sends a feedback during the
+ * delay, avoid having positions of the flushed and apply LSN overwritten by
+ * the latest LSN.
+ */
+static bool in_delaying_apply = false;
+static XLogRecPtr last_received = InvalidXLogRecPtr;
+
/* fields valid only when processing streamed transaction */
static bool in_streamed_transaction = false;
@@ -328,6 +340,8 @@ static void maybe_reread_subscription(void);
static void DisableSubscriptionAndExit(void);
+static void maybe_delay_apply(TimestampTz ts);
+
/* prototype needed because of stream_commit */
static void apply_dispatch(StringInfo s);
@@ -833,6 +847,106 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When min_apply_delay parameter is set on the subscriber, we wait long enough
+ * to make sure a transaction is applied at least that interval behind the
+ * publisher.
+ *
+ * While the physical replication applies the delay at commit time, this
+ * feature applies the delay for the next transaction but before starting the
+ * transaction. This is mainly because keeping a transaction that conducted
+ * write operations open for a long time results in some issues such as bloat
+ * and locks.
+ *
+ * The min_apply_delay parameter will take effect only after all tables are in
+ * READY state.
+ */
+static void
+maybe_delay_apply(TimestampTz ts)
+{
+ /* Nothing to do if no delay set */
+ if (MySubscription->minapplydelay <= 0)
+ return;
+
+ /*
+ * The min_apply_delay parameter is ignored until all tablesync workers
+ * have reached READY state. If we allow the delay during the catchup
+ * phase, once we reach the limit of tablesync workers, it will impose a
+ * delay for each subsequent worker. It means it will take a long time to
+ * finish the initial table synchronization.
+ */
+ if (!AllTablesyncsReady())
+ return;
+
+ /*
+ * Suppress overwrites of flushed and written positions by the latest
+ * LSN in send_feedback().
+ */
+ in_delaying_apply = true;
+
+ while (true)
+ {
+ long diffms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /*
+ * Before calculating the time duration, reload the catalog if needed.
+ */
+ if (!in_remote_transaction && !in_streamed_transaction)
+ {
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+ }
+
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(),
+ TimestampTzPlusMilliseconds(ts, MySubscription->minapplydelay));
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "logical replication apply delay: %ld ms", diffms);
+
+ /*
+ * Call send_feedback() to prevent the publisher from exiting by
+ * timeout during the delay, when wal_receiver_status_interval is
+ * available. The WAL for this delayed transaction is neither
+ * written nor flushed yet. Thus, we don't make the latest LSN
+ * overwrite those positions of the update message for this delay.
+ *
+ * See send_feedback() also.
+ */
+ if (wal_receiver_status_interval > 0
+ && diffms > wal_receiver_status_interval * 1000)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ (long) wal_receiver_status_interval * 1000,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(last_received, true, false);
+ }
+ else
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ }
+
+ in_delaying_apply = false;
+}
+
/*
* Handle BEGIN message.
*/
@@ -844,6 +958,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ maybe_delay_apply(begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -898,6 +1015,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ maybe_delay_apply(begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1120,6 +1240,19 @@ apply_handle_stream_prepare(StringInfo s)
elog(DEBUG1, "received prepare for streamed transaction %u", prepare_data.xid);
+ /*
+ * Should we delay the current prepared transaction?
+ *
+ * Although the delay is applied in BEGIN PREPARE messages, streamed
+ * prepared transactions apply the delay in a STREAM PREPARE message.
+ * That's ok because no changes have been applied yet
+ * (apply_spooled_messages() will do it). The STREAM START message does
+ * not contain a prepare time (it will be available when the in-progress
+ * prepared transaction finishes), hence, it was not possible to apply a
+ * delay at that time.
+ */
+ maybe_delay_apply(prepare_data.prepare_time);
+
/* Replay all the spooled operations. */
apply_spooled_messages(prepare_data.xid, prepare_data.prepare_lsn);
@@ -1511,6 +1644,19 @@ apply_handle_stream_commit(StringInfo s)
elog(DEBUG1, "received commit for streamed transaction %u", xid);
+ /*
+ * Should we delay the current transaction?
+ *
+ * Although the delay is applied in BEGIN messages, streamed transactions
+ * apply the delay in a STREAM COMMIT message. That's ok because no
+ * changes have been applied yet (apply_spooled_messages() will do it).
+ * The STREAM START message would be a natural choice for this delay but
+ * there is no commit time yet (it will be available when the in-progress
+ * transaction finishes), hence, it was not possible to apply a delay at
+ * that time.
+ */
+ maybe_delay_apply(commit_data.committime);
+
apply_spooled_messages(xid, commit_data.commit_lsn);
apply_handle_commit_internal(&commit_data);
@@ -2713,7 +2859,7 @@ UpdateWorkerStats(XLogRecPtr last_lsn, TimestampTz send_time, bool reply)
* Apply main loop.
*/
static void
-LogicalRepApplyLoop(XLogRecPtr last_received)
+LogicalRepApplyLoop(void)
{
TimestampTz last_recv_timestamp = GetCurrentTimestamp();
bool ping_sent = false;
@@ -3002,8 +3148,11 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * During time-delayed replication, avoid reporting to the publisher that
+ * the latest LSN, whose apply is suspended, is already written and flushed.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && !in_delaying_apply)
flushpos = writepos = recvpos;
if (writepos < last_writepos)
@@ -3589,11 +3738,11 @@ start_table_sync(XLogRecPtr *origin_startpos, char **myslotname)
* of system resource error and are not repeatable.
*/
static void
-start_apply(XLogRecPtr origin_startpos)
+start_apply(void)
{
PG_TRY();
{
- LogicalRepApplyLoop(origin_startpos);
+ LogicalRepApplyLoop();
}
PG_CATCH();
{
@@ -3851,7 +4000,8 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- start_apply(origin_startpos);
+ last_received = origin_startpos;
+ start_apply();
proc_exit(0);
}
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index 3f2508c0c4..c90147bb72 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -2437,6 +2437,35 @@ interval_cmp_internal(const Interval *interval1, const Interval *interval2)
return int128_compare(span1, span2);
}
+/*
+ * Returns the number of milliseconds in the specified Interval.
+ */
+int64
+interval2ms(const Interval *interval)
+{
+ int64 days;
+ int64 ms;
+ int64 result;
+
+ days = interval->month * INT64CONST(30);
+ days += interval->day;
+
+ /* Detect whether the value of interval can cause an overflow. */
+ if (pg_mul_s64_overflow(days, MSECS_PER_DAY, &result))
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+
+ /* Add the time portion (in ms) to the previous result. */
+ ms = interval->time / INT64CONST(1000);
+ if (pg_add_s64_overflow(result, ms, &result))
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+
+ return result;
+}
+
Datum
interval_eq(PG_FUNCTION_ARGS)
{
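Reviewer note on interval2ms() above: a standalone sketch of the same conversion may make the arithmetic easier to check. It treats a month as 30 days, multiplies the day count by MSECS_PER_DAY, and adds the microsecond time field divided by 1000, reporting overflow at each step. The DemoInterval struct and demo_interval_to_ms() are hypothetical stand-ins rather than PostgreSQL's Interval type or API, and the GCC/Clang overflow builtins are assumed here in place of pg_mul_s64_overflow() and pg_add_s64_overflow().

```c
#include <stdbool.h>
#include <stdint.h>

#define MSECS_PER_DAY INT64_C(86400000)

/* Hypothetical stand-in for Interval: microseconds, days, months. */
typedef struct
{
    int64_t time_usec;
    int32_t day;
    int32_t month;
} DemoInterval;

/*
 * Convert an interval to milliseconds, counting a month as 30 days.
 * Returns false if the result would overflow int64_t; otherwise stores
 * the result in *out and returns true.
 */
static bool
demo_interval_to_ms(const DemoInterval *iv, int64_t *out)
{
    int64_t days = (int64_t) iv->month * 30 + iv->day;
    int64_t result;

    /* days * MSECS_PER_DAY, with overflow detection (GCC/Clang builtin) */
    if (__builtin_mul_overflow(days, MSECS_PER_DAY, &result))
        return false;

    /* add the sub-day portion, converted from microseconds to ms */
    if (__builtin_add_overflow(result, iv->time_usec / 1000, &result))
        return false;

    *out = result;
    return true;
}
```

Unlike the patch, which raises an ERROR via ereport() on overflow, this sketch returns a flag so that it can be compiled outside the backend.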
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 44d957c038..31c4d57764 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4546,9 +4547,14 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ {
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBufferStr(query, " 0 AS subminapplydelay\n");
+ }
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4582,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4613,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminapplydelay =
+ strtoi64(PQgetvalue(res, i, i_subminapplydelay), NULL, 10);
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4685,6 +4694,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '" INT64_FORMAT " ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 436ac5bb98..175b4a72a4 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int64 subminapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index df166365e8..6512d6059a 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6474,7 +6474,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6516,10 +6516,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min apply delay (ms)"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 2a3921937c..df2879cbf3 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1880,7 +1880,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3202,7 +3202,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 7b98714f30..3ff890f897 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -70,6 +70,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
XLogRecPtr subskiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 subminapplydelay; /* Replication apply delay */
+
NameData subname; /* Name of the subscription */
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
@@ -119,6 +121,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 minapplydelay; /* Replication apply delay */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
bool enabled; /* Indicates if the subscription is enabled */
diff --git a/src/include/datatype/timestamp.h b/src/include/datatype/timestamp.h
index d155f1b03b..d5bbfad1c4 100644
--- a/src/include/datatype/timestamp.h
+++ b/src/include/datatype/timestamp.h
@@ -127,6 +127,8 @@ struct pg_itm_in
#define SECS_PER_MINUTE 60
#define MINS_PER_HOUR 60
+#define MSECS_PER_DAY INT64CONST(86400000)
+
#define USECS_PER_DAY INT64CONST(86400000000)
#define USECS_PER_HOUR INT64CONST(3600000000)
#define USECS_PER_MINUTE INT64CONST(60000000)
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index 7fd0b58825..1d6079057e 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -102,6 +102,8 @@ extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern int64 interval2ms(const Interval *interval);
+
extern TimestampTz time_t_to_timestamptz(pg_time_t tm);
extern pg_time_t timestamptz_to_time_t(TimestampTz t);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index c13d218dcf..69a5193aa5 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -135,10 +135,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -155,10 +155,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -167,10 +167,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -202,10 +202,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | f | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -239,19 +239,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -263,19 +263,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -290,10 +290,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more then once
@@ -308,10 +308,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -347,10 +347,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -359,10 +359,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -372,10 +372,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | t | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,18 +388,58 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid input syntax for type interval: "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1000 ms is outside the valid range for parameter "min_apply_delay"
+-- success -- 123 ms
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- success -- 86400000 ms
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 86400000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- success -- interval is converted into ms and stored as integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = '4h 27min 35s');
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | f | d | f | any | 16055000 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index eaeade8cce..7a4e818857 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -275,6 +275,31 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- success -- 123 ms
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+
+\dRs+
+
+-- success -- 86400000 ms
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1d');
+
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
+-- success -- interval is converted into ms and stored as integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = '4h 27min 35s');
+
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/meson.build b/src/test/subscription/meson.build
index c28121f26e..f136f87537 100644
--- a/src/test/subscription/meson.build
+++ b/src/test/subscription/meson.build
@@ -38,6 +38,7 @@ tests += {
't/029_on_error.pl',
't/030_origin.pl',
't/031_column_list.pl',
+ 't/032_apply_delay.pl',
't/100_bugs.pl',
],
},
diff --git a/src/test/subscription/t/032_apply_delay.pl b/src/test/subscription/t/032_apply_delay.pl
new file mode 100644
index 0000000000..8f8ce23f1b
--- /dev/null
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,151 @@
+
+# Copyright (c) 2022, PostgreSQL Global Development Group
+
+# Test replication apply delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf('postgresql.conf',
+ "log_min_messages = debug2");
+$node_subscriber->start;
+
+# Create some preexisting content on publisher.
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b varchar, c timestamptz DEFAULT now())");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (1, 'foo'), (2, 'bar')");
+
+# Setup structure on subscriber.
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b text, c timestamptz DEFAULT now(), d bigint DEFAULT 999)"
+);
+
+# Setup logical replication.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+
+# column c must not be published because we want to compare the time difference.
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE test_tab (a, b)");
+
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = on, min_apply_delay = '3s')"
+);
+
+# Wait for initial table sync to finish.
+$node_subscriber->wait_for_subscription_sync($node_publisher, $appname);
+
+# Check log starting now for logical replication apply delay.
+my $log_location = -s $node_subscriber->logfile;
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(2|1|2), 'check initial data was copied to subscriber');
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (3, 'baz')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (4, 'abc')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (5, 'def')");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(5|1|5), 'check if the new rows were applied to subscriber');
+
+check_apply_delay_log("logical replication apply delay");
+
+check_apply_delay_time('5', '3');
+
+# Test streamed transaction.
+# Insert, update and delete enough rows to exceed 64kB limit.
+$node_publisher->safe_psql(
+ 'postgres', q{
+BEGIN;
+INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(6, 5000) s(i);
+UPDATE test_tab SET b = md5(b) WHERE mod(a, 2) = 0;
+DELETE FROM test_tab WHERE mod(a, 3) = 0;
+COMMIT;
+});
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(3334|1|5000), 'check if the new rows were applied to subscriber');
+
+check_apply_delay_log("logical replication apply delay");
+
+check_apply_delay_time('5000', '3');
+
+# Test ALTER SUBSCRIPTION. Delay 86460 seconds (1 day 1 minute).
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86460000)"
+);
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (0, 'foobar')");
+
+# Disable subscription. worker should die immediately.
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub DISABLE;"
+);
+
+# Wait until worker dies.
+my $sub_query =
+ "SELECT count(1) = 0 FROM pg_stat_subscription WHERE subname = 'tap_sub' AND pid IS NOT NULL;";
+$node_subscriber->poll_query_until('postgres', $sub_query)
+ or die "Timed out while waiting for subscriber to die";
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
+
+sub check_apply_delay_log
+{
+ my $message = shift;
+ my $old_log_location = $log_location;
+
+ $log_location = $node_subscriber->wait_for_log(qr/$message/, $log_location);
+
+ cmp_ok($log_location, '>', $old_log_location,
+ "logfile contains triggered logical replication apply delay"
+ );
+}
+
+sub check_apply_delay_time
+{
+ my ($primary_key, $expected_diffs) = @_;
+
+ my $inserted_time_on_pub = $node_publisher->safe_psql('postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ my $inserted_time_on_sub = $node_subscriber->safe_psql('postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ cmp_ok($inserted_time_on_sub - $inserted_time_on_pub, '>', $expected_diffs,
+ "The tuple on the subscriber was modified later than the publisher");
+}
--
2.27.0
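For readers following the TAP test in the patch above, here is a minimal sketch in C of the arithmetic that such an apply delay implies. It assumes timestamps are plain microsecond counts and that the worker waits until the commit time plus min_apply_delay has passed. The name sketch_remaining_delay_ms is hypothetical and is not a function from the patch, and the round-up to whole milliseconds is an illustrative choice.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patch: return how many
 * milliseconds an apply worker would still have to wait before applying
 * a transaction. Timestamps are microseconds since an arbitrary epoch.
 */
static int64_t
sketch_remaining_delay_ms(int64_t commit_ts_us, int64_t now_us,
                          int64_t min_apply_delay_ms)
{
    int64_t apply_at_us;

    assert(min_apply_delay_ms >= 0);

    /* Earliest point in time at which the change may be applied. */
    apply_at_us = commit_ts_us + min_apply_delay_ms * 1000;

    /* Already late enough: no additional delay is added. */
    if (apply_at_us <= now_us)
        return 0;

    /* Round up so that the wait never ends early. */
    return (apply_at_us - now_us + 999) / 1000;
}
```

With a commit stamped at t = 0, a 3000 ms delay and a subscriber clock at t = 1 s, the helper returns 2000. Once the subscriber clock passes t = 3 s it returns 0, which corresponds to the documented case where replication already lags by more than min_apply_delay.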
On Tue, 27 Dec 2022 at 14:59, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Note that more than half of the modifications are done by Osumi-san.
1) This global variable can be removed as it is used only in
send_feedback which is called from maybe_delay_apply so we could pass
it as a function argument:
+ * delay, avoid having positions of the flushed and apply LSN overwritten by
+ * the latest LSN.
+ */
+static bool in_delaying_apply = false;
+static XLogRecPtr last_received = InvalidXLogRecPtr;
+
2) -1 gets converted to -1000
+int64
+interval2ms(const Interval *interval)
+{
+ int64 days;
+ int64 ms;
+ int64 result;
+
+ days = interval->month * INT64CONST(30);
+ days += interval->day;
+
+ /* Detect whether the value of interval can cause an overflow. */
+ if (pg_mul_s64_overflow(days, MSECS_PER_DAY, &result))
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+
+ /* Adds portion time (in ms) to the previous result. */
+ ms = interval->time / INT64CONST(1000);
+ if (pg_add_s64_overflow(result, ms, &result))
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
create subscription sub7 connection 'dbname=regression host=localhost
port=5432' publication pub1 with (min_apply_delay = '-1');
ERROR: -1000 ms is outside the valid range for parameter "min_apply_delay"
3) This can be slightly reworded:
+ <para>
+ The length of time (ms) to delay the application of changes.
+ </para></entry>
to:
Delay applying the changes by a specified amount of time(ms).
4) maybe_delay_apply can be moved from apply_handle_stream_prepare to
apply_spooled_messages so that it is consistent with
maybe_start_skipping_changes:
@@ -1120,6 +1240,19 @@ apply_handle_stream_prepare(StringInfo s)
elog(DEBUG1, "received prepare for streamed transaction %u",
prepare_data.xid);
+ /*
+ * Should we delay the current prepared transaction?
+ *
+ * Although the delay is applied in BEGIN PREPARE messages, streamed
+ * prepared transactions apply the delay in a STREAM PREPARE message.
+ * That's ok because no changes have been applied yet
+ * (apply_spooled_messages() will do it). The STREAM START message does
+ * not contain a prepare time (it will be available when the in-progress
+ * prepared transaction finishes), hence, it was not possible to apply a
+ * delay at that time.
+ */
+ maybe_delay_apply(prepare_data.prepare_time);
That way the call from apply_handle_stream_commit can also be removed.
5) typo transfering should be transferring
+ publisher and the current time on the subscriber. Time spent in logical
+ decoding and in transfering the transaction may reduce the actual wait
+ time. If the system clocks on publisher and subscriber are not
6) feedbacks can be changed to feedback messages
+ * it's necessary to keep sending feedbacks during the delay from the worker
+ * process. Meanwhile, the feature delays the apply before starting the
7)
+ /*
+ * Suppress overwrites of flushed and writtten positions by the lastest
+ * LSN in send_feedback().
+ */
7a) typo writtten should be written
7b) lastest should be latest
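On item 2, a small C sketch of how a unit-less value can end up scaled by 1000. It assumes the option parser appends "ms" only when the whole string consists of accepted characters, as the strspn() check in the v13 patch later in this thread does. Whether the earlier version really lacked '-' in that character set is an assumption, not something the thread confirms, and sketch_add_default_unit is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: append "ms" to the option value when it consists
 * only of characters from `accept`; otherwise copy it unchanged so that
 * an explicit unit such as "4h" survives.
 */
static void
sketch_add_default_unit(const char *value, const char *accept,
                        char *out, size_t outlen)
{
    assert(value != NULL && accept != NULL && out != NULL && outlen > 0);

    if (strspn(value, accept) == strlen(value))
        snprintf(out, outlen, "%sms", value);
    else
        snprintf(out, outlen, "%s", value);
}
```

With the accept set "0123456789 ", the input "-1" is passed through without a unit, and an interval parser that defaults to seconds would then read it as -1 s, that is -1000 ms. With "-0123456789 " the same input becomes "-1ms".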
Regards,
Vignesh
On Tue, 27 Dec 2022 at 14:59, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Note that more than half of the modifications are done by Osumi-san.
Please find a few minor comments.
1.
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(),
+
TimestampTzPlusMilliseconds(ts, MySubscription->minapplydelay));
on unix, above code looks unaligned (copied from unix)
2. same with:
+ interval = DatumGetIntervalP(DirectFunctionCall3(interval_in,
+
CStringGetDatum(val),
+
ObjectIdGetDatum(InvalidOid),
+
Int32GetDatum(-1)));
perhaps due to tabs?
2. comment not clear:
+ * During the time delayed replication, avoid reporting the suspended
+ * latest LSN are already flushed and written, to the publisher.
3.
+ * Call send_feedback() to prevent the publisher from exiting by
+ * timeout during the delay, when wal_receiver_status_interval is
+ * available. The WALs for this delayed transaction is neither
+ * written nor flushed yet, Thus, we don't make the latest LSN
+ * overwrite those positions of the update message for this delay.
yet, Thus, we --> yet, thus, we/ yet. Thus, we
4.
+ /* Adds portion time (in ms) to the previous result. */
+ ms = interval->time / INT64CONST(1000);
Is interval->time always in micro-seconds here?
Thanks
Shveta
On Tuesday, December 27, 2022 6:29 PM wrote:
Thanks for reviewing our patch! PSA new version patch set.
The patches now fail to apply to HEAD
because of recent commits (c6e1f62e2c and 216a784829c), as reported in [1].
I'll rebase the patch with other changes when I post a new version.
[1]: http://cfbot.cputube.org/patch_41_3581.log
Best Regards,
Takamichi Osumi
On Tuesday, January 3, 2023 4:01 PM vignesh C <vignesh21@gmail.com> wrote:
Hi, thanks for your review!
1) This global variable can be removed as it is used only in
send_feedback which is called from maybe_delay_apply so we could pass
it as a function argument:
+ * delay, avoid having positions of the flushed and apply LSN overwritten by
+ * the latest LSN.
+ */
+static bool in_delaying_apply = false;
+static XLogRecPtr last_received = InvalidXLogRecPtr;
+
I have removed the first variable and made it one of the arguments of send_feedback().
2) -1 gets converted to -1000
+int64
+interval2ms(const Interval *interval)
+{
+ int64 days;
+ int64 ms;
+ int64 result;
+
+ days = interval->month * INT64CONST(30);
+ days += interval->day;
+
+ /* Detect whether the value of interval can cause an overflow. */
+ if (pg_mul_s64_overflow(days, MSECS_PER_DAY, &result))
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+
+ /* Adds portion time (in ms) to the previous result. */
+ ms = interval->time / INT64CONST(1000);
+ if (pg_add_s64_overflow(result, ms, &result))
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
create subscription sub7 connection 'dbname=regression host=localhost
port=5432' publication pub1 with (min_apply_delay = '-1');
ERROR: -1000 ms is outside the valid range for parameter "min_apply_delay"
Good catch! Fixed so that the input '-1' is interpreted as -1 ms.
3) This can be slightly reworded:
+ <para>
+ The length of time (ms) to delay the application of changes.
+ </para></entry>
to:
Delay applying the changes by a specified amount of time(ms).
This wording was suggested in [1] by Peter Smith, so I'd like to keep the current patch's description.
I didn't change this.
4) maybe_delay_apply can be moved from apply_handle_stream_prepare to
apply_spooled_messages so that it is consistent with
maybe_start_skipping_changes:
@@ -1120,6 +1240,19 @@ apply_handle_stream_prepare(StringInfo s)
elog(DEBUG1, "received prepare for streamed transaction %u",
prepare_data.xid);
+ /*
+ * Should we delay the current prepared transaction?
+ *
+ * Although the delay is applied in BEGIN PREPARE messages, streamed
+ * prepared transactions apply the delay in a STREAM PREPARE message.
+ * That's ok because no changes have been applied yet
+ * (apply_spooled_messages() will do it). The STREAM START message does
+ * not contain a prepare time (it will be available when the in-progress
+ * prepared transaction finishes), hence, it was not possible to apply a
+ * delay at that time.
+ */
+ maybe_delay_apply(prepare_data.prepare_time);
That way the call from apply_handle_stream_commit can also be removed.
Sounds good. I moved the call of maybe_delay_apply() to apply_spooled_messages().
Now it's aligned with maybe_start_skipping_changes().
5) typo transfering should be transferring
+ publisher and the current time on the subscriber. Time spent in logical
+ decoding and in transfering the transaction may reduce the actual wait
+ time. If the system clocks on publisher and subscriber are not
Fixed.
6) feedbacks can be changed to feedback messages
+ * it's necessary to keep sending feedbacks during the delay from the worker
+ * process. Meanwhile, the feature delays the apply before starting the
Fixed.
7)
+ /*
+ * Suppress overwrites of flushed and writtten positions by the lastest
+ * LSN in send_feedback().
+ */
7a) typo writtten should be written
7b) lastest should be latest
I have removed this sentence. So, those typos are removed.
Please have a look at the updated patch.
[1]: /messages/by-id/CAHut+PttQdFMNM2c6WqKt2c9G6r3ZKYRGHm04RR-4p4fyA4WRg@mail.gmail.com
Best Regards,
Takamichi Osumi
Attachments:
v13-0001-Time-delayed-logical-replication-subscriber.patch (application/octet-stream)
From 92410c9772cf5919373faef36f454aca2fb47f3e Mon Sep 17 00:00:00 2001
From: Takamichi Osumi <osumi.takamichi@fujitsu.com>
Date: Tue, 10 Jan 2023 13:35:45 +0000
Subject: [PATCH v13] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction commit for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs only on WAL records for transaction begins. The main
reason is to avoid keeping a transaction open for a long time. Regular
and prepared transactions are covered. Streamed transactions are also
covered.
Author: Euler Taveira
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/config.sgml | 12 ++
doc/src/sgml/logical-replication.sgml | 7 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 59 ++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 88 +++++++-
.../replication/logical/applyparallelworker.c | 3 +-
src/backend/replication/logical/worker.c | 161 ++++++++++++--
src/backend/utils/adt/timestamp.c | 29 +++
src/bin/pg_dump/pg_dump.c | 16 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/datatype/timestamp.h | 2 +
src/include/replication/worker_internal.h | 2 +-
src/include/utils/timestamp.h | 2 +
src/test/regress/expected/subscription.out | 203 +++++++++++-------
src/test/regress/sql/subscription.sql | 38 ++++
src/test/subscription/meson.build | 1 +
src/test/subscription/t/032_apply_delay.pl | 151 +++++++++++++
23 files changed, 711 insertions(+), 102 deletions(-)
create mode 100644 src/test/subscription/t/032_apply_delay.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..bf3c05241c 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7873,6 +7873,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int8</type>
+ </para>
+ <para>
+ The length of time (ms) to delay the application of changes.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 2fec613484..0e68c3320d 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4753,6 +4753,18 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
the <filename>postgresql.conf</filename> file or on the server
command line.
</para>
+ <para>
+ For time delayed logical replication, the apply worker sends a Standby
+ Status Update message to the corresponding publisher per the indicated
+ time of this parameter. Therefore, if this parameter is longer than
+ <literal>wal_sender_timeout</literal> on the publisher, then the
+ walsender doesn't get any update message during the delay and repeatedly
+ terminates due to the timeout errors. Hence, make sure this parameter is
+ shorter than the <literal>wal_sender_timeout</literal> of the publisher.
+ If this parameter is set to zero with time delayed replication, the
+ apply worker doesn't send any feedback messages during the
+ <literal>min_apply_delay</literal>.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 54f48be87f..6407804547 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,13 @@
target table.
</para>
+ <para>
+ The subscriber replication can be instructed to lag behind the publisher
+ side changes by specifying the <literal>min_apply_delay</literal>
+ subscription parameter. See <xref linkend="sql-createsubscription"/> for
+ details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 1e8d72062b..d63aff1b90 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index eba72c6af6..d95ff8944e 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,47 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. As
+ with the physical replication feature
+ (<xref linkend="guc-recovery-min-apply-delay"/>), it can be useful to
+ have a time-delayed logical replica. This parameter lets the user
+ delay the application of changes by a specified amount of time. If this
+ value is specified without units, it is taken as milliseconds. The
+ default is zero (no delay).
+ </para>
+ <para>
+ The delay occurs only on WAL records for transaction begins and after
+ the initial table synchronization. It is possible that the
+ replication delay between publisher and subscriber exceeds the value
+ of this parameter, in which case no delay is added. Note that the
+ delay is calculated between the WAL time stamp as written on
+ publisher and the current time on the subscriber. Time spent in logical
+ decoding and in transferring the transaction may reduce the actual wait
+ time. If the system clocks on publisher and subscriber are not
+ synchronized, this may lead to apply changes earlier than expected,
+ but this is not a major issue because this parameter is typically much
+ larger than the time deviations between servers. Note that if this
+ parameter is set to a long delay, the replication will stop if the
+ replication slot falls behind the current LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication can mean there is a much longer time between making
+ a change on the publisher, and that change being committed on the subscriber.
+ This can have a big impact on synchronous replication.
+ See <xref linkend="guc-synchronous-commit"/>.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -413,6 +453,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ Setting streaming to <literal>parallel</literal> mode and <literal>min_apply_delay</literal>
+ at the same time is not supported.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
@@ -472,6 +517,18 @@ CREATE SUBSCRIPTION mysub
PUBLICATION insert_only
WITH (enabled = false);
</programlisting></para>
+
+ <para>
+ Create a subscription to a remote server that replicates tables in
+ the <literal>mypub</literal> publication and starts replicating immediately
+ on commit. Pre-existing data is not copied. The application of changes is
+ delayed by 4 hours.
+<programlisting>
+CREATE SUBSCRIPTION mysub
+ CONNECTION 'host=192.168.1.50 port=5432 user=foo dbname=foodb'
+ PUBLICATION mypub
+ WITH (copy_data = false, min_apply_delay = '4h');
+</programlisting></para>
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..c767cc1c3a 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -64,6 +64,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->oid = subid;
sub->dbid = subform->subdbid;
sub->skiplsn = subform->subskiplsn;
+ sub->minapplydelay = subform->subminapplydelay;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
sub->enabled = subform->subenabled;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 447c9b970f..4004fcd0c4 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1299,9 +1299,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index baff00dd74..a305edca39 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -48,6 +48,7 @@
#include "utils/memutils.h"
#include "utils/pg_lsn.h"
#include "utils/syscache.h"
+#include "utils/timestamp.h"
/*
* Options that can be specified by the user in CREATE/ALTER SUBSCRIPTION
@@ -66,6 +67,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +92,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int64 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -146,6 +149,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +329,43 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ char *val,
+ *tmp;
+ Interval *interval;
+ int64 ms;
+
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ tmp = defGetString(defel);
+
+ /*
+ * If no unit was specified, explicitly append 'ms'; otherwise the
+ * interval_in function would assume seconds.
+ */
+ if (strspn(tmp, "-0123456789 ") == strlen(tmp))
+ val = psprintf("%sms", tmp);
+ else
+ val = tmp;
+
+ interval = DatumGetIntervalP(DirectFunctionCall3(interval_in,
+ CStringGetDatum(val),
+ ObjectIdGetDatum(InvalidOid),
+ Int32GetDatum(-1)));
+
+ ms = interval2ms(interval);
+ if (ms < 0 || ms > INT_MAX)
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg(INT64_FORMAT " ms is outside the valid range for parameter \"%s\"",
+ ms, "min_apply_delay"));
+
+ opts->min_apply_delay = ms;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +446,19 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * Do additional checking for the disallowed combination when
+ * min_apply_delay is not zero.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0)
+ {
+ if (opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("min_apply_delay must not be set when streaming = parallel")));
+ }
}
/*
@@ -560,7 +615,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -625,6 +681,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int64GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1054,7 +1111,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1155,16 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * Test the combination of streaming mode and
+ * min_apply_delay
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
+ sub->minapplydelay > 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("min_apply_delay must not be set when streaming = parallel")));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1178,23 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * Test the combination of streaming mode and
+ * min_apply_delay
+ */
+ if (opts.min_apply_delay > 0 &&
+ sub->stream == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("min_apply_delay must not be set when streaming = parallel")));
+
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int64GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
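
The unit-defaulting step in parse_subscription_options() above can be tried outside the server. The following is a minimal sketch, not code from the patch: default_unit_to_ms() is a hypothetical helper name, and it uses malloc()/snprintf() where the patch uses psprintf(). It shows only the strspn() test that decides whether "ms" is appended.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): return a newly allocated copy
 * of "raw", with "ms" appended when the string consists only of digits, '-'
 * and spaces, i.e. when no unit was given. The caller frees the result.
 */
static char *
default_unit_to_ms(const char *raw)
{
    size_t len;
    char *out;

    assert(raw != NULL);
    len = strlen(raw);
    out = malloc(len + 3);      /* room for "ms" and the terminator */
    if (out == NULL)
        return NULL;

    /* strspn() returns the length of the prefix made only of these chars. */
    if (strspn(raw, "-0123456789 ") == len)
        snprintf(out, len + 3, "%sms", raw);
    else
        snprintf(out, len + 3, "%s", raw);

    return out;
}
```

With this sketch, an input such as "100" becomes "100ms", while "8h" is passed through unchanged and left for the interval parser to interpret.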
diff --git a/src/backend/replication/logical/applyparallelworker.c b/src/backend/replication/logical/applyparallelworker.c
index 2e5914d5d9..72e6f5ce84 100644
--- a/src/backend/replication/logical/applyparallelworker.c
+++ b/src/backend/replication/logical/applyparallelworker.c
@@ -704,7 +704,8 @@ pa_process_spooled_messages_if_required(void)
{
apply_spooled_messages(&MyParallelShared->fileset,
MyParallelShared->xid,
- InvalidXLogRecPtr);
+ InvalidXLogRecPtr,
+ 0);
pa_set_fileset_state(MyParallelShared, FS_EMPTY);
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 79cda39445..5d11cc5662 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -316,6 +316,17 @@ static List *on_commit_wakeup_workers_subids = NIL;
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
+/*
+ * To avoid a walsender timeout during time-delayed replication, the worker
+ * process must keep sending feedback messages during the delay. Meanwhile,
+ * the feature delays the apply before starting the transaction, so no WAL is
+ * written for the suspended changes during the wait. Hence, when the worker
+ * sends feedback during the delay, it must not overwrite the flushed and
+ * applied LSN positions with the latest received LSN. last_received is kept
+ * here so that maybe_delay_apply() can pass it to send_feedback().
+ */
+static XLogRecPtr last_received = InvalidXLogRecPtr;
+
/* fields valid only when processing streamed transaction */
static bool in_streamed_transaction = false;
@@ -386,10 +397,13 @@ static void stream_write_change(char action, StringInfo s);
static void stream_open_and_write_change(TransactionId xid, char action, StringInfo s);
static void stream_close_file(void);
-static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply);
+static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply,
+ bool in_delaying_apply);
static void DisableSubscriptionAndExit(void);
+static void maybe_delay_apply(TimestampTz ts);
+
static void apply_handle_commit_internal(LogicalRepCommitData *commit_data);
static void apply_handle_insert_internal(ApplyExecutionData *edata,
ResultRelInfo *relinfo,
@@ -996,6 +1010,94 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When min_apply_delay parameter is set on the subscriber, we wait long enough
+ * to make sure a transaction is applied at least that interval behind the
+ * publisher.
+ *
+ * While physical replication applies the delay at commit time, this feature
+ * applies the delay before the transaction is started on the subscriber.
+ * This is mainly because keeping a transaction that has performed write
+ * operations open for a long time causes problems such as bloat and
+ * long-held locks.
+ *
+ * The min_apply_delay parameter will take effect only after all tables are in
+ * READY state.
+ */
+static void
+maybe_delay_apply(TimestampTz ts)
+{
+ /* Nothing to do if no delay set */
+ if (MySubscription->minapplydelay <= 0)
+ return;
+
+ /*
+ * The min_apply_delay parameter is ignored until all tablesync workers
+ * have reached READY state. If we allow the delay during the catchup
+ * phase, once we reach the limit of tablesync workers, it will impose a
+ * delay for each subsequent worker. It means it will take a long time to
+ * finish the initial table synchronization.
+ */
+ if (!AllTablesyncsReady())
+ return;
+
+ while (true)
+ {
+ long diffms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /*
+ * Before calculating the time duration, reload the catalog if needed.
+ */
+ if (!in_remote_transaction && !in_streamed_transaction)
+ {
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+ }
+
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(),
+ TimestampTzPlusMilliseconds(ts, MySubscription->minapplydelay));
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "logical replication apply delay: %ld ms", diffms);
+
+ /*
+ * Call send_feedback() periodically so that the publisher's walsender
+ * does not time out during the delay. This is done only when
+ * wal_receiver_status_interval is enabled (greater than zero).
+ */
+ if (wal_receiver_status_interval > 0
+ && diffms > wal_receiver_status_interval * 1000)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ (long) wal_receiver_status_interval * 1000,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(last_received, true, false, true);
+ }
+ else
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ }
+}
+
/*
* Handle BEGIN message.
*/
@@ -1007,6 +1109,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ maybe_delay_apply(begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -1061,6 +1166,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ maybe_delay_apply(begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1308,7 +1416,8 @@ apply_handle_stream_prepare(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset,
- prepare_data.xid, prepare_data.prepare_lsn);
+ prepare_data.xid, prepare_data.prepare_lsn,
+ prepare_data.prepare_time);
/* Mark the transaction as prepared. */
apply_handle_prepare_internal(&prepare_data);
@@ -2001,7 +2110,7 @@ ensure_last_message(FileSet *stream_fileset, TransactionId xid, int fileno,
*/
void
apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn)
+ XLogRecPtr lsn, TimestampTz ts)
{
StringInfoData s2;
int nchanges;
@@ -2012,6 +2121,24 @@ apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
int fileno;
off_t offset;
+ /*
+ * Should we delay the current transaction?
+ *
+ * Unlike the regular (non-streamed) cases, the delay is applied in a
+ * STREAM COMMIT/STREAM PREPARE message for streamed transactions. The
+ * STREAM START message does not contain a commit/prepare time (it will be
+ * available when the in-progress transaction finishes). Hence, it's not
+ * appropriate to apply a delay at that time.
+ *
+ * Time-delayed replication cannot be combined with the parallel apply
+ * feature, so the delay is skipped in parallel apply workers.
+ */
+ if (!am_parallel_apply_worker())
+ {
+ Assert(ts > 0);
+ maybe_delay_apply(ts);
+ }
+
if (!am_parallel_apply_worker())
maybe_start_skipping_changes(lsn);
@@ -2171,7 +2298,7 @@ apply_handle_stream_commit(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset, xid,
- commit_data.commit_lsn);
+ commit_data.commit_lsn, commit_data.committime);
apply_handle_commit_internal(&commit_data);
@@ -3444,7 +3571,7 @@ UpdateWorkerStats(XLogRecPtr last_lsn, TimestampTz send_time, bool reply)
* Apply main loop.
*/
static void
-LogicalRepApplyLoop(XLogRecPtr last_received)
+LogicalRepApplyLoop(void)
{
TimestampTz last_recv_timestamp = GetCurrentTimestamp();
bool ping_sent = false;
@@ -3565,7 +3692,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
if (last_received < end_lsn)
last_received = end_lsn;
- send_feedback(last_received, reply_requested, false);
+ send_feedback(last_received, reply_requested, false, false);
UpdateWorkerStats(last_received, timestamp, true);
}
/* other message types are purposefully ignored */
@@ -3578,7 +3705,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
/* confirm all writes so far */
- send_feedback(last_received, false, false);
+ send_feedback(last_received, false, false, false);
if (!in_remote_transaction && !in_streamed_transaction)
{
@@ -3675,7 +3802,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
}
- send_feedback(last_received, requestReply, requestReply);
+ send_feedback(last_received, requestReply, requestReply, false);
/*
* Force reporting to ensure long idle periods don't lead to
@@ -3705,7 +3832,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
* to send a response to avoid timeouts.
*/
static void
-send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
+send_feedback(XLogRecPtr recvpos, bool force, bool requestReply, bool in_delaying_apply)
{
static StringInfo reply_message = NULL;
static TimestampTz send_time = 0;
@@ -3735,8 +3862,15 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * During the delay of time-delayed replication, do not tell the publisher
+ * that the latest received LSN has already been applied and flushed, since
+ * the transaction has not been applied yet. Doing so would give the
+ * publisher a wrong picture of logical replication progress. Here, we
+ * only send a feedback message to keep the publisher from timing out
+ * during the delay.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && !in_delaying_apply)
flushpos = writepos = recvpos;
if (writepos < last_writepos)
@@ -4362,11 +4496,11 @@ start_table_sync(XLogRecPtr *origin_startpos, char **myslotname)
* of system resource error and are not repeatable.
*/
static void
-start_apply(XLogRecPtr origin_startpos)
+start_apply(void)
{
PG_TRY();
{
- LogicalRepApplyLoop(origin_startpos);
+ LogicalRepApplyLoop();
}
PG_CATCH();
{
@@ -4653,7 +4787,8 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- start_apply(origin_startpos);
+ last_received = origin_startpos;
+ start_apply();
proc_exit(0);
}
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index 928c330897..525e0b5870 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -2431,6 +2431,35 @@ interval_cmp_internal(const Interval *interval1, const Interval *interval2)
return int128_compare(span1, span2);
}
+/*
+ * Returns the number of milliseconds in the specified Interval.
+ */
+int64
+interval2ms(const Interval *interval)
+{
+ int64 days;
+ int64 ms;
+ int64 result;
+
+ days = interval->month * INT64CONST(30);
+ days += interval->day;
+
+ /* Detect whether the value of interval can cause an overflow. */
+ if (pg_mul_s64_overflow(days, MSECS_PER_DAY, &result))
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+
+ /* Add the time portion (microseconds converted to ms) to the result. */
+ ms = interval->time / INT64CONST(1000);
+ if (pg_add_s64_overflow(result, ms, &result))
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
+
+ return result;
+}
+
Datum
interval_eq(PG_FUNCTION_ARGS)
{
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 5e800dc79a..d9b4c8b7f0 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4546,9 +4547,14 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ {
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBufferStr(query, " 0 AS subminapplydelay\n");
+ }
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4582,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4613,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminapplydelay =
+ strtoi64(PQgetvalue(res, i, i_subminapplydelay), NULL, 10);
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4687,6 +4696,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '" INT64_FORMAT " ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e7cbd8d7ed..e2525f70ab 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int64 subminapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..8a27063bed 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min apply delay (ms)"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 23750ea5fb..19d2f90dc0 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3247,7 +3247,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..078495cbf0 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -70,6 +70,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
XLogRecPtr subskiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 subminapplydelay; /* Minimum apply delay, in milliseconds */
+
NameData subname; /* Name of the subscription */
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
@@ -120,6 +122,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 minapplydelay; /* Minimum apply delay, in milliseconds */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
bool enabled; /* Indicates if the subscription is enabled */
diff --git a/src/include/datatype/timestamp.h b/src/include/datatype/timestamp.h
index 21a37e21e9..8b368af299 100644
--- a/src/include/datatype/timestamp.h
+++ b/src/include/datatype/timestamp.h
@@ -127,6 +127,8 @@ struct pg_itm_in
#define SECS_PER_MINUTE 60
#define MINS_PER_HOUR 60
+#define MSECS_PER_DAY INT64CONST(86400000)
+
#define USECS_PER_DAY INT64CONST(86400000000)
#define USECS_PER_HOUR INT64CONST(3600000000)
#define USECS_PER_MINUTE INT64CONST(60000000)
diff --git a/src/include/replication/worker_internal.h b/src/include/replication/worker_internal.h
index db891eea8a..4bdb6b7de3 100644
--- a/src/include/replication/worker_internal.h
+++ b/src/include/replication/worker_internal.h
@@ -255,7 +255,7 @@ extern void stream_stop_internal(TransactionId xid);
/* Common streaming function to apply all the spooled messages */
extern void apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn);
+ XLogRecPtr lsn, TimestampTz ts);
extern void apply_dispatch(StringInfo s);
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index 42f802bb9d..534051fe13 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -102,6 +102,8 @@ extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern int64 interval2ms(const Interval *interval);
+
extern TimestampTz time_t_to_timestamptz(pg_time_t tm);
extern pg_time_t timestamptz_to_time_t(TimestampTz t);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 1ed6f4c39c..36990f584e 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -135,10 +135,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -155,10 +155,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -167,10 +167,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -202,10 +202,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -239,19 +239,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -263,27 +263,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -298,10 +298,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more then once
@@ -316,10 +316,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -355,10 +355,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -367,10 +367,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -380,10 +380,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -396,18 +396,77 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid input syntax for type interval: "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_apply_delay"
+-- fail - utilizing streaming = parallel with time delayed replication is not supported.
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = '1 day');
+ERROR: min_apply_delay must not be set when streaming = parallel
+-- success -- 123 ms
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- fail - alter subscription with streaming = parallel failed when time delayed replication is set.
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: min_apply_delay must not be set when streaming = parallel
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail - alter subscription with min_apply_delay failed when streaming = parallel is set.
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ERROR: min_apply_delay must not be set when streaming = parallel
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- success -- 86400000 ms
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- success -- interval is converted into ms and stored as integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = '4h 27min 35s');
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 16055000 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7991abfe8f..7f6b16e6cf 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -279,6 +279,44 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+-- fail - utilizing streaming = parallel with time delayed replication is not supported.
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = '1 day');
+
+-- success -- 123 ms
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+\dRs+
+
+-- fail - alter subscription with streaming = parallel failed when time delayed replication is set.
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
+-- fail - alter subscription with min_apply_delay failed when streaming = parallel is set.
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
+-- success -- 86400000 ms
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1d');
+
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
+-- success -- interval is converted into ms and stored as integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = '4h 27min 35s');
+
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/meson.build b/src/test/subscription/meson.build
index 3db0fdfd96..a186876eb4 100644
--- a/src/test/subscription/meson.build
+++ b/src/test/subscription/meson.build
@@ -38,6 +38,7 @@ tests += {
't/029_on_error.pl',
't/030_origin.pl',
't/031_column_list.pl',
+ 't/032_apply_delay.pl',
't/100_bugs.pl',
],
},
diff --git a/src/test/subscription/t/032_apply_delay.pl b/src/test/subscription/t/032_apply_delay.pl
new file mode 100644
index 0000000000..8f8ce23f1b
--- /dev/null
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,151 @@
+
+# Copyright (c) 2022, PostgreSQL Global Development Group
+
+# Test replication apply delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf('postgresql.conf',
+ "log_min_messages = debug2");
+$node_subscriber->start;
+
+# Create some preexisting content on publisher.
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b varchar, c timestamptz DEFAULT now())");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (1, 'foo'), (2, 'bar')");
+
+# Setup structure on subscriber.
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b text, c timestamptz DEFAULT now(), d bigint DEFAULT 999)"
+);
+
+# Setup logical replication.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+
+# column c must not be published because we want to compare the time difference.
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE test_tab (a, b)");
+
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = on, min_apply_delay = '3s')"
+);
+
+# Wait for initial table sync to finish.
+$node_subscriber->wait_for_subscription_sync($node_publisher, $appname);
+
+# Check log starting now for logical replication apply delay.
+my $log_location = -s $node_subscriber->logfile;
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(2|1|2), 'check initial data was copied to subscriber');
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (3, 'baz')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (4, 'abc')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (5, 'def')");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(5|1|5), 'check if the new rows were applied to subscriber');
+
+check_apply_delay_log("logical replication apply delay");
+
+check_apply_delay_time('5', '3');
+
+# Test streamed transaction.
+# Insert, update and delete enough rows to exceed 64kB limit.
+$node_publisher->safe_psql(
+ 'postgres', q{
+BEGIN;
+INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(6, 5000) s(i);
+UPDATE test_tab SET b = md5(b) WHERE mod(a, 2) = 0;
+DELETE FROM test_tab WHERE mod(a, 3) = 0;
+COMMIT;
+});
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(3334|1|5000), 'check if the new rows were applied to subscriber');
+
+check_apply_delay_log("logical replication apply delay");
+
+check_apply_delay_time('5000', '3');
+
+# Test ALTER SUBSCRIPTION. Delay 86460 seconds (1 day 1 minute).
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86460000)"
+);
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (0, 'foobar')");
+
+# Disable subscription. The worker should die immediately.
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub DISABLE;"
+);
+
+# Wait until worker dies.
+my $sub_query =
+ "SELECT count(1) = 0 FROM pg_stat_subscription WHERE subname = 'tap_sub' AND pid IS NOT NULL;";
+$node_subscriber->poll_query_until('postgres', $sub_query)
+ or die "Timed out while waiting for subscriber to die";
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
+
+sub check_apply_delay_log
+{
+ my $message = shift;
+ my $old_log_location = $log_location;
+
+ $log_location = $node_subscriber->wait_for_log(qr/$message/, $log_location);
+
+ cmp_ok($log_location, '>', $old_log_location,
+ "logfile contains triggered logical replication apply delay"
+ );
+}
+
+sub check_apply_delay_time
+{
+ my ($primary_key, $expected_diffs) = @_;
+
+ my $inserted_time_on_pub = $node_publisher->safe_psql('postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ my $inserted_time_on_sub = $node_subscriber->safe_psql('postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ cmp_ok($inserted_time_on_sub - $inserted_time_on_pub, '>', $expected_diffs,
+ "The tuple on the subscriber was modified later than the publisher");
+}
--
2.30.0
On Tuesday, January 3, 2023 8:22 PM shveta malik <shveta.malik@gmail.com> wrote:
Please find a few minor comments.
Thanks for your review!
1. + diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(),
+ TimestampTzPlusMilliseconds(ts, MySubscription->minapplydelay));
On Unix, the above code looks unaligned (copied from Unix).
2. Same with:
+ interval = DatumGetIntervalP(DirectFunctionCall3(interval_in,
+ CStringGetDatum(val),
+ ObjectIdGetDatum(InvalidOid),
+ Int32GetDatum(-1)));
Perhaps due to tabs?
The indentation in those patches looks OK. I checked it
with pgindent and the less command described in [1]. So, I didn't change those.
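For readers following the quoted diffms line in item 1, here is a minimal sketch of the wait-time arithmetic it expresses: the transaction becomes applicable at its commit timestamp plus min_apply_delay, and the worker waits for whatever part of that has not yet elapsed. This is not the patch's code. The helper name and the use of plain millisecond counters are assumptions made for illustration, and the clamp to zero reflects my understanding that a target time already in the past yields no wait.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch. All arguments are plain
 * millisecond counters on the same clock; the caller is assumed to pass
 * values small enough that the addition cannot overflow.
 */
static int64_t
sketch_remaining_delay_ms(int64_t now_ms, int64_t commit_ms,
                          int64_t min_apply_delay_ms)
{
    int64_t apply_at_ms;
    int64_t diff_ms;

    assert(min_apply_delay_ms >= 0);

    /* Earliest moment at which the transaction may be applied. */
    apply_at_ms = commit_ms + min_apply_delay_ms;

    /* Time still to wait; never negative. */
    diff_ms = apply_at_ms - now_ms;
    return diff_ms > 0 ? diff_ms : 0;
}
```

With a 3 s delay, a transaction committed at t = 500 ms and seen at t = 1000 ms would wait another 2500 ms, while one seen at t = 5000 ms would be applied right away.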
2. comment not clear:
+ * During the time delayed replication, avoid reporting the suspended
+ * latest LSN are already flushed and written, to the publisher.
You are right. I fixed this part to make it clearer.
Could you please check?
3. + * Call send_feedback() to prevent the publisher from exiting by
+ * timeout during the delay, when wal_receiver_status_interval is
+ * available. The WALs for this delayed transaction is neither
+ * written nor flushed yet, Thus, we don't make the latest LSN
+ * overwrite those positions of the update message for this delay.
yet, Thus, we --> yet, thus, we / yet. Thus, we
Yeah, you are right. However, I have removed the last sentence, because it
explains some internals of send_feedback(). I judged that it would be awkward
to describe that in maybe_delay_apply(). This part is now more concise.
4. + /* Adds portion time (in ms) to the previous result. */
+ ms = interval->time / INT64CONST(1000);
Is interval->time always in micro-seconds here?
Yeah, it seems so. Some of the internal code indicates it. Kindly have a look at functions
such as make_interval() and interval2itm().
Please have a look at the latest patch v12 in [2].
[1]: https://www.postgresql.org/docs/current/source-format.html
[2]: /messages/by-id/TYCPR01MB837340F78F4A16F542589195EDFF9@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
On Tuesday, January 10, 2023 11:28 AM I wrote:
On Tuesday, December 27, 2022 6:29 PM wrote:
Thanks for reviewing our patch! PSA new version patch set.
Now, the patches fail to apply to HEAD because of recent commits
(c6e1f62e2c and 216a784829c), as reported in [1].
I'll rebase the patch with other changes when I post a new version.
This is done in the patch in [1].
Please note that because of commit c6e1f62e2c,
we no longer need the 1st patch we borrowed from another thread in [2].
[1]: /messages/by-id/TYCPR01MB837340F78F4A16F542589195EDFF9@TYCPR01MB8373.jpnprd01.prod.outlook.com
[2]: /messages/by-id/20221122004119.GA132961@nathanxps13
Best Regards,
Takamichi Osumi
On Tue, Jan 10, 2023 at 7:42 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
On Tuesday, January 3, 2023 4:01 PM vignesh C <vignesh21@gmail.com> wrote:
Hi, thanks for your review!
1) This global variable can be removed as it is used only in send_feedback which is called from maybe_delay_apply so we could pass it as a function argument:
+ * delay, avoid having positions of the flushed and apply LSN overwritten by
+ * the latest LSN.
+ */
+static bool in_delaying_apply = false;
+static XLogRecPtr last_received = InvalidXLogRecPtr;
+
I have removed the first variable and made it one of the arguments of send_feedback().
2) -1 gets converted to -1000
+int64
+interval2ms(const Interval *interval)
+{
+    int64 days;
+    int64 ms;
+    int64 result;
+
+    days = interval->month * INT64CONST(30);
+    days += interval->day;
+
+    /* Detect whether the value of interval can cause an overflow. */
+    if (pg_mul_s64_overflow(days, MSECS_PER_DAY, &result))
+        ereport(ERROR,
+                (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+                 errmsg("bigint out of range")));
+
+    /* Adds portion time (in ms) to the previous result. */
+    ms = interval->time / INT64CONST(1000);
+    if (pg_add_s64_overflow(result, ms, &result))
+        ereport(ERROR,
+                (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+                 errmsg("bigint out of range")));
create subscription sub7 connection 'dbname=regression host=localhost
port=5432' publication pub1 with (min_apply_delay = '-1');
ERROR: -1000 ms is outside the valid range for parameter "min_apply_delay"
Good catch! Fixed so that the input '-1' is interpreted as -1 ms.
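To make the '-1' case in item 2 concrete, here is one possible way to tell a bare integer (taken as milliseconds) apart from interval text such as '1d'. This is only a sketch: the helper name is hypothetical, and this message does not show how the patch itself implements the fix. The fallback to interval parsing and the later range check are left to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of the patch. Returns true and stores the
 * value when "val" consists entirely of an optionally signed decimal
 * integer, which the caller would then treat as milliseconds. Returns
 * false otherwise, so the caller can fall back to interval parsing.
 */
static bool
sketch_parse_bare_ms(const char *val, int64_t *ms)
{
    char       *end;
    long long   parsed;

    assert(val != NULL && ms != NULL);

    errno = 0;
    parsed = strtoll(val, &end, 10);

    /* Reject empty input, trailing characters, and out-of-range values. */
    if (end == val || *end != '\0' || errno == ERANGE)
        return false;

    *ms = (int64_t) parsed;
    return true;
}
```

With such a check in front of the interval parser, '-1' would reach the range check as -1 ms instead of -1000 ms, while '1d' and '4h 27min 35s' would still go through interval parsing.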
3) This can be slightly reworded:
+ <para>
+ The length of time (ms) to delay the application of changes.
+ </para></entry>
to:
Delay applying the changes by a specified amount of time(ms).
This has been suggested in [1] by Peter Smith. So, I'd like to keep the current patch's description.
Then, I didn't change this.
4) maybe_delay_apply can be moved from apply_handle_stream_prepare to
apply_spooled_messages so that it is consistent with
maybe_start_skipping_changes:
@@ -1120,6 +1240,19 @@ apply_handle_stream_prepare(StringInfo s)
elog(DEBUG1, "received prepare for streamed transaction %u",
prepare_data.xid);
+ /*
+ * Should we delay the current prepared transaction?
+ *
+ * Although the delay is applied in BEGIN PREPARE messages, streamed
+ * prepared transactions apply the delay in a STREAM PREPARE message.
+ * That's ok because no changes have been applied yet
+ * (apply_spooled_messages() will do it). The STREAM START message does
+ * not contain a prepare time (it will be available when the in-progress
+ * prepared transaction finishes), hence, it was not possible to apply a
+ * delay at that time.
+ */
+ maybe_delay_apply(prepare_data.prepare_time);
That way the call from apply_handle_stream_commit can also be removed.
Sounds good. I moved the call of maybe_delay_apply() to apply_spooled_messages().
Now it's aligned with maybe_start_skipping_changes().
5) typo transfering should be transferring
+ publisher and the current time on the subscriber. Time spent in logical
+ decoding and in transfering the transaction may reduce the actual wait
+ time. If the system clocks on publisher and subscriber are
+ not
Fixed.
6) feedbacks can be changed to feedback messages
+ * it's necessary to keep sending feedbacks during the delay from the worker
+ * process. Meanwhile, the feature delays the apply before starting the
Fixed.
7)
+ /*
+ * Suppress overwrites of flushed and writtten positions by the lastest
+ * LSN in send_feedback().
+ */
7a) typo writtten should be written
7b) lastest should be latest
I have removed this sentence. So, those typos are removed.
Please have a look at the updated patch.
[1] - /messages/by-id/CAHut+PttQdFMNM2c6WqKt2c9G6r3ZKYRGHm04RR-4p4fyA4WRg@mail.gmail.com
Hi,
1.
+ errmsg("min_apply_delay must not be set when streaming = parallel")));
we give the same error msg for both the cases:
a. when subscription is created with streaming=parallel but we are
trying to alter subscription to set min_apply_delay >0
b. when subscription is created with some min_apply_delay and we are
trying to alter subscription to make it streaming=parallel.
For case a, error msg looks fine but for case b, I think error msg
should be changed slightly.
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
ERROR: min_apply_delay must not be set when streaming = parallel
This gives the feeling that we are trying to modify min_apply_delay
but we are not. Maybe we can change it to:
"subscription with min_apply_delay must not be allowed to stream
parallel" (or something better)
thanks
Shveta
On Wed, Jan 11, 2023 at 3:27 PM shveta malik <shveta.malik@gmail.com> wrote:
Sorry for multiple emails. One suggestion:
2.
I think users can set 'wal_receiver_status_interval' to 0 or to more
than 'wal_sender_timeout'. But is this a frequent use case, or
do we see DBAs setting these in such a way by mistake? If so, then I
think it is better to give a warning message when a user
tries to create or alter a subscription with a large 'min_apply_delay'
(>= 'wal_sender_timeout'), rather than leaving it to the user's
understanding that the walsender may repeatedly time out in such a case.
parse_subscription_options and AlterSubscription can be modified to
log a warning. Any thoughts?
thanks
Shveta
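To make the suggested warning concrete, here is a rough standalone sketch of the condition it would test. Everything in it is hypothetical (DemoDelayRisk, demo_check_delay_settings); in particular it assumes the caller already knows the publisher's wal_sender_timeout, which the subscriber does not have locally, and it treats a timeout of 0 as "timeout disabled".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical classification of a min_apply_delay configuration. */
typedef enum DemoDelayRisk
{
	DEMO_DELAY_OK,					/* no repeated walsender timeout expected */
	DEMO_DELAY_NO_FEEDBACK,			/* status interval is 0: no feedback sent */
	DEMO_DELAY_FEEDBACK_TOO_SLOW	/* feedback interval >= sender timeout */
} DemoDelayRisk;

/*
 * Decide whether a warning would be appropriate.
 *
 * min_apply_delay_ms: requested apply delay in milliseconds
 * status_interval_s:  wal_receiver_status_interval in seconds (0 = disabled)
 * sender_timeout_ms:  publisher's wal_sender_timeout in milliseconds
 *                     (0 = disabled); assumed to be known to the caller
 */
static DemoDelayRisk
demo_check_delay_settings(int64_t min_apply_delay_ms,
						  int status_interval_s,
						  int64_t sender_timeout_ms)
{
	assert(min_apply_delay_ms >= 0);
	assert(status_interval_s >= 0);
	assert(sender_timeout_ms >= 0);

	/* A disabled timeout, or a delay shorter than it, is not a concern. */
	if (sender_timeout_ms == 0 || min_apply_delay_ms < sender_timeout_ms)
		return DEMO_DELAY_OK;

	/* No feedback at all is sent while the worker waits. */
	if (status_interval_s == 0)
		return DEMO_DELAY_NO_FEEDBACK;

	/* Feedback is sent, but not often enough to beat the timeout. */
	if ((int64_t) status_interval_s * 1000 >= sender_timeout_ms)
		return DEMO_DELAY_FEEDBACK_TOO_SLOW;

	return DEMO_DELAY_OK;
}
```

A caller could emit a WARNING for either non-OK result; whether that check belongs in CREATE/ALTER SUBSCRIPTION at all is exactly the open question above.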
On Tue, 10 Jan 2023 at 19:41, Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
On Tuesday, January 3, 2023 4:01 PM vignesh C <vignesh21@gmail.com> wrote:
Hi, thanks for your review!
Please have a look at the updated patch.
Thanks for the updated patch, few comments:
1) Comment inconsistency across create and alter subscription, better
to keep it same:
+ /*
+ * Do additional checking for disallowed combination when min_apply_delay
+ * was not zero.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0)
+ {
+ if (opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR)),
+ errmsg("min_apply_delay must not be set when streaming = parallel"));
+ }
+ /*
+ * Test the combination of streaming mode and
+ * min_apply_delay
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
+ sub->minapplydelay > 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("min_apply_delay must not be set when streaming = parallel")));
2) ereport inconsistency: parentheses around errcode are present in some
places and absent in others; it is better to keep it consistent by
removing them:
2.a)
+ if (opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR)),
+ errmsg("min_apply_delay must not be set when streaming = parallel"));
2.b)
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
+ sub->minapplydelay > 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("min_apply_delay must not be set when streaming = parallel")));
2.c)
+ if (opts.min_apply_delay > 0 &&
+ sub->stream == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("min_apply_delay must not be set when streaming = parallel")));
2.d)
+ if (pg_mul_s64_overflow(days, MSECS_PER_DAY, &result))
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
2.e)
+ if (pg_add_s64_overflow(result, ms, &result))
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range")));
3) this include is not required, I could compile without it
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -48,6 +48,7 @@
#include "utils/memutils.h"
#include "utils/pg_lsn.h"
#include "utils/syscache.h"
+#include "utils/timestamp.h"
4)
4.a)
Should this be changed:
/* Adds portion time (in ms) to the previous result. */
to
/* Adds portion time (in ms) to the previous result */
4.b)
Should this be changed:
/* Detect whether the value of interval can cause an overflow. */
to
/* Detect whether the value of interval can cause an overflow */
5) Can this "ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1d')"
be combined with the "-- success -- 123 ms" test? That way a few
statements could be removed.
+-- success -- 86400000 ms
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1d');
+
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
6) Can we do the interval testing with ALTER SUBSCRIPTION, combined
with the "-- success -- 123 ms" test? That way a few statements
could be removed.
+-- success -- interval is converted into ms and stored as integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = '4h 27min 35s');
+
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
Regards,
Vignesh
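The regression cases in items 5 and 6 depend on a unitless value such as 123 being read as milliseconds while '1d' and '4h 27min 35s' keep their own units. The attached patch does this by appending "ms" when the string contains only digits, spaces and a minus sign. A minimal standalone sketch of that check follows; demo_normalize_delay is a hypothetical helper, and it writes into a caller-supplied buffer rather than using psprintf().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy a min_apply_delay string into "out", appending "ms" when the input
 * carries no unit (only digits, spaces and '-').
 * Returns 0 on success, -1 if "out" is too small.
 */
static int
demo_normalize_delay(const char *input, char *out, size_t outlen)
{
	int			n;

	assert(input != NULL && out != NULL);

	if (strspn(input, "-0123456789 ") == strlen(input))
		n = snprintf(out, outlen, "%sms", input);	/* no unit given */
	else
		n = snprintf(out, outlen, "%s", input);		/* unit already present */

	return (n >= 0 && (size_t) n < outlen) ? 0 : -1;
}
```

The normalized string would then be handed to an interval parser; that step is left out here.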
Dear Shveta,
Thanks for reviewing! PSA new version.
1.
+ errmsg("min_apply_delay must not be set when streaming = parallel")));
we give the same error msg for both the cases:
a. when subscription is created with streaming=parallel but we are
trying to alter subscription to set min_apply_delay >0
b. when subscription is created with some min_apply_delay and we are
trying to alter subscription to make it streaming=parallel.
For case a, error msg looks fine but for case b, I think error msg
should be changed slightly.
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
ERROR: min_apply_delay must not be set when streaming = parallel
This gives the feeling that we are trying to modify min_apply_delay
but we are not. Maybe we can change it to:
"subscription with min_apply_delay must not be allowed to stream
parallel" (or something better)
Your point that the error messages are strange is right. While checking
other messages, I found that they follow very similar styles. Therefore I
reworded the ERROR messages in AlterSubscription() and
parse_subscription_options() to follow them. Which version is better?
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
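For reviewers who want the waiting rule of the attached patch in isolation: the worker computes how long the transaction still has to wait, and while that is longer than wal_receiver_status_interval it sleeps one interval at a time so that feedback can be sent in between. A minimal sketch of that arithmetic, with hypothetical function names and plain millisecond timestamps instead of TimestampTz:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Remaining delay in milliseconds for a transaction committed on the
 * publisher at commit_ts_ms, seen on the subscriber at now_ms.
 * A result <= 0 means the transaction can be applied right away.
 * Assumes the inputs are small enough that the sum cannot overflow.
 */
static int64_t
demo_remaining_delay_ms(int64_t commit_ts_ms, int64_t now_ms,
						int64_t min_apply_delay_ms)
{
	return (commit_ts_ms + min_apply_delay_ms) - now_ms;
}

/*
 * Length of the next sleep. If status updates are enabled and the remaining
 * delay is longer than one status interval, sleep only one interval so the
 * caller can send feedback afterwards; otherwise sleep the rest in one go.
 */
static int64_t
demo_next_sleep_ms(int64_t remaining_ms, int status_interval_s)
{
	int64_t		interval_ms;

	assert(status_interval_s >= 0);
	interval_ms = (int64_t) status_interval_s * 1000;

	if (remaining_ms <= 0)
		return 0;
	if (status_interval_s > 0 && remaining_ms > interval_ms)
		return interval_ms;
	return remaining_ms;
}
```

Because the delay is measured from the publisher's timestamp, time already spent in decoding and transfer shortens the wait, and a subscriber that is already further behind than min_apply_delay does not wait at all.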
Attachments:
v14-0001-Time-delayed-logical-replication-subscriber.patch (application/octet-stream)
From 11087d85a7a86c5066b99163f4b595575e3708f0 Mon Sep 17 00:00:00 2001
From: Takamichi Osumi <osumi.takamichi@fujitsu.com>
Date: Tue, 10 Jan 2023 13:35:45 +0000
Subject: [PATCH v14] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction commit for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs only on WAL records for transaction begins. The main
reason is to avoid keeping a transaction open for a long time. Regular
and prepared transactions are covered. Streamed transactions are also
covered.
Author: Euler Taveira
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/config.sgml | 12 ++
doc/src/sgml/logical-replication.sgml | 7 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 59 +++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 87 +++++++-
.../replication/logical/applyparallelworker.c | 3 +-
src/backend/replication/logical/worker.c | 161 +++++++++++++--
src/backend/utils/adt/timestamp.c | 29 +++
src/bin/pg_dump/pg_dump.c | 16 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/datatype/timestamp.h | 2 +
src/include/replication/worker_internal.h | 2 +-
src/include/utils/timestamp.h | 2 +
src/test/regress/expected/subscription.out | 185 +++++++++++-------
src/test/regress/sql/subscription.sql | 25 +++
src/test/subscription/meson.build | 1 +
src/test/subscription/t/032_apply_delay.pl | 151 ++++++++++++++
23 files changed, 679 insertions(+), 102 deletions(-)
create mode 100644 src/test/subscription/t/032_apply_delay.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..bf3c05241c 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7873,6 +7873,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int8</type>
+ </para>
+ <para>
+ The length of time (ms) to delay the application of changes.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 77574e2d4e..223c07b06f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4753,6 +4753,18 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
the <filename>postgresql.conf</filename> file or on the server
command line.
</para>
+ <para>
+ For time delayed logical replication, the apply worker sends a Standby
+ Status Update message to the corresponding publisher per the indicated
+ time of this parameter. Therefore, if this parameter is longer than
+ <literal>wal_sender_timeout</literal> on the publisher, then the
+ walsender doesn't get any update message during the delay and repeatedly
+ terminates due to the timeout errors. Hence, make sure this parameter is
+ shorter than the <literal>wal_sender_timeout</literal> of the publisher.
+ If this parameter is set to zero with time delayed replication, the
+ apply worker doesn't send any feedback messages during the
+ <literal>min_apply_delay</literal>.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 54f48be87f..6407804547 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,13 @@
target table.
</para>
+ <para>
+ The subscriber replication can be instructed to lag behind the publisher
+ side changes by specifying the <literal>min_apply_delay</literal>
+ subscription parameter. See <xref linkend="sql-createsubscription"/> for
+ details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 1e8d72062b..d63aff1b90 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index eba72c6af6..d95ff8944e 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,47 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. As
+ with the physical replication feature
+ (<xref linkend="guc-recovery-min-apply-delay"/>), it can be useful to
+ have a time-delayed logical replica. This parameter lets the user
+ delay the application of changes by a specified amount of time. If this
+ value is specified without units, it is taken as milliseconds. The
+ default is zero (no delay).
+ </para>
+ <para>
+ The delay occurs only on WAL records for transaction begins and after
+ the initial table synchronization. It is possible that the
+ replication delay between publisher and subscriber exceeds the value
+ of this parameter, in which case no delay is added. Note that the
+ delay is calculated between the WAL time stamp as written on
+ publisher and the current time on the subscriber. Time spent in logical
+ decoding and in transferring the transaction may reduce the actual wait
+ time. If the system clocks on publisher and subscriber are not
+ synchronized, this may lead to applying changes earlier than expected,
+ but this is not a major issue because this parameter is typically much
+ larger than the time deviations between servers. Note that if this
+ parameter is set to a long delay, the replication will stop if the
+ replication slot falls behind the current LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication can mean there is a much longer time between making
+ a change on the publisher, and that change being committed on the subscriber.
+ This can have a big impact on synchronous replication.
+ See <xref linkend="guc-synchronous-commit"/>.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -413,6 +453,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ Setting streaming to <literal>parallel</literal> mode and <literal>min_apply_delay</literal>
+ at the same time is not supported.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
@@ -472,6 +517,18 @@ CREATE SUBSCRIPTION mysub
PUBLICATION insert_only
WITH (enabled = false);
</programlisting></para>
+
+ <para>
+ Create a subscription to a remote server that replicates tables in
+ the <literal>mypub</literal> publication and starts replicating immediately
+ on commit. Pre-existing data is not copied. The application of changes is
+ delayed by 4 hours.
+<programlisting>
+CREATE SUBSCRIPTION mysub
+ CONNECTION 'host=192.168.1.50 port=5432 user=foo dbname=foodb'
+ PUBLICATION mypub
+ WITH (copy_data = false, min_apply_delay = '4h');
+</programlisting></para>
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..c767cc1c3a 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -64,6 +64,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->oid = subid;
sub->dbid = subform->subdbid;
sub->skiplsn = subform->subskiplsn;
+ sub->minapplydelay = subform->subminapplydelay;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
sub->enabled = subform->subenabled;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 447c9b970f..4004fcd0c4 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1299,9 +1299,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index baff00dd74..d55864a2ce 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int64 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,43 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ char *val,
+ *tmp;
+ Interval *interval;
+ int64 ms;
+
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ tmp = defGetString(defel);
+
+ /*
+ * If no unit was specified, then explicitly add 'ms' otherwise
+ * the interval_in function would assume 'seconds'.
+ */
+ if (strspn(tmp, "-0123456789 ") == strlen(tmp))
+ val = psprintf("%sms", tmp);
+ else
+ val = tmp;
+
+ interval = DatumGetIntervalP(DirectFunctionCall3(interval_in,
+ CStringGetDatum(val),
+ ObjectIdGetDatum(InvalidOid),
+ Int32GetDatum(-1)));
+
+ ms = interval2ms(interval);
+ if (ms < 0 || ms > INT_MAX)
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg(INT64_FORMAT " ms is outside the valid range for parameter \"%s\"",
+ ms, "min_apply_delay"));
+
+ opts->min_apply_delay = ms;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +445,17 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /* Test the combination of streaming mode and min_apply_delay */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0)
+ {
+ if (opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
+ }
}
/*
@@ -560,7 +612,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -625,6 +678,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int64GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1054,7 +1108,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1152,17 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * Test the combination of streaming mode and
+ * min_apply_delay
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
+ sub->minapplydelay > 0)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set %s for subscription with %s",
+ "streaming = parallel", "min_apply_delay > 0"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1176,24 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * Test the combination of streaming mode and
+ * min_apply_delay
+ */
+ if (opts.min_apply_delay > 0 &&
+ sub->stream == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set %s for subscription with %s",
+ "min_apply_delay > 0", "streaming = parallel"));
+
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int64GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
diff --git a/src/backend/replication/logical/applyparallelworker.c b/src/backend/replication/logical/applyparallelworker.c
index 2e5914d5d9..72e6f5ce84 100644
--- a/src/backend/replication/logical/applyparallelworker.c
+++ b/src/backend/replication/logical/applyparallelworker.c
@@ -704,7 +704,8 @@ pa_process_spooled_messages_if_required(void)
{
apply_spooled_messages(&MyParallelShared->fileset,
MyParallelShared->xid,
- InvalidXLogRecPtr);
+ InvalidXLogRecPtr,
+ 0);
pa_set_fileset_state(MyParallelShared, FS_EMPTY);
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 79cda39445..5d11cc5662 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -316,6 +316,17 @@ static List *on_commit_wakeup_workers_subids = NIL;
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
+/*
+ * In order to avoid walsender's timeout during time delayed replication,
+ * it's necessary to keep sending feedback messages during the delay from the
+ * worker process. Meanwhile, the feature delays the apply before starting the
+ * transaction and thus we don't write WALs for the suspended changes during
+ * the wait. Hence, in the case the worker process sends a feedback during the
+ * delay, avoid having positions of the flushed and apply LSN overwritten by
+ * the latest LSN.
+ */
+static XLogRecPtr last_received = InvalidXLogRecPtr;
+
/* fields valid only when processing streamed transaction */
static bool in_streamed_transaction = false;
@@ -386,10 +397,13 @@ static void stream_write_change(char action, StringInfo s);
static void stream_open_and_write_change(TransactionId xid, char action, StringInfo s);
static void stream_close_file(void);
-static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply);
+static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply,
+ bool in_delaying_apply);
static void DisableSubscriptionAndExit(void);
+static void maybe_delay_apply(TimestampTz ts);
+
static void apply_handle_commit_internal(LogicalRepCommitData *commit_data);
static void apply_handle_insert_internal(ApplyExecutionData *edata,
ResultRelInfo *relinfo,
@@ -996,6 +1010,94 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When min_apply_delay parameter is set on the subscriber, we wait long enough
+ * to make sure a transaction is applied at least that interval behind the
+ * publisher.
+ *
+ * While the physical replication applies the delay at commit time, this
+ * feature applies the delay for the next transaction but before starting the
+ * transaction. This is mainly because keeping a transaction that conducted
+ * write operations open for a long time results in some issues such as bloat
+ * and locks.
+ *
+ * The min_apply_delay parameter will take effect only after all tables are in
+ * READY state.
+ */
+static void
+maybe_delay_apply(TimestampTz ts)
+{
+ /* Nothing to do if no delay set */
+ if (MySubscription->minapplydelay <= 0)
+ return;
+
+ /*
+ * The min_apply_delay parameter is ignored until all tablesync workers
+ * have reached READY state. If we allow the delay during the catchup
+ * phase, once we reach the limit of tablesync workers, it will impose a
+ * delay for each subsequent worker. It means it will take a long time to
+ * finish the initial table synchronization.
+ */
+ if (!AllTablesyncsReady())
+ return;
+
+ while (true)
+ {
+ long diffms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /*
+ * Before calculating the time duration, reload the catalog if needed.
+ */
+ if (!in_remote_transaction && !in_streamed_transaction)
+ {
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+ }
+
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(),
+ TimestampTzPlusMilliseconds(ts, MySubscription->minapplydelay));
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "logical replication apply delay: %ld ms", diffms);
+
+ /*
+ * Call send_feedback() periodically so that the publisher does not exit
+ * due to a timeout during the delay. This is only possible when
+ * wal_receiver_status_interval is greater than zero.
+ */
+ if (wal_receiver_status_interval > 0
+ && diffms > wal_receiver_status_interval * 1000)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ (long) wal_receiver_status_interval * 1000,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(last_received, true, false, true);
+ }
+ else
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ }
+}
+
/*
* Handle BEGIN message.
*/
@@ -1007,6 +1109,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ maybe_delay_apply(begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -1061,6 +1166,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ maybe_delay_apply(begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1308,7 +1416,8 @@ apply_handle_stream_prepare(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset,
- prepare_data.xid, prepare_data.prepare_lsn);
+ prepare_data.xid, prepare_data.prepare_lsn,
+ prepare_data.prepare_time);
/* Mark the transaction as prepared. */
apply_handle_prepare_internal(&prepare_data);
@@ -2001,7 +2110,7 @@ ensure_last_message(FileSet *stream_fileset, TransactionId xid, int fileno,
*/
void
apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn)
+ XLogRecPtr lsn, TimestampTz ts)
{
StringInfoData s2;
int nchanges;
@@ -2012,6 +2121,24 @@ apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
int fileno;
off_t offset;
+ /*
+ * Should we delay the current transaction?
+ *
+ * Unlike the regular (non-streamed) cases, the delay is applied in a
+ * STREAM COMMIT/STREAM PREPARE message for streamed transactions. The
+ * STREAM START message does not contain a commit/prepare time (it will be
+ * available when the in-progress transaction finishes). Hence, it's not
+ * appropriate to apply a delay at that time.
+ *
+ * Time-delayed replication cannot be used together with the parallel apply
+ * feature.
+ */
+ if (!am_parallel_apply_worker())
+ {
+ Assert(ts > 0);
+ maybe_delay_apply(ts);
+ }
+
if (!am_parallel_apply_worker())
maybe_start_skipping_changes(lsn);
@@ -2171,7 +2298,7 @@ apply_handle_stream_commit(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset, xid,
- commit_data.commit_lsn);
+ commit_data.commit_lsn, commit_data.committime);
apply_handle_commit_internal(&commit_data);
@@ -3444,7 +3571,7 @@ UpdateWorkerStats(XLogRecPtr last_lsn, TimestampTz send_time, bool reply)
* Apply main loop.
*/
static void
-LogicalRepApplyLoop(XLogRecPtr last_received)
+LogicalRepApplyLoop(void)
{
TimestampTz last_recv_timestamp = GetCurrentTimestamp();
bool ping_sent = false;
@@ -3565,7 +3692,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
if (last_received < end_lsn)
last_received = end_lsn;
- send_feedback(last_received, reply_requested, false);
+ send_feedback(last_received, reply_requested, false, false);
UpdateWorkerStats(last_received, timestamp, true);
}
/* other message types are purposefully ignored */
@@ -3578,7 +3705,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
/* confirm all writes so far */
- send_feedback(last_received, false, false);
+ send_feedback(last_received, false, false, false);
if (!in_remote_transaction && !in_streamed_transaction)
{
@@ -3675,7 +3802,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
}
- send_feedback(last_received, requestReply, requestReply);
+ send_feedback(last_received, requestReply, requestReply, false);
/*
* Force reporting to ensure long idle periods don't lead to
@@ -3705,7 +3832,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
* to send a response to avoid timeouts.
*/
static void
-send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
+send_feedback(XLogRecPtr recvpos, bool force, bool requestReply, bool in_delaying_apply)
{
static StringInfo reply_message = NULL;
static TimestampTz send_time = 0;
@@ -3735,8 +3862,15 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * While delaying the apply for time-delayed replication, do not tell the
+ * publisher that the latest received LSN is already applied and flushed,
+ * since the transaction has not been applied yet. Doing so would give the
+ * publisher a wrong picture of logical replication progress. Here, we
+ * just send a feedback message to avoid the publisher's timeout during
+ * the delay.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && !in_delaying_apply)
flushpos = writepos = recvpos;
if (writepos < last_writepos)
@@ -4362,11 +4496,11 @@ start_table_sync(XLogRecPtr *origin_startpos, char **myslotname)
* of system resource error and are not repeatable.
*/
static void
-start_apply(XLogRecPtr origin_startpos)
+start_apply(void)
{
PG_TRY();
{
- LogicalRepApplyLoop(origin_startpos);
+ LogicalRepApplyLoop();
}
PG_CATCH();
{
@@ -4653,7 +4787,8 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- start_apply(origin_startpos);
+ last_received = origin_startpos;
+ start_apply();
proc_exit(0);
}
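The wait arithmetic in maybe_delay_apply() can be sketched outside the server as plain integer math. This is only an illustration under stated assumptions, not the patch's code: remaining_delay_ms() and next_wait_ms() are hypothetical helper names, timestamps are plain int64 milliseconds instead of TimestampTz, and no latch, interrupt, or feedback calls are made.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: milliseconds still to wait before a transaction
 * committed at commit_ms may be applied. Returns 0 once the configured
 * delay has already elapsed.
 */
int64_t
remaining_delay_ms(int64_t now_ms, int64_t commit_ms, int64_t min_apply_delay_ms)
{
    int64_t diff;

    assert(min_apply_delay_ms >= 0);   /* a negative delay makes no sense */

    diff = (commit_ms + min_apply_delay_ms) - now_ms;
    return diff > 0 ? diff : 0;
}

/*
 * Hypothetical helper: length of one wait. When a status interval (in
 * seconds) is set and the remaining delay is longer than it, wait only
 * for that interval so that feedback can be sent in between; otherwise
 * wait for the whole remaining delay.
 */
int64_t
next_wait_ms(int64_t diffms, int status_interval_s)
{
    int64_t interval_ms = (int64_t) status_interval_s * 1000;

    if (status_interval_s > 0 && diffms > interval_ms)
        return interval_ms;
    return diffms;
}
```

With a remaining delay of 25000 ms and a 10 second status interval, next_wait_ms() yields a 10000 ms wait, after which the loop would send feedback and recompute the remaining delay.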
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index 928c330897..422e6ad0fa 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -2431,6 +2431,35 @@ interval_cmp_internal(const Interval *interval1, const Interval *interval2)
return int128_compare(span1, span2);
}
+/*
+ * Returns the number of milliseconds in the specified Interval.
+ */
+int64
+interval2ms(const Interval *interval)
+{
+ int64 days;
+ int64 ms;
+ int64 result;
+
+ days = interval->month * INT64CONST(30);
+ days += interval->day;
+
+ /* Detect whether the value of interval can cause an overflow */
+ if (pg_mul_s64_overflow(days, MSECS_PER_DAY, &result))
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range"));
+
+ /* Add the time portion (in ms) to the previous result */
+ ms = interval->time / INT64CONST(1000);
+ if (pg_add_s64_overflow(result, ms, &result))
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range"));
+
+ return result;
+}
+
Datum
interval_eq(PG_FUNCTION_ARGS)
{
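The conversion done by interval2ms() can be tried in isolation with a small stand-alone sketch. Everything here is an assumption made for illustration: SketchInterval is a simplified stand-in for the server's Interval struct, sketch_interval2ms() is a hypothetical name, overflow is reported through a boolean return instead of ereport(), and the GCC/Clang overflow builtins take the place of pg_mul_s64_overflow() and pg_add_s64_overflow().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MSECS_PER_DAY INT64_C(86400000)

/* Simplified stand-in for the server's Interval struct (an assumption). */
typedef struct SketchInterval
{
    int64_t time_usec;  /* time of day portion, in microseconds */
    int32_t day;        /* days */
    int32_t month;      /* months, counted as 30 days each as in the patch */
} SketchInterval;

/*
 * Convert an interval to milliseconds. Returns false on overflow instead
 * of raising an error.
 */
bool
sketch_interval2ms(const SketchInterval *iv, int64_t *result)
{
    int64_t days = (int64_t) iv->month * 30 + iv->day;
    int64_t ms;

    /* days -> milliseconds, checking for overflow */
    if (__builtin_mul_overflow(days, MSECS_PER_DAY, &ms))
        return false;

    /* add the time portion, truncated from microseconds to milliseconds */
    if (__builtin_add_overflow(ms, iv->time_usec / 1000, &ms))
        return false;

    *result = ms;
    return true;
}
```

For example, an interval of '8h' (28800000000 microseconds in the time field) converts to 28800000 ms, and '1 month 2 days' converts to 32 days' worth of milliseconds.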
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 5e800dc79a..d9b4c8b7f0 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4546,9 +4547,14 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ {
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBufferStr(query, " 0 AS subminapplydelay\n");
+ }
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4582,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4613,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminapplydelay =
+ strtoi64(PQgetvalue(res, i, i_subminapplydelay), NULL, 10);
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4687,6 +4696,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '" INT64_FORMAT " ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e7cbd8d7ed..e2525f70ab 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int64 subminapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..8a27063bed 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min apply delay (ms)"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 23750ea5fb..19d2f90dc0 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3247,7 +3247,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..078495cbf0 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -70,6 +70,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
XLogRecPtr subskiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 subminapplydelay; /* Replication apply delay */
+
NameData subname; /* Name of the subscription */
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
@@ -120,6 +122,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 minapplydelay; /* Replication apply delay */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
bool enabled; /* Indicates if the subscription is enabled */
diff --git a/src/include/datatype/timestamp.h b/src/include/datatype/timestamp.h
index 21a37e21e9..8b368af299 100644
--- a/src/include/datatype/timestamp.h
+++ b/src/include/datatype/timestamp.h
@@ -127,6 +127,8 @@ struct pg_itm_in
#define SECS_PER_MINUTE 60
#define MINS_PER_HOUR 60
+#define MSECS_PER_DAY INT64CONST(86400000)
+
#define USECS_PER_DAY INT64CONST(86400000000)
#define USECS_PER_HOUR INT64CONST(3600000000)
#define USECS_PER_MINUTE INT64CONST(60000000)
diff --git a/src/include/replication/worker_internal.h b/src/include/replication/worker_internal.h
index db891eea8a..4bdb6b7de3 100644
--- a/src/include/replication/worker_internal.h
+++ b/src/include/replication/worker_internal.h
@@ -255,7 +255,7 @@ extern void stream_stop_internal(TransactionId xid);
/* Common streaming function to apply all the spooled messages */
extern void apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn);
+ XLogRecPtr lsn, TimestampTz ts);
extern void apply_dispatch(StringInfo s);
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index 42f802bb9d..534051fe13 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -102,6 +102,8 @@ extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern int64 interval2ms(const Interval *interval);
+
extern TimestampTz time_t_to_timestamptz(pg_time_t tm);
extern pg_time_t timestamptz_to_time_t(TimestampTz t);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 1ed6f4c39c..ff52daed3e 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -135,10 +135,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -155,10 +155,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -167,10 +167,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -202,10 +202,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -239,19 +239,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -263,27 +263,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -298,10 +298,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more then once
@@ -316,10 +316,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -355,10 +355,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -367,10 +367,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -380,10 +380,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -396,20 +396,61 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid input syntax for type interval: "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_apply_delay"
+-- fail - utilizing streaming = parallel with time delayed replication is not supported.
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = '1 day');
+ERROR: min_apply_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- success -- interval is converted into ms and stored as integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '4h 27min 35s');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 16055000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- fail - alter subscription with streaming = parallel failed when time delayed replication is set.
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot set streaming = parallel for subscription with min_apply_delay > 0
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail - alter subscription with min_apply_delay failed when streaming = parallel is set.
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ERROR: cannot set min_apply_delay > 0 for subscription with streaming = parallel
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7991abfe8f..80f4966797 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -279,6 +279,31 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+-- fail - utilizing streaming = parallel with time delayed replication is not supported.
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = '1 day');
+
+-- success -- value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+\dRs+
+
+-- success -- interval is converted into ms and stored as integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '4h 27min 35s');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel failed when time delayed replication is set.
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
+-- fail - alter subscription with min_apply_delay failed when streaming = parallel is set.
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/meson.build b/src/test/subscription/meson.build
index 3db0fdfd96..a186876eb4 100644
--- a/src/test/subscription/meson.build
+++ b/src/test/subscription/meson.build
@@ -38,6 +38,7 @@ tests += {
't/029_on_error.pl',
't/030_origin.pl',
't/031_column_list.pl',
+ 't/032_apply_delay.pl',
't/100_bugs.pl',
],
},
diff --git a/src/test/subscription/t/032_apply_delay.pl b/src/test/subscription/t/032_apply_delay.pl
new file mode 100644
index 0000000000..8f8ce23f1b
--- /dev/null
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,151 @@
+
+# Copyright (c) 2022, PostgreSQL Global Development Group
+
+# Test replication apply delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf('postgresql.conf',
+ "log_min_messages = debug2");
+$node_subscriber->start;
+
+# Create some preexisting content on publisher.
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b varchar, c timestamptz DEFAULT now())");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (1, 'foo'), (2, 'bar')");
+
+# Setup structure on subscriber.
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b text, c timestamptz DEFAULT now(), d bigint DEFAULT 999)"
+);
+
+# Setup logical replication.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+
+# column c must not be published because we want to compare the time difference.
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE test_tab (a, b)");
+
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = on, min_apply_delay = '3s')"
+);
+
+# Wait for initial table sync to finish.
+$node_subscriber->wait_for_subscription_sync($node_publisher, $appname);
+
+# Check log starting now for logical replication apply delay.
+my $log_location = -s $node_subscriber->logfile;
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(2|1|2), 'check initial data was copied to subscriber');
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (3, 'baz')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (4, 'abc')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (5, 'def')");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(5|1|5), 'check if the new rows were applied to subscriber');
+
+check_apply_delay_log("logical replication apply delay");
+
+check_apply_delay_time('5', '3');
+
+# Test streamed transaction.
+# Insert, update and delete enough rows to exceed 64kB limit.
+$node_publisher->safe_psql(
+ 'postgres', q{
+BEGIN;
+INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(6, 5000) s(i);
+UPDATE test_tab SET b = md5(b) WHERE mod(a, 2) = 0;
+DELETE FROM test_tab WHERE mod(a, 3) = 0;
+COMMIT;
+});
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(3334|1|5000), 'check if the new rows were applied to subscriber');
+
+check_apply_delay_log("logical replication apply delay");
+
+check_apply_delay_time('5000', '3');
+
+# Test ALTER SUBSCRIPTION. Delay 86460 seconds (1 day 1 minute).
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86460000)"
+);
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (0, 'foobar')");
+
+# Disable subscription. worker should die immediately.
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub DISABLE;"
+);
+
+# Wait until worker dies.
+my $sub_query =
+ "SELECT count(1) = 0 FROM pg_stat_subscription WHERE subname = 'tap_sub' AND pid IS NOT NULL;";
+$node_subscriber->poll_query_until('postgres', $sub_query)
+ or die "Timed out while waiting for subscriber to die";
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
+
+sub check_apply_delay_log
+{
+ my $message = shift;
+ my $old_log_location = $log_location;
+
+ $log_location = $node_subscriber->wait_for_log(qr/$message/, $log_location);
+
+ cmp_ok($log_location, '>', $old_log_location,
+ "logfile contains triggered logical replication apply delay"
+ );
+}
+
+sub check_apply_delay_time
+{
+ my ($primary_key, $expected_diffs) = @_;
+
+ my $inserted_time_on_pub = $node_publisher->safe_psql('postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ my $inserted_time_on_sub = $node_subscriber->safe_psql('postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ cmp_ok($inserted_time_on_sub - $inserted_time_on_pub, '>', $expected_diffs,
+ "The tuple on the subscriber was modified later than the publisher");
+}
--
2.27.0
2.
I think users can set 'wal_receiver_status_interval' to 0 or to more
than 'wal_sender_timeout'. But is this a frequent use-case scenario, or
do we see DBAs setting these in such a way by mistake? If so, then I
think it is better to give a warning message when a user
tries to create or alter a subscription with a large 'min_apply_delay'
(>= 'wal_sender_timeout'), rather than leaving it to the user's
understanding that the walsender may repeatedly time out in such a case.
parse_subscription_options and AlterSubscription can be modified to
log a warning. Any thoughts?
Yes, DBAs may set wal_receiver_status_interval to more than wal_sender_timeout by
mistake.
But to handle this scenario we would have to compare min_apply_delay *on the subscriber*
with wal_sender_timeout *on the publisher*. Neither value is transferred to the other
side, so the WARNING cannot be raised. A mechanism to exchange them seemed too
complex to me. The discussion around [1] may be useful.
[1]: /messages/by-id/CAA4eK1Lq+h8qo+rqGU-E+hwJKAHYocV54y4pvou4rLysCgYD-g@mail.gmail.com
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Dear Vignesh,
Thanks for reviewing!
1) Comment inconsistency across create and alter subscription, better
to keep it same:
The comment for CREATE SUBSCRIPTION is now the same as the one for ALTER.
2) ereport inconsistency, braces around errcode is present in few
places and not present in few places, it is better to keep it
consistent by removing it:
Removed.
3) this include is not required, I could compile without it
Removed. Timestamp datatype is not used in subscriptioncmds.c.
4)
4.a)
Should this be changed:
/* Adds portion time (in ms) to the previous result. */
to
/* Adds portion time (in ms) to the previous result */
Changed.
4.b)
Should this be changed:
/* Detect whether the value of interval can cause an overflow. */
to
/* Detect whether the value of interval can cause an overflow */
Changed.
5) Can this "ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay =
'1d')" be combined along with "-- success -- 123 ms", that way few
statements could be reduced
6) Can we do the interval testing along with alter subscription and
combined with "-- success -- 123 ms" test, that way few statements
could be reduced
To keep the code coverage, one of them must remain. 5) was removed cleanly and
6) was combined as you suggested. In addition, comments were updated to clarify
the test case.
Please have a look at the latest patch v14 in [1].
[1]: /messages/by-id/TYAPR01MB5866D0527B1B8D589F1C2551F5FC9@TYAPR01MB5866.jpnprd01.prod.outlook.com
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
At Wed, 11 Jan 2023 12:46:24 +0000, "Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com> wrote in
them. Which version is better?
Some comments by a quick look, different from the above.
+ CONNECTION 'host=192.168.1.50 port=5432 user=foo dbname=foodb'
I understand that we (not PG people, but IT people) are supposed to
use in documents a certain set of special addresses that is guaranteed
not to be routed in the field.
TEST-NET-1 : 192.0.2.0/24
TEST-NET-2 : 198.51.100.0/24
TEST-NET-3 : 203.0.113.0/24
(I found 192.83.123.89 in the postgres_fdw doc, but it'd be another issue..)
+ if (strspn(tmp, "-0123456789 ") == strlen(tmp))
Do we need to bother spending another memory block for apparent
non-digits here?
+ errmsg(INT64_FORMAT " ms is outside the valid range for parameter \"%s\"",
We don't use INT64_FORMAT in translatable message strings. Cast then
use %lld instead.
This message looks unfriendly as it doesn't suggest the valid range,
and it shows the input value in a different unit from what was in the
input. I think we can spell it as "\"%s\" is outside the valid range
for subscription parameter \"%s\" (0 .. <INT_MAX> in milliseconds)"
+ int64 min_apply_delay;
..
+ if (ms < 0 || ms > INT_MAX)
Why is the variable wider than required?
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
Mmm. Couldn't we refuse 0 as min_apply_delay?
+ sub->minapplydelay > 0)
...
+ if (opts.min_apply_delay > 0 &&
Is there any reason for the differentiation?
+ errmsg("cannot set %s for subscription with %s",
+ "streaming = parallel", "min_apply_delay > 0"));
I think that this should read more like natural language. Say, "cannot
enable min_apply_delay for subscription in parallel streaming mode" or
something.. The same is applicable to the nearby message.
+static void maybe_delay_apply(TimestampTz ts);
apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn)
+ XLogRecPtr lsn, TimestampTz ts)
"ts" looks too generic. Couldn't it be more specific?
We need an explanation for the parameter in the function comment.
+ if (!am_parallel_apply_worker())
+ {
+ Assert(ts > 0);
+ maybe_delay_apply(ts);
It seems to me better that the if condition and assertion are checked
inside maybe_delay_apply().
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Wed, Jan 11, 2023 at 6:16 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Dear Shveta,
Thanks for reviewing! PSA new version.
1.
+ errmsg("min_apply_delay must not be set when streaming = parallel")));
we give the same error msg for both the cases:
a. when subscription is created with streaming=parallel but we are
trying to alter subscription to set min_apply_delay >0
b. when subscription is created with some min_apply_delay and we are
trying to alter subscription to make it streaming=parallel.
For case a, error msg looks fine but for case b, I think error msg
should be changed slightly.
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
ERROR: min_apply_delay must not be set when streaming = parallel
This gives the feeling that we are trying to modify min_apply_delay
but we are not. Maybe we can change it to:
"subscription with min_apply_delay must not be allowed to stream
parallel" (or something better)
Your point that error messages are strange is right. And while
checking other ones, I found they have very similar styles. Therefore I reworded
ERROR messages in AlterSubscription() and parse_subscription_options() to follow
them. Which version is better?
v14 one looks much better. Thanks!
thanks
Shveta
On Wed, Jan 11, 2023 at 6:16 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
2.
I think users can set 'wal_receiver_status_interval' to 0 or to more
than 'wal_sender_timeout'. But is this a frequent use-case scenario, or
do we see DBAs setting these in such a way by mistake? If so, then I
think it is better to give a warning message when a user
tries to create or alter a subscription with a large 'min_apply_delay'
(>= 'wal_sender_timeout'), rather than leaving it to the user's
understanding that the walsender may repeatedly time out in such a case.
parse_subscription_options and AlterSubscription can be modified to
log a warning. Any thoughts?
Yes, DBAs may set wal_receiver_status_interval to more than wal_sender_timeout by
mistake.
But to handle this scenario we would have to compare min_apply_delay *on the subscriber*
with wal_sender_timeout *on the publisher*. Neither value is transferred to the other
side, so the WARNING cannot be raised. A mechanism to exchange them seemed too
complex to me. The discussion around [1] may be useful.
[1]: /messages/by-id/CAA4eK1Lq+h8qo+rqGU-E+hwJKAHYocV54y4pvou4rLysCgYD-g@mail.gmail.com
okay, I see. So even when 'wal_receiver_status_interval' is set to 0,
no log/warning is needed when the user tries to set min_apply_delay>0?
Are we good with doc alone?
One trivial correction in config.sgml:
+ terminates due to the timeout errors. Hence, make sure this parameter
+ shorter than the <literal>wal_sender_timeout</literal> of the publisher.
Hence, make sure this parameter is shorter... <is missing>
thanks
Shveta
Hi,
I've a question about 032_apply_delay.pl.
+# Test ALTER SUBSCRIPTION. Delay 86460 seconds (1 day 1 minute).
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86460000)"
+);
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (0, 'foobar')");
+
I couldn't quite see how these lines test whether ALTER SUBSCRIPTION
successfully worked.
Don't we need to check that min_apply_delay really changed as a result?
But also I see that subscription.sql already tests this ALTER SUBSCRIPTION
behaviour.
Best,
--
Melih Mutlu
Microsoft
On Thursday, January 12, 2023 12:04 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
At Wed, 11 Jan 2023 12:46:24 +0000, "Hayato Kuroda (Fujitsu)"
<kuroda.hayato@fujitsu.com> wrote in
them. Which version is better?
Some comments by a quick look, different from the above.
Horiguchi-san, thanks for your review !
+ CONNECTION 'host=192.168.1.50 port=5432 user=foo dbname=foodb'
I understand that we (not PG people, but IT people) are supposed to use in
documents a certain set of special addresses that is guaranteed not to be
routed in the field.
TEST-NET-1 : 192.0.2.0/24
TEST-NET-2 : 198.51.100.0/24
TEST-NET-3 : 203.0.113.0/24
(I found 192.83.123.89 in the postgres_fdw doc, but it'd be another issue..)
Fixed. If necessary we can create another thread for this.
+ if (strspn(tmp, "-0123456789 ") == strlen(tmp))
Do we need to bother spending another memory block for apparent non-digits
here?
Yes. The characters are necessary to handle an issue reported in [1].
The issue occurred when the user entered a negative value: the lengths
returned by strspn and strlen then differed, and the input, which had
no unit, was interpreted as seconds.
This led to a wrong error message for the user.
Adding those characters to the accepted set solves the issue.
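To illustrate the point, here is a minimal standalone sketch of the comparison. The helper names are hypothetical and not taken from the patch, and the digits-only variant is only my assumption of what a narrower accepted set would look like:

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): returns true when the input
 * consists only of digits, minus signs and spaces, i.e. when no unit
 * such as "s" or "min" was written.
 */
static bool
is_unitless_number(const char *input)
{
    /* strspn() returns the length of the leading run of accepted chars. */
    return strspn(input, "-0123456789 ") == strlen(input);
}

/* A narrower, digits-only set, assumed here only for comparison. */
static bool
is_unitless_number_digits_only(const char *input)
{
    return strspn(input, "0123456789") == strlen(input);
}
```

With the digits-only set, "-1" is not recognized as a bare number, which is the kind of mismatch between strspn and strlen described above. With the wider set it is recognized, while an input such as '4h 27min 35s' is still treated as carrying units.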
+ errmsg(INT64_FORMAT " ms is outside the valid range for parameter \"%s\"",
We don't use INT64_FORMAT in translatable message strings. Cast then
use %lld instead.
Thanks for teaching us. Fixed.
This message looks unfriendly as it doesn't suggest the valid range, and it
shows the input value in a different unit from what was in the input. I think we
can spell it as "\"%s\" is outside the valid range for subscription parameter
\"%s\" (0 .. <INT_MAX> in milliseconds)"
Makes sense. I added the valid range, in a format aligned with that of recovery_min_apply_delay.
FYI, the physical replication GUC doesn't write the units for the range, as shown below.
I followed this style.
---
LOG: -1 ms is outside the valid range for parameter "recovery_min_apply_delay" (0 .. 2147483647)
FATAL: configuration file "/home/k5user/new/pg/l/make_v15/slave/postgresql.conf" contains errors
---
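For reference, a rough standalone sketch of a range check that produces a message in this style. The function name and the use of snprintf are only for illustration (the patch reports the error through ereport/errmsg), and I assume a 32-bit int so that INT_MAX is 2147483647:

```c
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the range check: accept only
 * 0 .. INT_MAX milliseconds. On failure, write a message that includes
 * the valid range, in the style of the recovery_min_apply_delay message.
 */
static bool
check_delay_ms(long long ms, const char *param, char *errbuf, size_t errlen)
{
    if (ms < 0 || ms > INT_MAX)
    {
        /* %lld with a long long argument, rather than INT64_FORMAT. */
        snprintf(errbuf, errlen,
                 "%lld ms is outside the valid range for parameter \"%s\" (%d .. %d)",
                 ms, param, 0, INT_MAX);
        return false;
    }
    errbuf[0] = '\0';
    return true;
}
```

Passing the value as long long and printing it with %lld keeps the format string free of platform-dependent macros, which is the concern raised about translatable messages.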
+ int64 min_apply_delay;
..
+ if (ms < 0 || ms > INT_MAX)
Why is the variable wider than required?
You are right. Fixed.
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
Mmm. Couldn't we refuse 0 as min_apply_delay?
Sorry, the previous patch's behavior wasn't consistent with this error message.
In the previous patch, if an ALTER SUBSCRIPTION set streaming = parallel and
min_apply_delay = 0 (changed from a positive value) at the same time,
the command failed, although it should succeed according to the specification
of the time-delayed feature.
We fixed this by adding some more checks in AlterSubscription().
By the way, users should be able to change min_apply_delay to 0
from any other value whenever they want, so we did not restrict
this kind of operation.
+ sub->minapplydelay > 0)
...
+ if (opts.min_apply_delay > 0 &&
Is there any reason for the differentiation?
Yes. The former refers to the configuration of the existing subscription.
For example, if we alter a subscription created with min_apply_delay = '1 day'
by setting streaming = 'parallel', we
need to reject the alter command. The latter refers to the new settings given in the command.
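A simplified sketch of that distinction, using made-up structures rather than the patch's actual Subscription and SubOpts types: the check looks at the value each option will have after the command, taking the new setting when one is given and the stored one otherwise.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the patch's structures. */
typedef struct ExistingSub
{
    int  minapplydelay;      /* stored delay in ms */
    bool streaming_parallel; /* is streaming = parallel stored? */
} ExistingSub;

typedef struct AlterOpts
{
    bool delay_given;        /* min_apply_delay specified in the command? */
    int  min_apply_delay;    /* new delay in ms, valid if delay_given */
    bool streaming_given;    /* streaming specified in the command? */
    bool streaming_parallel; /* new value, valid if streaming_given */
} AlterOpts;

/* True if the ALTER would combine a positive delay with parallel streaming. */
static bool
alter_conflicts(const ExistingSub *sub, const AlterOpts *opts)
{
    /* Effective values after the command: new setting if given, else stored. */
    int  delay = opts->delay_given ? opts->min_apply_delay : sub->minapplydelay;
    bool parallel = opts->streaming_given ? opts->streaming_parallel
                                          : sub->streaming_parallel;

    return delay > 0 && parallel;
}
```

Under this sketch, setting streaming = parallel on a subscription with a one-day delay is rejected, while setting streaming = parallel and min_apply_delay = 0 in the same command is accepted, which matches the behavior described above.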
+ errmsg("cannot set %s for subscription with %s",
+ "streaming = parallel", "min_apply_delay > 0"));
I think that this should read more like natural language. Say, "cannot enable
min_apply_delay for subscription in parallel streaming mode" or something..
The same is applicable to the nearby message.
Reworded the error messages. Please check.
+static void maybe_delay_apply(TimestampTz ts);
apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn)
+ XLogRecPtr lsn, TimestampTz ts)
"ts" looks too generic. Couldn't it be more specific?
We need an explanation for the parameter in the function comment.
Changed it to finish_ts, since it indicates commit/prepare time.
This terminology should be aligned with finish lsn.
+ if (!am_parallel_apply_worker())
+ {
+ Assert(ts > 0);
+ maybe_delay_apply(ts);
It seems to me better that the if condition and assertion are checked inside
maybe_delay_apply().
Fixed.
[1]: /messages/by-id/CALDaNm3Bpzhh60nU-keuGxMPb-OhcqsfpCN3ysfCfCJ-2ShYPA@mail.gmail.com
Best Regards,
Takamichi Osumi
Attachment: v15-0001-Time-delayed-logical-replication-subscriber.patch (application/octet-stream)
From aec7ec458e22f1a75e06789a1b36b44191d6b760 Mon Sep 17 00:00:00 2001
From: Takamichi Osumi <osumi.takamichi@fujitsu.com>
Date: Thu, 12 Jan 2023 15:05:59 +0000
Subject: [PATCH v15] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction commit for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs only on WAL records for transaction begins. The main
reason is to avoid keeping a transaction open for a long time. Regular
and prepared transactions are covered. Streamed transactions are also
covered.
Using this feature with parallel apply feature is prohibited.
Author: Euler Taveira
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/config.sgml | 12 ++
doc/src/sgml/logical-replication.sgml | 7 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 59 +++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 92 ++++++++-
.../replication/logical/applyparallelworker.c | 3 +-
src/backend/replication/logical/worker.c | 166 ++++++++++++++--
src/backend/utils/adt/timestamp.c | 29 +++
src/bin/pg_dump/pg_dump.c | 16 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/datatype/timestamp.h | 2 +
src/include/replication/worker_internal.h | 2 +-
src/include/utils/timestamp.h | 2 +
src/test/regress/expected/subscription.out | 185 +++++++++++-------
src/test/regress/sql/subscription.sql | 25 +++
src/test/subscription/meson.build | 1 +
src/test/subscription/t/032_apply_delay.pl | 170 ++++++++++++++++
23 files changed, 708 insertions(+), 102 deletions(-)
create mode 100644 src/test/subscription/t/032_apply_delay.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..bf3c05241c 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7873,6 +7873,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int8</type>
+ </para>
+ <para>
+ The length of time (ms) to delay the application of changes.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 77574e2d4e..89cdc0a75b 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4753,6 +4753,18 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
the <filename>postgresql.conf</filename> file or on the server
command line.
</para>
+ <para>
+ For time-delayed logical replication, the apply worker sends a Standby
+ Status Update message to the publisher at the interval specified by
+ this parameter. Therefore, if this parameter is longer than
+ <literal>wal_sender_timeout</literal> on the publisher, the
+ walsender receives no update message during the delay and repeatedly
+ terminates with timeout errors. Hence, make sure this parameter is
+ shorter than the <literal>wal_sender_timeout</literal> of the publisher.
+ If this parameter is set to zero with time-delayed replication, the
+ apply worker doesn't send any feedback messages during the
+ <literal>min_apply_delay</literal>.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 54f48be87f..6407804547 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,13 @@
target table.
</para>
+ <para>
+ The subscriber replication can be instructed to lag behind the publisher
+ side changes by specifying the <literal>min_apply_delay</literal>
+ subscription parameter. See <xref linkend="sql-createsubscription"/> for
+ details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 1e8d72062b..d63aff1b90 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index eba72c6af6..dc330e8db7 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,47 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. As
+ with the physical replication feature
+ (<xref linkend="guc-recovery-min-apply-delay"/>), it can be useful to
+ have a time-delayed logical replica. This parameter lets the user
+ delay the application of changes by a specified amount of time. If this
+ value is specified without units, it is taken as milliseconds. The
+ default is zero (no delay).
+ </para>
+ <para>
+ The delay occurs only on WAL records for transaction begins and after
+ the initial table synchronization. It is possible that the
+ replication delay between publisher and subscriber exceeds the value
+ of this parameter, in which case no delay is added. Note that the
+ delay is calculated between the WAL time stamp as written on
+ publisher and the current time on the subscriber. Time spent in logical
+ decoding and in transferring the transaction may reduce the actual wait
+ time. If the system clocks on publisher and subscriber are not
+ synchronized, changes may be applied earlier than expected,
+ but this is not a major issue because this parameter is typically much
+ larger than the time deviations between servers. Note that if this
+ parameter is set to a long delay, the replication will stop if the
+ replication slot falls behind the current LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication can mean there is a much longer time between making
+ a change on the publisher, and that change being committed on the subscriber.
+ This can have a big impact on synchronous replication.
+ See <xref linkend="guc-synchronous-commit"/>.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -413,6 +453,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ Setting streaming to <literal>parallel</literal> mode and <literal>min_apply_delay</literal>
+ at the same time is not supported.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
@@ -472,6 +517,18 @@ CREATE SUBSCRIPTION mysub
PUBLICATION insert_only
WITH (enabled = false);
</programlisting></para>
+
+ <para>
+ Create a subscription to a remote server that replicates tables in
+ the <literal>mypub</literal> publication and starts replicating immediately
+ on commit. Pre-existing data is not copied. The application of changes is
+ delayed by 4 hours.
+<programlisting>
+CREATE SUBSCRIPTION mysub
+ CONNECTION 'host=192.0.2.4 port=5432 user=foo dbname=foodb'
+ PUBLICATION mypub
+ WITH (copy_data = false, min_apply_delay = '4h');
+</programlisting></para>
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..c767cc1c3a 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -64,6 +64,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->oid = subid;
sub->dbid = subform->subdbid;
sub->skiplsn = subform->subskiplsn;
+ sub->minapplydelay = subform->subminapplydelay;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
sub->enabled = subform->subenabled;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 447c9b970f..4004fcd0c4 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1299,9 +1299,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index baff00dd74..b0ddd530e3 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,43 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ char *val,
+ *tmp;
+ Interval *interval;
+ int64 ms;
+
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ tmp = defGetString(defel);
+
+ /*
+ * If no unit was specified, then explicitly add 'ms' otherwise
+ * the interval_in function would assume 'seconds'.
+ */
+ if (strspn(tmp, "-0123456789 ") == strlen(tmp))
+ val = psprintf("%sms", tmp);
+ else
+ val = tmp;
+
+ interval = DatumGetIntervalP(DirectFunctionCall3(interval_in,
+ CStringGetDatum(val),
+ ObjectIdGetDatum(InvalidOid),
+ Int32GetDatum(-1)));
+
+ ms = interval2ms(interval);
+ if (ms < 0 || ms > INT_MAX)
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("%lld ms is outside the valid range for parameter \"%s\" (0 .. %d)",
+ (long long) ms, "min_apply_delay", INT_MAX));
+
+ opts->min_apply_delay = ms;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +445,17 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /* Test the combination of streaming mode and min_apply_delay */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0)
+ {
+ if (opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
+ }
}
/*
@@ -560,7 +612,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -625,6 +678,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int64GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1054,7 +1108,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1152,20 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * Test the combination of streaming mode and
+ * min_apply_delay
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
+ {
+ if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay > 0) ||
+ (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s mode for subscription with %s",
+ "streaming = parallel", "min_apply_delay"));
+ }
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1179,26 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * Test the combination of streaming mode and
+ * min_apply_delay
+ */
+ if (opts.min_apply_delay > 0)
+ {
+ if ((IsSet(opts.specified_opts, SUBOPT_STREAMING) && opts.streaming == LOGICALREP_STREAM_PARALLEL) ||
+ (!IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s for subscription in %s mode",
+ "min_apply_delay", "streaming = parallel"));
+ }
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int64GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
diff --git a/src/backend/replication/logical/applyparallelworker.c b/src/backend/replication/logical/applyparallelworker.c
index 2e5914d5d9..72e6f5ce84 100644
--- a/src/backend/replication/logical/applyparallelworker.c
+++ b/src/backend/replication/logical/applyparallelworker.c
@@ -704,7 +704,8 @@ pa_process_spooled_messages_if_required(void)
{
apply_spooled_messages(&MyParallelShared->fileset,
MyParallelShared->xid,
- InvalidXLogRecPtr);
+ InvalidXLogRecPtr,
+ 0);
pa_set_fileset_state(MyParallelShared, FS_EMPTY);
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 79cda39445..62f72027ac 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -316,6 +316,17 @@ static List *on_commit_wakeup_workers_subids = NIL;
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
+/*
+ * In order to avoid walsender's timeout during time delayed replication,
+ * it's necessary to keep sending feedback messages during the delay from the
+ * worker process. Meanwhile, the feature delays the apply before starting the
+ * transaction and thus we don't write WALs for the suspended changes during
+ * the wait. Hence, in the case the worker process sends a feedback during the
+ * delay, avoid having positions of the flushed and apply LSN overwritten by
+ * the latest LSN.
+ */
+static XLogRecPtr last_received = InvalidXLogRecPtr;
+
/* fields valid only when processing streamed transaction */
static bool in_streamed_transaction = false;
@@ -386,10 +397,13 @@ static void stream_write_change(char action, StringInfo s);
static void stream_open_and_write_change(TransactionId xid, char action, StringInfo s);
static void stream_close_file(void);
-static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply);
+static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply,
+ bool in_delaying_apply);
static void DisableSubscriptionAndExit(void);
+static void maybe_delay_apply(TimestampTz finish_ts);
+
static void apply_handle_commit_internal(LogicalRepCommitData *commit_data);
static void apply_handle_insert_internal(ApplyExecutionData *edata,
ResultRelInfo *relinfo,
@@ -996,6 +1010,99 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When min_apply_delay parameter is set on the subscriber, we wait long enough
+ * to make sure a transaction is applied at least that interval behind the
+ * publisher.
+ *
+ * While the physical replication applies the delay at commit time, this
+ * feature applies the delay for the next transaction but before starting the
+ * transaction. This is mainly because keeping a transaction that conducted
+ * write operations open for a long time results in some issues such as bloat
+ * and locks.
+ *
+ * The min_apply_delay parameter will take effect only after all tables are in
+ * READY state.
+ *
+ * finish_ts is the commit/prepare time of both regular (non-streamed) and
+ * streamed transactions for time delayed replication.
+ */
+static void
+maybe_delay_apply(TimestampTz finish_ts)
+{
+ Assert(finish_ts > 0);
+
+ /* Nothing to do if no delay set */
+ if (MySubscription->minapplydelay <= 0)
+ return;
+
+ /*
+ * The min_apply_delay parameter is ignored until all tablesync workers
+ * have reached READY state. If we allow the delay during the catchup
+ * phase, once we reach the limit of tablesync workers, it will impose a
+ * delay for each subsequent worker. It means it will take a long time to
+ * finish the initial table synchronization.
+ */
+ if (!AllTablesyncsReady())
+ return;
+
+ while (true)
+ {
+ long diffms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /*
+ * Before calculating the time duration, reload the catalog if needed.
+ */
+ if (!in_remote_transaction && !in_streamed_transaction)
+ {
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+ }
+
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(),
+ TimestampTzPlusMilliseconds(finish_ts, MySubscription->minapplydelay));
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "logical replication apply delay: %ld ms", diffms);
+
+ /*
+ * Call send_feedback() to prevent the publisher from exiting by
+ * timeout during the delay, when wal_receiver_status_interval is
+ * available.
+ */
+ if (wal_receiver_status_interval > 0
+ && diffms > wal_receiver_status_interval * 1000)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ (long) wal_receiver_status_interval * 1000,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(last_received, true, false, true);
+ }
+ else
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ }
+}
+
/*
* Handle BEGIN message.
*/
@@ -1007,6 +1114,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ maybe_delay_apply(begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -1061,6 +1171,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ maybe_delay_apply(begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1308,7 +1421,8 @@ apply_handle_stream_prepare(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset,
- prepare_data.xid, prepare_data.prepare_lsn);
+ prepare_data.xid, prepare_data.prepare_lsn,
+ prepare_data.prepare_time);
/* Mark the transaction as prepared. */
apply_handle_prepare_internal(&prepare_data);
@@ -1998,10 +2112,13 @@ ensure_last_message(FileSet *stream_fileset, TransactionId xid, int fileno,
/*
* Common spoolfile processing.
+ *
+ * The commit/prepare time for streaming transaction is required to achieve time
+ * delayed replication.
*/
void
apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn)
+ XLogRecPtr lsn, TimestampTz finish_ts)
{
StringInfoData s2;
int nchanges;
@@ -2012,6 +2129,21 @@ apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
int fileno;
off_t offset;
+ /*
+ * Should we delay the current transaction?
+ *
+ * Unlike the regular (non-streamed) cases, the delay is applied in a
+ * STREAM COMMIT/STREAM PREPARE message for streamed transactions. The
+ * STREAM START message does not contain a commit/prepare time (it will be
+ * available when the in-progress transaction finishes). Hence, it's not
+ * appropriate to apply a delay at that time.
+ *
+ * It's not allowed to execute time delayed replication with parallel
+ * apply feature.
+ */
+ if (!am_parallel_apply_worker())
+ maybe_delay_apply(finish_ts);
+
if (!am_parallel_apply_worker())
maybe_start_skipping_changes(lsn);
@@ -2171,7 +2303,7 @@ apply_handle_stream_commit(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset, xid,
- commit_data.commit_lsn);
+ commit_data.commit_lsn, commit_data.committime);
apply_handle_commit_internal(&commit_data);
@@ -3444,7 +3576,7 @@ UpdateWorkerStats(XLogRecPtr last_lsn, TimestampTz send_time, bool reply)
* Apply main loop.
*/
static void
-LogicalRepApplyLoop(XLogRecPtr last_received)
+LogicalRepApplyLoop(void)
{
TimestampTz last_recv_timestamp = GetCurrentTimestamp();
bool ping_sent = false;
@@ -3565,7 +3697,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
if (last_received < end_lsn)
last_received = end_lsn;
- send_feedback(last_received, reply_requested, false);
+ send_feedback(last_received, reply_requested, false, false);
UpdateWorkerStats(last_received, timestamp, true);
}
/* other message types are purposefully ignored */
@@ -3578,7 +3710,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
/* confirm all writes so far */
- send_feedback(last_received, false, false);
+ send_feedback(last_received, false, false, false);
if (!in_remote_transaction && !in_streamed_transaction)
{
@@ -3675,7 +3807,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
}
- send_feedback(last_received, requestReply, requestReply);
+ send_feedback(last_received, requestReply, requestReply, false);
/*
* Force reporting to ensure long idle periods don't lead to
@@ -3705,7 +3837,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
* to send a response to avoid timeouts.
*/
static void
-send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
+send_feedback(XLogRecPtr recvpos, bool force, bool requestReply, bool in_delaying_apply)
{
static StringInfo reply_message = NULL;
static TimestampTz send_time = 0;
@@ -3735,8 +3867,15 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * During the delay of time delayed replication, do not tell the publisher
+ * that the received latest LSN is already applied and flushed at this
+ * stage, since we don't apply the transaction yet. If we do so, it leads
+ * to a wrong assumption of logical replication progress on the publisher
+ * side. Here, we just send a feedback message to avoid publisher's
+ * timeout during the delay.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && !in_delaying_apply)
flushpos = writepos = recvpos;
if (writepos < last_writepos)
@@ -4362,11 +4501,11 @@ start_table_sync(XLogRecPtr *origin_startpos, char **myslotname)
* of system resource error and are not repeatable.
*/
static void
-start_apply(XLogRecPtr origin_startpos)
+start_apply(void)
{
PG_TRY();
{
- LogicalRepApplyLoop(origin_startpos);
+ LogicalRepApplyLoop();
}
PG_CATCH();
{
@@ -4653,7 +4792,8 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- start_apply(origin_startpos);
+ last_received = origin_startpos;
+ start_apply();
proc_exit(0);
}
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index 928c330897..422e6ad0fa 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -2431,6 +2431,35 @@ interval_cmp_internal(const Interval *interval1, const Interval *interval2)
return int128_compare(span1, span2);
}
+/*
+ * Returns the number of milliseconds in the specified Interval.
+ */
+int64
+interval2ms(const Interval *interval)
+{
+ int64 days;
+ int64 ms;
+ int64 result;
+
+ days = interval->month * INT64CONST(30);
+ days += interval->day;
+
+ /* Detect whether the value of interval can cause an overflow */
+ if (pg_mul_s64_overflow(days, MSECS_PER_DAY, &result))
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range"));
+
+ /* Adds portion time (in ms) to the previous result */
+ ms = interval->time / INT64CONST(1000);
+ if (pg_add_s64_overflow(result, ms, &result))
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range"));
+
+ return result;
+}
+
Datum
interval_eq(PG_FUNCTION_ARGS)
{
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 2c0a969972..16d3b003fe 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4546,9 +4547,14 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ {
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBufferStr(query, " 0 AS subminapplydelay\n");
+ }
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4582,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4613,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminapplydelay =
+ strtoi64(PQgetvalue(res, i, i_subminapplydelay), NULL, 10);
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4687,6 +4696,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '" INT64_FORMAT " ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e7cbd8d7ed..e2525f70ab 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int64 subminapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..8a27063bed 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min apply delay (ms)"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 23750ea5fb..19d2f90dc0 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3247,7 +3247,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..078495cbf0 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -70,6 +70,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
XLogRecPtr subskiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 subminapplydelay; /* Replication apply delay */
+
NameData subname; /* Name of the subscription */
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
@@ -120,6 +122,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 minapplydelay; /* Replication apply delay */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
bool enabled; /* Indicates if the subscription is enabled */
diff --git a/src/include/datatype/timestamp.h b/src/include/datatype/timestamp.h
index 21a37e21e9..8b368af299 100644
--- a/src/include/datatype/timestamp.h
+++ b/src/include/datatype/timestamp.h
@@ -127,6 +127,8 @@ struct pg_itm_in
#define SECS_PER_MINUTE 60
#define MINS_PER_HOUR 60
+#define MSECS_PER_DAY INT64CONST(86400000)
+
#define USECS_PER_DAY INT64CONST(86400000000)
#define USECS_PER_HOUR INT64CONST(3600000000)
#define USECS_PER_MINUTE INT64CONST(60000000)
diff --git a/src/include/replication/worker_internal.h b/src/include/replication/worker_internal.h
index db891eea8a..4ef132d243 100644
--- a/src/include/replication/worker_internal.h
+++ b/src/include/replication/worker_internal.h
@@ -255,7 +255,7 @@ extern void stream_stop_internal(TransactionId xid);
/* Common streaming function to apply all the spooled messages */
extern void apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn);
+ XLogRecPtr lsn, TimestampTz finish_ts);
extern void apply_dispatch(StringInfo s);
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index 42f802bb9d..534051fe13 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -102,6 +102,8 @@ extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern int64 interval2ms(const Interval *interval);
+
extern TimestampTz time_t_to_timestamptz(pg_time_t tm);
extern pg_time_t timestamptz_to_time_t(TimestampTz t);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 1ed6f4c39c..7b16c80926 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -135,10 +135,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -155,10 +155,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -167,10 +167,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -202,10 +202,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -239,19 +239,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -263,27 +263,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -298,10 +298,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more then once
@@ -316,10 +316,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -355,10 +355,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -367,10 +367,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -380,10 +380,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -396,20 +396,61 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid input syntax for type interval: "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_apply_delay" (0 .. 2147483647)
+-- fail - utilizing streaming = parallel with time delayed replication is not supported.
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = '1 day');
+ERROR: min_apply_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- success -- interval is converted into ms and stored as integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '4h 27min 35s');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 16055000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- fail - alter subscription with streaming = parallel should fail when time delayed replication is set.
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot enable streaming = parallel mode for subscription with min_apply_delay
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set.
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ERROR: cannot enable min_apply_delay for subscription in streaming = parallel mode
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7991abfe8f..ba822d4c29 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -279,6 +279,31 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+-- fail - utilizing streaming = parallel with time delayed replication is not supported.
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = '1 day');
+
+-- success -- value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+\dRs+
+
+-- success -- interval is converted into ms and stored as integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '4h 27min 35s');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when time delayed replication is set.
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set.
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/meson.build b/src/test/subscription/meson.build
index 3db0fdfd96..a186876eb4 100644
--- a/src/test/subscription/meson.build
+++ b/src/test/subscription/meson.build
@@ -38,6 +38,7 @@ tests += {
't/029_on_error.pl',
't/030_origin.pl',
't/031_column_list.pl',
+ 't/032_apply_delay.pl',
't/100_bugs.pl',
],
},
diff --git a/src/test/subscription/t/032_apply_delay.pl b/src/test/subscription/t/032_apply_delay.pl
new file mode 100644
index 0000000000..b21de1ff63
--- /dev/null
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,170 @@
+
+# Copyright (c) 2022, PostgreSQL Global Development Group
+
+# Test replication apply delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf('postgresql.conf',
+ "log_min_messages = debug2");
+$node_subscriber->start;
+
+# Create some preexisting content on publisher.
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b varchar, c timestamptz DEFAULT now())");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (1, 'foo'), (2, 'bar')");
+
+# Setup structure on subscriber.
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b text, c timestamptz DEFAULT now(), d bigint DEFAULT 999)"
+);
+
+# Setup logical replication.
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+
+# column c must not be published because we want to compare the time difference.
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE test_tab (a, b)");
+
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = on, min_apply_delay = '3s')"
+);
+
+# Wait for initial table sync to finish.
+$node_subscriber->wait_for_subscription_sync($node_publisher, $appname);
+
+# Check log starting now for logical replication apply delay.
+my $log_location = -s $node_subscriber->logfile;
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(2|1|2), 'check initial data was copied to subscriber');
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (3, 'baz')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (4, 'abc')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (5, 'def')");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(5|1|5), 'check if the new rows were applied to subscriber');
+
+check_apply_delay_log("logical replication apply delay", "2000");
+
+check_apply_delay_time('5', '3');
+
+# Test streamed transaction.
+# Insert, update and delete enough rows to exceed 64kB limit.
+$node_publisher->safe_psql(
+ 'postgres', q{
+BEGIN;
+INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(6, 5000) s(i);
+UPDATE test_tab SET b = md5(b) WHERE mod(a, 2) = 0;
+DELETE FROM test_tab WHERE mod(a, 3) = 0;
+COMMIT;
+});
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(3334|1|5000), 'check if the new rows were applied to subscriber');
+
+check_apply_delay_log("logical replication apply delay", "2000");
+
+check_apply_delay_time('5000', '3');
+
+# Test ALTER SUBSCRIPTION. Delay 86460 seconds (1 day 1 minute).
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86460000)"
+);
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (0, 'foobar')");
+
+check_apply_delay_log("logical replication apply delay", "80000000");
+
+# Disable subscription. worker should die immediately.
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub DISABLE;"
+);
+
+# Wait until worker dies.
+my $sub_query =
+ "SELECT count(1) = 0 FROM pg_stat_subscription WHERE subname = 'tap_sub' AND pid IS NOT NULL;";
+$node_subscriber->poll_query_until('postgres', $sub_query)
+ or die "Timed out while waiting for subscriber to die";
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
+
+sub check_apply_delay_log
+{
+ my ($message, $expected) = @_;
+ $expected = 0 unless defined $expected;
+
+ my $old_log_location = $log_location;
+
+ $log_location = $node_subscriber->wait_for_log(qr/$message/, $log_location);
+
+ cmp_ok($log_location, '>', $old_log_location,
+ "logfile contains triggered logical replication apply delay"
+ );
+
+ if ($expected > 0)
+ {
+ # Get the delay time in the server log.
+ my $contents = slurp_file($node_subscriber->logfile, $old_log_location);
+ $contents =~
+ qr/logical replication apply delay: (\d+) ms/
+ or die "could not get time";
+ my $logged_delay = $1;
+
+ # Is it larger than expected ?
+ cmp_ok($logged_delay, '>', $expected,
+ "The wait time of the apply worker is long enough expectedly"
+ );
+ }
+}
+
+sub check_apply_delay_time
+{
+ my ($primary_key, $expected_diffs) = @_;
+
+ my $inserted_time_on_pub = $node_publisher->safe_psql('postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ my $inserted_time_on_sub = $node_subscriber->safe_psql('postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ cmp_ok($inserted_time_on_sub - $inserted_time_on_pub, '>', $expected_diffs,
+ "The tuple on the subscriber was modified later than the publisher");
+}
--
2.30.0
Hi, Shveta
Thanks for your comments!
On Thursday, January 12, 2023 6:51 PM shveta malik <shveta.malik@gmail.com> wrote:
Yes, DBAs may set wal_receiver_status_interval to more than
wal_sender_timeout by mistake.
But to handle the scenario we must compare min_apply_delay *on
subscriber* with wal_sender_timeout *on publisher*. Neither value is
transferred to the opposite side, so the WARNING cannot be raised. I
considered such a mechanism too complex. The discussion around [1] may be useful.
[1]: /messages/by-id/CAA4eK1Lq+h8qo+rqGU-E+hwJKAHYocV54y4pvou4rLysCgYD-g%40mail.gmail.com
okay, I see. So even when 'wal_receiver_status_interval' is set to 0, no
log/warning is needed when the user tries to set min_apply_delay>0?
Are we good with doc alone?
Yes. As far as I can remember, we don't emit log or warning
for some kind of combination of those parameters (in the context
of timeout too). So, it should be fine.
One trivial correction in config.sgml:
+ terminates due to the timeout errors. Hence, make sure this parameter
+ shorter than the <literal>wal_sender_timeout</literal> of the publisher.
Hence, make sure this parameter is shorter... <is missing>
Fixed.
Kindly have a look at the latest patch shared in [1].
[1]: /messages/by-id/TYCPR01MB83739C6133B50DDA8BAD1601EDFD9@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
Hi, Melih
On Thursday, January 12, 2023 10:12 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
I've a question about 032_apply_delay.pl.
...
I couldn't quite see how these lines test whether ALTER SUBSCRIPTION successfully worked.
Don't we need to check that min_apply_delay really changed as a result?
Yeah, we should check it from the POV of apply worker's debug logs.
The latest patch posted in [1] addressed your concern,
by checking the logged delay time in the server log.
I'd say what we could do is to check the logged time is long enough
after the ALTER SUBSCRIPTION command.
Please have a look at the patch.
[1]: /messages/by-id/TYCPR01MB83739C6133B50DDA8BAD1601EDFD9@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
On Thu, 12 Jan 2023 at 21:09, Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
On Thursday, January 12, 2023 12:04 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
At Wed, 11 Jan 2023 12:46:24 +0000, "Hayato Kuroda (Fujitsu)"
<kuroda.hayato@fujitsu.com> wrote in
them. Which version is better?
Some comments by a quick look, different from the above.
Horiguchi-san, thanks for your review !
+ CONNECTION 'host=192.168.1.50 port=5432 user=foo dbname=foodb'
I understand that we (not PG people, but IT people) are supposed to use in
documents a certain set of special addresses that is guaranteed not to be
routed in the field.
TEST-NET-1 : 192.0.2.0/24
TEST-NET-2 : 198.51.100.0/24
TEST-NET-3 : 203.0.113.0/24
(I found 192.83.123.89 in the postgres_fdw doc, but it'd be another issue..)
Fixed. If necessary we can create another thread for this.
+ if (strspn(tmp, "-0123456789 ") == strlen(tmp))
Do we need to bother spending another memory block for apparent non-digits
here?
Yes. The characters are necessary to handle an issue reported in [1].
The issue happened when the user input a negative value without a unit:
the strspn and strlen results then differed, so no unit was added and
the input value was interpreted as seconds. This led to a wrong error
message for the user.
Adding those characters solves the issue.
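As an illustration of the check discussed above, here is a minimal standalone sketch. It is not the patch code: the helper names delay_has_no_unit and normalize_delay are hypothetical, and snprintf stands in for the backend's psprintf. It only shows why '-' and ' ' belong in the accepted character set.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * True when the option value has no unit, i.e. it consists only of digits,
 * spaces and '-'. Accepting '-' means a negative, unitless value such as
 * "-1" is detected as unitless too. (Hypothetical helper, not in the patch.)
 */
bool
delay_has_no_unit(const char *input)
{
    return strspn(input, "-0123456789 ") == strlen(input);
}

/*
 * Copy the value into out, appending "ms" when no unit was given, so that a
 * later interval parser does not fall back to seconds.
 */
void
normalize_delay(const char *input, char *out, size_t outlen)
{
    if (delay_has_no_unit(input))
        snprintf(out, outlen, "%sms", input);
    else
        snprintf(out, outlen, "%s", input);
}
```

With '-' missing from the set, "-1" would fail the test, stay unitless, and be read as -1 second, which is how the misleading message in the report could arise.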
+ errmsg(INT64_FORMAT " ms is outside the valid range for parameter
+\"%s\"",
We don't use INT64_FORMAT in translatable message strings. Cast then
use %lld instead.
Thanks for teaching us. Fixed.
This message looks unfriendly as it doesn't suggest the valid range, and it
shows the input value in a different unit from what was in the input. I think we
can spell it as "\"%s\" is outside the valid range for subscription parameter
\"%s\" (0 .. <INT_MAX> in millisecond)"
Makes sense. I incorporated the valid range, following the format used for recovery_min_apply_delay.
FYI, the physical replication GUC doesn't write the units for the range, as shown below.
I followed and applied this style.
---
LOG: -1 ms is outside the valid range for parameter "recovery_min_apply_delay" (0 .. 2147483647)
FATAL: configuration file "/home/k5user/new/pg/l/make_v15/slave/postgresql.conf" contains errors
---
+ int64 min_apply_delay;
..
+ if (ms < 0 || ms > INT_MAX)
Why is the variable wider than required?
You are right. Fixed.
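The range check and message format discussed above can be sketched outside the backend as follows. This is only an illustration under stated assumptions: check_min_apply_delay_ms is a hypothetical name, and a caller-supplied buffer replaces ereport/errmsg. The message text follows the format quoted in this thread. Since any accepted value fits in 0 .. INT_MAX, a plain int is enough to store it afterwards.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Validate a delay given in milliseconds. On failure, write a message in the
 * same style as recovery_min_apply_delay into errbuf and return false.
 * (Hypothetical helper; the real patch reports the error via ereport.)
 */
bool
check_min_apply_delay_ms(long long ms, char *errbuf, size_t errlen)
{
    if (ms < 0 || ms > INT_MAX)
    {
        /* %lld with a long long argument, instead of INT64_FORMAT */
        snprintf(errbuf, errlen,
                 "%lld ms is outside the valid range for parameter \"%s\" (0 .. %d)",
                 ms, "min_apply_delay", INT_MAX);
        return false;
    }

    errbuf[0] = '\0';
    return true;
}
```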
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
Mmm. Couldn't we refuse 0 as min_apply_delay?
Sorry, the previous patch's behavior wasn't consistent with this error message.
In the previous patch, if we ran ALTER SUBSCRIPTION
with streaming = parallel and min_apply_delay = 0 (changed from a positive value) at the same time,
the command failed, although it should succeed according to the specification of this time-delayed feature.
We fixed this part by adding some more checks in AlterSubscription().
By the way, we should allow users to change min_apply_delay to 0
from any other value whenever they want. So we didn't restrict
this kind of operation.
+ sub->minapplydelay > 0)
...
+ if (opts.min_apply_delay > 0 &&
Is there any reason for the differentiation?
Yes. The former refers to the existing subscription configuration.
For example, if we alter a subscription to set streaming = 'parallel'
when it was created with min_apply_delay = '1 day', we
need to reject the alter command. The latter refers to the new settings.
+ errmsg("cannot set %s for subscription with %s",
+ "streaming = parallel", "min_apply_delay > 0"));
I think that this should be more like human-speaking. Say, "cannot enable
min_apply_delay for subscription in parallel streaming mode" or something..
The same is applicable to the nearby message.
Reworded the error messages. Please check.
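The rule described above (combine the existing subscription settings with the newly specified ones, then reject only if parallel streaming and a positive delay would both be in effect) can be sketched as below. This is a simplified model, not the AlterSubscription() code: the AlterRequest struct and alter_is_rejected are hypothetical names introduced for illustration.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of one ALTER SUBSCRIPTION ... SET request. */
typedef struct AlterRequest
{
    bool streaming_specified;   /* was "streaming" given in the command? */
    bool streaming_parallel;    /* new value is "parallel" */
    bool delay_specified;       /* was "min_apply_delay" given? */
    int  min_apply_delay;       /* new delay in milliseconds */
} AlterRequest;

/*
 * Return true when the command must be rejected. A newly specified option
 * overrides the stored one; an unspecified option keeps the stored value.
 */
bool
alter_is_rejected(bool cur_parallel, int cur_delay, AlterRequest req)
{
    bool parallel = req.streaming_specified ? req.streaming_parallel : cur_parallel;
    int  delay = req.delay_specified ? req.min_apply_delay : cur_delay;

    return parallel && delay > 0;
}
```

Under this model, setting streaming = parallel together with min_apply_delay = 0 on a subscription that previously had a one-day delay is accepted, while setting streaming = parallel alone on that subscription is rejected.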
+static void maybe_delay_apply(TimestampTz ts);
apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn)
+ XLogRecPtr lsn, TimestampTz ts)
"ts" looks too generic. Couldn't it be more specific?
We need an explanation for the parameter in the function comment.
Changed it to finish_ts, since it indicates commit/prepare time.
This terminology should be aligned with finish lsn.
+ if (!am_parallel_apply_worker())
+ {
+ Assert(ts > 0);
+ maybe_delay_apply(ts);
It seems to me better that the if condition and assertion are checked inside
maybe_delay_apply().
Fixed.
Thanks for the updated patch. A few comments:
1) Since the min_apply_delay = 3, but you have specified 2s, there
might be a possibility that it can log delay as 1000ms due to
pub/sub/network delay and the test can fail randomly. If we cannot
ensure this log file value, check_apply_delay_time verification alone
should be sufficient.
+is($result, qq(5|1|5), 'check if the new rows were applied to subscriber');
+
+check_apply_delay_log("logical replication apply delay", "2000");
2) I'm not sure if this will add any extra coverage as the altering
value of min_apply_delay is already tested in the regression, if so
this test can be removed:
+# Test ALTER SUBSCRIPTION. Delay 86460 seconds (1 day 1 minute).
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86460000)"
+);
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (0, 'foobar')");
+
+check_apply_delay_log("logical replication apply delay", "80000000");
3) We generally keep the subroutines before the tests, it can be kept
accordingly:
3.a)
+sub check_apply_delay_log
+{
+ my ($message, $expected) = @_;
+ $expected = 0 unless defined $expected;
+
+ my $old_log_location = $log_location;
3.b)
+sub check_apply_delay_time
+{
+ my ($primary_key, $expected_diffs) = @_;
+
+ my $inserted_time_on_pub = $node_publisher->safe_psql('postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
4) typo "more then once" should be "more than once"
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more then once
@@ -316,10 +316,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
5) This can be changed to "Is it larger than expected?"
+ # Is it larger than expected ?
+ cmp_ok($logged_delay, '>', $expected,
+ "The wait time of the apply worker is long enough expectedly"
+ );
6) 2022 should be changed to 2023
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,170 @@
+
+# Copyright (c) 2022, PostgreSQL Global Development Group
+
+# Test replication apply delay
7) Termination full stop is not required for single line comments:
7.a)
+use Test::More;
+
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
7.b) +
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
7.c) +
+# Create some preexisting content on publisher.
+$node_publisher->safe_psql('postgres',
7.d) similarly in rest of the files
8) Is it possible to add one test for spooling also?
Regards,
Vignesh
Hi,
On Saturday, January 14, 2023 3:27 PM vignesh C <vignesh21@gmail.com> wrote:
1) Since the min_apply_delay = 3, but you have specified 2s, there might be a possibility that it can log delay as 1000ms due to pub/sub/network delay and the test can fail randomly. If we cannot ensure this log file value, check_apply_delay_time verification alone should be sufficient.
+is($result, qq(5|1|5), 'check if the new rows were applied to subscriber');
+
+check_apply_delay_log("logical replication apply delay", "2000");
You are right. Removed the remaining-time check from the first call of check_apply_delay_log().
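To make the concern in item 1 concrete, here is a rough sketch of how the remaining wait shrinks as time passes between the commit on the publisher and the moment the apply worker handles it. It assumes all times are expressed in milliseconds on a common clock; the function name remaining_apply_delay_ms is hypothetical and this is not the worker.c implementation.

```c
#include <stdint.h>

/*
 * Remaining time to wait before applying a transaction, in milliseconds.
 * finish_ts_ms is the commit/prepare time taken from the publisher,
 * now_ms is the current time on the subscriber. (Hypothetical helper.)
 */
int64_t
remaining_apply_delay_ms(int64_t finish_ts_ms, int64_t now_ms,
                         int64_t min_apply_delay_ms)
{
    int64_t elapsed = now_ms - finish_ts_ms;        /* decoding + transfer */
    int64_t remaining = min_apply_delay_ms - elapsed;

    return remaining > 0 ? remaining : 0;           /* no wait if overdue */
}
```

With min_apply_delay = 3000 ms, an elapsed time of 1500 ms already leaves only 1500 ms to wait, which is below a 2000 ms threshold, so a check on the logged value can fail on a slow machine even though the overall delay is respected.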
2) I'm not sure if this will add any extra coverage as the altering value of min_apply_delay is already tested in the regression, if so this test can be removed:
+# Test ALTER SUBSCRIPTION. Delay 86460 seconds (1 day 1 minute).
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86460000)"
+);
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (0, 'foobar')");
+
+check_apply_delay_log("logical replication apply delay", "80000000");
While addressing this point, I noticed a behavior difference between
physical replication's recovery_min_apply_delay and this feature
when replication is stopped during a delay.
At present, with this feature,
the apply worker exits without applying the delayed transaction
after the ALTER SUBSCRIPTION DISABLE command is run for the subscription.
Physical replication has no "disabling" command,
so as one example I checked what happens when a secondary is promoted
during the recovery_min_apply_delay wait.
The transaction became visible even though the promotion happened in the middle of the delay.
I'm not sure if I should align time-delayed LR with this behavior.
Does anyone have an opinion on this?
By the way, the above test code can be used as the test case
where the apply worker is in a delay and the transaction is canceled by the
ALTER SUBSCRIPTION DISABLE command. So, I didn't remove it at this stage.
3) We generally keep the subroutines before the tests, it can be kept accordingly:
3.a)
+sub check_apply_delay_log
+{
+ my ($message, $expected) = @_;
+ $expected = 0 unless defined $expected;
+
+ my $old_log_location = $log_location;
3.b)
+sub check_apply_delay_time
+{
+ my ($primary_key, $expected_diffs) = @_;
+
+ my $inserted_time_on_pub = $node_publisher->safe_psql('postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
Fixed.
4) typo "more then once" should be "more than once"
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more then once
@@ -316,10 +316,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
This was an existing typo on HEAD. Addressed in another thread in [1].
5) This can be changed to "Is it larger than expected?"
+ # Is it larger than expected ?
+ cmp_ok($logged_delay, '>', $expected,
+ "The wait time of the apply worker is long enough expectedly"
+ );
Fixed.
6) 2022 should be changed to 2023
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,170 @@
+
+# Copyright (c) 2022, PostgreSQL Global Development Group
+
+# Test replication apply delay
Fixed.
7) Termination full stop is not required for single line comments:
7.a)
+use Test::More;
+
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
7.b)
+
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
7.c)
+
+# Create some preexisting content on publisher.
+$node_publisher->safe_psql('postgres',
7.d) similarly in rest of the files
Removed the periods for single line comments.
8) Is it possible to add one test for spooling also?
There is a streaming transaction case in the TAP test already.
I also made some minor comment modifications along with the above changes.
Kindly have a look at the v16.
[1]: /messages/by-id/TYCPR01MB83737EA140C79B7D099F65E8EDC69@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
Attachments:
v16-0001-Time-delayed-logical-replication-subscriber.patch (application/octet-stream)
From 47af0bc714caccb39005386b495fc2c9460d25de Mon Sep 17 00:00:00 2001
From: Takamichi Osumi <osumi.takamichi@fujitsu.com>
Date: Tue, 17 Jan 2023 10:03:30 +0000
Subject: [PATCH v16] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction commit for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs only on WAL records for transaction begins. The main
reason is to avoid keeping a transaction open for a long time. Regular
and prepared transactions are covered. Streamed transactions are also
covered.
Prohibit the combination of this feature and parallel streaming mode.
Author: Euler Taveira
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/config.sgml | 12 ++
doc/src/sgml/logical-replication.sgml | 7 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 59 +++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 89 ++++++++-
.../replication/logical/applyparallelworker.c | 3 +-
src/backend/replication/logical/worker.c | 166 ++++++++++++++--
src/backend/utils/adt/timestamp.c | 29 +++
src/bin/pg_dump/pg_dump.c | 16 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/datatype/timestamp.h | 2 +
src/include/replication/worker_internal.h | 2 +-
src/include/utils/timestamp.h | 2 +
src/test/regress/expected/subscription.out | 185 +++++++++++-------
src/test/regress/sql/subscription.sql | 25 +++
src/test/subscription/meson.build | 1 +
src/test/subscription/t/032_apply_delay.pl | 182 +++++++++++++++++
23 files changed, 717 insertions(+), 102 deletions(-)
create mode 100644 src/test/subscription/t/032_apply_delay.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..bf3c05241c 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7873,6 +7873,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int8</type>
+ </para>
+ <para>
+ The length of time (ms) to delay the application of changes.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 77574e2d4e..8258e251df 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4753,6 +4753,18 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
the <filename>postgresql.conf</filename> file or on the server
command line.
</para>
+ <para>
+ For time-delayed logical replication, the apply worker sends a Standby
+ Status Update message to the corresponding publisher per the indicated
+ time of this parameter. Therefore, if this parameter is longer than
+ <literal>wal_sender_timeout</literal> on the publisher, then the
+ walsender doesn't get any update message during the delay and repeatedly
+ terminates due to the timeout errors. Hence, make sure this parameter is
+ shorter than the <literal>wal_sender_timeout</literal> of the publisher.
+ If this parameter is set to zero with time-delayed replication, the
+ apply worker doesn't send any feedback messages during the
+ <literal>min_apply_delay</literal>.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 54f48be87f..6407804547 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,13 @@
target table.
</para>
+ <para>
+ The subscriber replication can be instructed to lag behind the publisher
+ side changes by specifying the <literal>min_apply_delay</literal>
+ subscription parameter. See <xref linkend="sql-createsubscription"/> for
+ details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index ad93553a1d..1c6e9dd2d1 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index eba72c6af6..ad97914dc8 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,47 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. As
+ with the physical replication feature
+ (<xref linkend="guc-recovery-min-apply-delay"/>), it can be useful to
+ have a time-delayed logical replica. This parameter lets the user to
+ delay the application of changes by a specified amount of time. If this
+ value is specified without units, it is taken as milliseconds. The
+ default is zero(no delay).
+ </para>
+ <para>
+ The delay occurs only on WAL records for transaction begins and after
+ the initial table synchronization. It is possible that the
+ replication delay between publisher and subscriber exceeds the value
+ of this parameter, in which case no delay is added. Note that the
+ delay is calculated between the WAL time stamp as written on
+ publisher and the current time on the subscriber. Time spent in logical
+ decoding and in transferring the transaction may reduce the actual wait
+ time. If the system clocks on publisher and subscriber are not
+ synchronized, this may lead to apply changes earlier than expected,
+ but this is not a major issue because this parameter is typically much
+ larger than the time deviations between servers. Note that if this
+ parameter is set to a long delay, the replication will stop if the
+ replication slot falls behind the current LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication can mean there is a much longer time between making
+ a change on the publisher, and that change being committed on the subscriber.
+ This can have a big impact on synchronous replication.
+ See <xref linkend="guc-synchronous-commit"/>.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -413,6 +453,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ Setting streaming to <literal>parallel</literal> mode and <literal>min_apply_delay</literal>
+ simultaneously is not supported.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
@@ -472,6 +517,18 @@ CREATE SUBSCRIPTION mysub
PUBLICATION insert_only
WITH (enabled = false);
</programlisting></para>
+
+ <para>
+ Create a subscription to a remote server that replicates tables in
+ the <literal>mypub</literal> publication and starts replicating immediately
+ on commit. Pre-existing data is not copied. The application of changes is
+ delayed by 4 hours.
+<programlisting>
+CREATE SUBSCRIPTION mysub
+ CONNECTION 'host=192.0.2.4 port=5432 user=foo dbname=foodb'
+ PUBLICATION mypub
+ WITH (copy_data = false, min_apply_delay = '4h');
+</programlisting></para>
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..c767cc1c3a 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -64,6 +64,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->oid = subid;
sub->dbid = subform->subdbid;
sub->skiplsn = subform->subskiplsn;
+ sub->minapplydelay = subform->subminapplydelay;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
sub->enabled = subform->subenabled;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index d2a8c82900..0950b4b74b 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1298,9 +1298,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index baff00dd74..bd9b8c0996 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,43 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ char *val,
+ *tmp;
+ Interval *interval;
+ int64 ms;
+
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ tmp = defGetString(defel);
+
+ /*
+ * If no unit was specified, then explicitly add 'ms' otherwise
+ * the interval_in function would assume 'seconds'.
+ */
+ if (strspn(tmp, "-0123456789 ") == strlen(tmp))
+ val = psprintf("%sms", tmp);
+ else
+ val = tmp;
+
+ interval = DatumGetIntervalP(DirectFunctionCall3(interval_in,
+ CStringGetDatum(val),
+ ObjectIdGetDatum(InvalidOid),
+ Int32GetDatum(-1)));
+
+ ms = interval2ms(interval);
+ if (ms < 0 || ms > INT_MAX)
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("%lld ms is outside the valid range for parameter \"%s\" (0 .. %d)",
+ (long long) ms, "min_apply_delay", INT_MAX));
+
+ opts->min_apply_delay = ms;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +445,17 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /* Test the combination of streaming mode and min_apply_delay */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0)
+ {
+ if (opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
+ }
}
/*
@@ -560,7 +612,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -625,6 +678,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int64GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1054,7 +1108,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1152,18 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * Test the combination of streaming mode and
+ * min_apply_delay
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
+ if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay > 0) ||
+ (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s mode for subscription with %s",
+ "streaming = parallel", "min_apply_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1177,25 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * Test the combination of streaming mode and
+ * min_apply_delay
+ */
+ if (opts.min_apply_delay > 0)
+ if ((IsSet(opts.specified_opts, SUBOPT_STREAMING) && opts.streaming == LOGICALREP_STREAM_PARALLEL) ||
+ (!IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s for subscription in %s mode",
+ "min_apply_delay", "streaming = parallel"));
+
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int64GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
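The unit-defaulting check in parse_subscription_options() above can be tried in isolation. What follows is only a minimal sketch, assuming plain malloc() in place of psprintf(); the helper name default_unit_to_ms is hypothetical and does not appear in the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: return a newly allocated copy
 * of "input", with "ms" appended when the input consists only of digits,
 * '-' and spaces (the same character set the patch passes to strspn()).
 * Returns NULL if allocation fails. The caller frees the result.
 */
char *
default_unit_to_ms(const char *input)
{
	size_t		len;
	char	   *out;

	assert(input != NULL);
	len = strlen(input);

	if (strspn(input, "-0123456789 ") == len)
	{
		/* No unit given: append "ms" (2 characters plus terminator). */
		out = malloc(len + 3);
		if (out == NULL)
			return NULL;
		memcpy(out, input, len);
		memcpy(out + len, "ms", 3);
		return out;
	}

	/* Anything else is passed through unchanged. */
	out = malloc(len + 1);
	if (out == NULL)
		return NULL;
	memcpy(out, input, len + 1);
	return out;
}
```

Note that a value such as "1.5" is left unchanged by this check, because '.' is not in the accepted character set.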
diff --git a/src/backend/replication/logical/applyparallelworker.c b/src/backend/replication/logical/applyparallelworker.c
index 3dfcff2798..cb945243a4 100644
--- a/src/backend/replication/logical/applyparallelworker.c
+++ b/src/backend/replication/logical/applyparallelworker.c
@@ -704,7 +704,8 @@ pa_process_spooled_messages_if_required(void)
{
apply_spooled_messages(&MyParallelShared->fileset,
MyParallelShared->xid,
- InvalidXLogRecPtr);
+ InvalidXLogRecPtr,
+ 0);
pa_set_fileset_state(MyParallelShared, FS_EMPTY);
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index a0084c7ef6..7ef653749b 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -318,6 +318,17 @@ static List *on_commit_wakeup_workers_subids = NIL;
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
+/*
+ * To avoid a walsender timeout during time-delayed replication, the worker
+ * process must keep sending feedback messages during the delay. However, the
+ * delay is applied before the transaction starts, so no WAL is written for
+ * the suspended changes while waiting. Hence, when the worker sends a
+ * feedback message during the delay, the flushed and applied LSN positions
+ * must not be overwritten by the last received LSN. See send_feedback() for
+ * details.
+ */
+static XLogRecPtr last_received = InvalidXLogRecPtr;
+
/* fields valid only when processing streamed transaction */
static bool in_streamed_transaction = false;
@@ -388,10 +399,13 @@ static void stream_write_change(char action, StringInfo s);
static void stream_open_and_write_change(TransactionId xid, char action, StringInfo s);
static void stream_close_file(void);
-static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply);
+static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply,
+ bool in_delaying_apply);
static void DisableSubscriptionAndExit(void);
+static void maybe_delay_apply(TimestampTz finish_ts);
+
static void apply_handle_commit_internal(LogicalRepCommitData *commit_data);
static void apply_handle_insert_internal(ApplyExecutionData *edata,
ResultRelInfo *relinfo,
@@ -998,6 +1012,99 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When min_apply_delay parameter is set on the subscriber, we wait long enough
+ * to make sure a transaction is applied at least that interval behind the
+ * publisher.
+ *
+ * While physical replication applies the delay at commit time, this feature
+ * applies the delay before starting to apply the transaction. This is mainly
+ * because keeping a transaction that has performed write operations open for
+ * a long time causes problems such as bloat and long-held locks.
+ *
+ * The min_apply_delay parameter will take effect only after all tables are in
+ * READY state.
+ *
+ * finish_ts is the commit/prepare time of both regular (non-streamed) and
+ * streamed transactions.
+ */
+static void
+maybe_delay_apply(TimestampTz finish_ts)
+{
+ Assert(finish_ts > 0);
+
+ /* Nothing to do if no delay set */
+ if (MySubscription->minapplydelay <= 0)
+ return;
+
+ /*
+ * The min_apply_delay parameter is ignored until all tablesync workers
+ * have reached READY state. If we allow the delay during the catchup
+ * phase, once we reach the limit of tablesync workers, it will impose a
+ * delay for each subsequent worker. It means it will take a long time to
+ * finish the initial table synchronization.
+ */
+ if (!AllTablesyncsReady())
+ return;
+
+ while (true)
+ {
+ long diffms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /*
+ * Before calculating the time duration, reload the catalog if needed.
+ */
+ if (!in_remote_transaction && !in_streamed_transaction)
+ {
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+ }
+
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(),
+ TimestampTzPlusMilliseconds(finish_ts, MySubscription->minapplydelay));
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "logical replication apply delay: %ld ms", diffms);
+
+ /*
+ * Call send_feedback() periodically so that the publisher does not
+ * time out during the delay. This is only done when
+ * wal_receiver_status_interval is greater than zero.
+ */
+ if (wal_receiver_status_interval > 0
+ && diffms > wal_receiver_status_interval * 1000)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ (long) wal_receiver_status_interval * 1000,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(last_received, true, false, true);
+ }
+ else
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ }
+}
+
/*
* Handle BEGIN message.
*/
@@ -1012,6 +1119,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ maybe_delay_apply(begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -1069,6 +1179,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ maybe_delay_apply(begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1316,7 +1429,8 @@ apply_handle_stream_prepare(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset,
- prepare_data.xid, prepare_data.prepare_lsn);
+ prepare_data.xid, prepare_data.prepare_lsn,
+ prepare_data.prepare_time);
/* Mark the transaction as prepared. */
apply_handle_prepare_internal(&prepare_data);
@@ -2010,10 +2124,13 @@ ensure_last_message(FileSet *stream_fileset, TransactionId xid, int fileno,
/*
* Common spoolfile processing.
+ *
+ * finish_ts is the commit/prepare time of the streamed transaction; it is
+ * required for time-delayed replication.
*/
void
apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn)
+ XLogRecPtr lsn, TimestampTz finish_ts)
{
StringInfoData s2;
int nchanges;
@@ -2024,6 +2141,21 @@ apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
int fileno;
off_t offset;
+ /*
+ * Should we delay the current transaction?
+ *
+ * Unlike the regular (non-streamed) cases, the delay is applied in a
+ * STREAM COMMIT/STREAM PREPARE message for streamed transactions. The
+ * STREAM START message does not contain a commit/prepare time (it will be
+ * available when the in-progress transaction finishes). Hence, it's not
+ * appropriate to apply a delay at that time.
+ *
+ * Time-delayed replication cannot be combined with the parallel apply
+ * feature, so parallel apply workers skip the delay.
+ */
+ if (!am_parallel_apply_worker())
+ maybe_delay_apply(finish_ts);
+
if (!am_parallel_apply_worker())
maybe_start_skipping_changes(lsn);
@@ -2173,7 +2305,7 @@ apply_handle_stream_commit(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset, xid,
- commit_data.commit_lsn);
+ commit_data.commit_lsn, commit_data.committime);
apply_handle_commit_internal(&commit_data);
@@ -3446,7 +3578,7 @@ UpdateWorkerStats(XLogRecPtr last_lsn, TimestampTz send_time, bool reply)
* Apply main loop.
*/
static void
-LogicalRepApplyLoop(XLogRecPtr last_received)
+LogicalRepApplyLoop(void)
{
TimestampTz last_recv_timestamp = GetCurrentTimestamp();
bool ping_sent = false;
@@ -3567,7 +3699,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
if (last_received < end_lsn)
last_received = end_lsn;
- send_feedback(last_received, reply_requested, false);
+ send_feedback(last_received, reply_requested, false, false);
UpdateWorkerStats(last_received, timestamp, true);
}
/* other message types are purposefully ignored */
@@ -3580,7 +3712,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
/* confirm all writes so far */
- send_feedback(last_received, false, false);
+ send_feedback(last_received, false, false, false);
if (!in_remote_transaction && !in_streamed_transaction)
{
@@ -3677,7 +3809,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
}
- send_feedback(last_received, requestReply, requestReply);
+ send_feedback(last_received, requestReply, requestReply, false);
/*
* Force reporting to ensure long idle periods don't lead to
@@ -3707,7 +3839,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
* to send a response to avoid timeouts.
*/
static void
-send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
+send_feedback(XLogRecPtr recvpos, bool force, bool requestReply, bool in_delaying_apply)
{
static StringInfo reply_message = NULL;
static TimestampTz send_time = 0;
@@ -3737,8 +3869,15 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * During the delay of time-delayed replication, do not report the latest
+ * received LSN as applied and flushed, since the transaction has not been
+ * applied yet. Doing so would give the publisher a wrong picture of
+ * logical replication progress. In that case, the feedback message only
+ * serves to avoid the publisher's timeout during the delay.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && !in_delaying_apply)
flushpos = writepos = recvpos;
if (writepos < last_writepos)
@@ -4354,11 +4493,11 @@ start_table_sync(XLogRecPtr *origin_startpos, char **myslotname)
* of system resource error and are not repeatable.
*/
static void
-start_apply(XLogRecPtr origin_startpos)
+start_apply(void)
{
PG_TRY();
{
- LogicalRepApplyLoop(origin_startpos);
+ LogicalRepApplyLoop();
}
PG_CATCH();
{
@@ -4645,7 +4784,8 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- start_apply(origin_startpos);
+ last_received = origin_startpos;
+ start_apply();
proc_exit(0);
}
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index 928c330897..422e6ad0fa 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -2431,6 +2431,35 @@ interval_cmp_internal(const Interval *interval1, const Interval *interval2)
return int128_compare(span1, span2);
}
+/*
+ * Returns the number of milliseconds in the specified Interval.
+ */
+int64
+interval2ms(const Interval *interval)
+{
+ int64 days;
+ int64 ms;
+ int64 result;
+
+ days = interval->month * INT64CONST(30);
+ days += interval->day;
+
+ /* Detect whether the value of interval can cause an overflow */
+ if (pg_mul_s64_overflow(days, MSECS_PER_DAY, &result))
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range"));
+
+ /* Add the time portion (converted from microseconds to ms) to the result */
+ ms = interval->time / INT64CONST(1000);
+ if (pg_add_s64_overflow(result, ms, &result))
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range"));
+
+ return result;
+}
+
Datum
interval_eq(PG_FUNCTION_ARGS)
{
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..7ffbab0915 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4546,9 +4547,14 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ {
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBufferStr(query, " 0 AS subminapplydelay\n");
+ }
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4582,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4613,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminapplydelay =
+ strtoi64(PQgetvalue(res, i, i_subminapplydelay), NULL, 10);
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4687,6 +4696,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '" INT64_FORMAT " ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e7cbd8d7ed..e2525f70ab 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int64 subminapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..8a27063bed 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min apply delay (ms)"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e1882eaea..e8b9a43a47 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..078495cbf0 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -70,6 +70,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
XLogRecPtr subskiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 subminapplydelay; /* Replication apply delay */
+
NameData subname; /* Name of the subscription */
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
@@ -120,6 +122,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 minapplydelay; /* Replication apply delay */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
bool enabled; /* Indicates if the subscription is enabled */
diff --git a/src/include/datatype/timestamp.h b/src/include/datatype/timestamp.h
index 21a37e21e9..8b368af299 100644
--- a/src/include/datatype/timestamp.h
+++ b/src/include/datatype/timestamp.h
@@ -127,6 +127,8 @@ struct pg_itm_in
#define SECS_PER_MINUTE 60
#define MINS_PER_HOUR 60
+#define MSECS_PER_DAY INT64CONST(86400000)
+
#define USECS_PER_DAY INT64CONST(86400000000)
#define USECS_PER_HOUR INT64CONST(3600000000)
#define USECS_PER_MINUTE INT64CONST(60000000)
diff --git a/src/include/replication/worker_internal.h b/src/include/replication/worker_internal.h
index db891eea8a..4ef132d243 100644
--- a/src/include/replication/worker_internal.h
+++ b/src/include/replication/worker_internal.h
@@ -255,7 +255,7 @@ extern void stream_stop_internal(TransactionId xid);
/* Common streaming function to apply all the spooled messages */
extern void apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn);
+ XLogRecPtr lsn, TimestampTz finish_ts);
extern void apply_dispatch(StringInfo s);
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index 42f802bb9d..534051fe13 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -102,6 +102,8 @@ extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern int64 interval2ms(const Interval *interval);
+
extern TimestampTz time_t_to_timestamptz(pg_time_t tm);
extern pg_time_t timestamptz_to_time_t(TimestampTz t);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 4e5cb0d3a9..dbf0803568 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -135,10 +135,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -155,10 +155,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -167,10 +167,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -202,10 +202,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -239,19 +239,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -263,27 +263,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -298,10 +298,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -316,10 +316,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -355,10 +355,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -367,10 +367,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -380,10 +380,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -396,20 +396,61 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid input syntax for type interval: "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_apply_delay" (0 .. 2147483647)
+-- fail - utilizing streaming = parallel with time delayed replication is not supported.
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = '1 day');
+ERROR: min_apply_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- success -- interval is converted into ms and stored as integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '4h 27min 35s');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 16055000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- fail - alter subscription with streaming = parallel should fail when time delayed replication is set.
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot enable streaming = parallel mode for subscription with min_apply_delay
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set.
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ERROR: cannot enable min_apply_delay for subscription in streaming = parallel mode
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 5f27b7d776..4b90b85702 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -279,6 +279,31 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+-- fail - utilizing streaming = parallel with time delayed replication is not supported.
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = '1 day');
+
+-- success -- value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+\dRs+
+
+-- success -- interval is converted into ms and stored as integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '4h 27min 35s');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when time delayed replication is set.
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set.
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/meson.build b/src/test/subscription/meson.build
index 3db0fdfd96..a186876eb4 100644
--- a/src/test/subscription/meson.build
+++ b/src/test/subscription/meson.build
@@ -38,6 +38,7 @@ tests += {
't/029_on_error.pl',
't/030_origin.pl',
't/031_column_list.pl',
+ 't/032_apply_delay.pl',
't/100_bugs.pl',
],
},
diff --git a/src/test/subscription/t/032_apply_delay.pl b/src/test/subscription/t/032_apply_delay.pl
new file mode 100644
index 0000000000..25459e3dec
--- /dev/null
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,182 @@
+
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test replication apply delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $log_location = 0;
+
+# Confirm the time-delayed replication has been effective from the server log
+# message where the apply worker emits for applying delay. When necessary,
+# verifies that the current worker's delayed time is sufficiently bigger than
+# the expected value, in order to check any update of the min_apply_delay.
+sub check_apply_delay_log
+{
+ my ($node_subscriber, $message, $expected) = @_;
+ $expected = 0 unless defined $expected;
+
+ my $old_log_location = $log_location;
+
+ $log_location = $node_subscriber->wait_for_log(qr/$message/, $log_location);
+
+ cmp_ok($log_location, '>', $old_log_location,
+ "logfile contains triggered logical replication apply delay"
+ );
+
+ if ($expected > 0)
+ {
+ # Get the delay time in the server log
+ my $contents = slurp_file($node_subscriber->logfile, $old_log_location);
+ $contents =~
+ qr/$message: (\d+) ms/
+ or die "could not get delayed time";
+ my $logged_delay = $1;
+
+ # Is it larger than expected?
+ cmp_ok($logged_delay, '>', $expected,
+ "The wait time of the apply worker is long enough expectedly"
+ );
+ }
+}
+
+sub check_apply_delay_time
+{
+ my ($node_publisher, $node_subscriber, $primary_key, $expected_diffs) = @_;
+
+ my $inserted_time_on_pub = $node_publisher->safe_psql('postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ my $inserted_time_on_sub = $node_subscriber->safe_psql('postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ cmp_ok($inserted_time_on_sub - $inserted_time_on_pub, '>', $expected_diffs,
+ "The tuple on the subscriber was modified later than the publisher");
+}
+
+# Create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf('postgresql.conf',
+ "log_min_messages = debug2");
+$node_subscriber->start;
+
+# Create some preexisting content on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b varchar, c timestamptz DEFAULT now())");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (1, 'foo'), (2, 'bar')");
+
+# Setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b text, c timestamptz DEFAULT now(), d bigint DEFAULT 999)"
+);
+
+# Setup logical replication
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+
+# The column c must not be published because we want to compare the time
+# difference.
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE test_tab (a, b)");
+
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = on, min_apply_delay = '3s')"
+);
+
+# Wait for initial table sync to finish
+$node_subscriber->wait_for_subscription_sync($node_publisher, $appname);
+
+# Check log starting now for logical replication apply delay
+$log_location = -s $node_subscriber->logfile;
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(2|1|2), 'check initial data was copied to subscriber');
+
+# New row to trigger apply delay
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (3, 'baz')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (4, 'abc')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (5, 'def')");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(5|1|5), 'check if the new rows were applied to subscriber');
+
+check_apply_delay_log($node_subscriber, "logical replication apply delay");
+check_apply_delay_time($node_publisher, $node_subscriber, '5', '3');
+
+# Reduce the amount of writes to the spool file
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_mode = immediate');
+$node_publisher->reload;
+
+# Test streamed transaction by insert, update and delete enough rows to exceed
+# 64kB limit.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(6, 8) s(i);");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(8|1|8), 'check if the new rows were applied to subscriber');
+
+check_apply_delay_log($node_subscriber, "logical replication apply delay");
+check_apply_delay_time($node_publisher, $node_subscriber, '8', '3');
+
+# Test whether ALTER SUBSCRIPTION changes the delayed time of the apply worker
+# (1 day 1 minute).
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86460000)"
+);
+
+# New row to trigger apply delay
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (0, 'foobar')");
+
+# Make sure we have long enough min_apply_delay after the ALTER command
+check_apply_delay_log($node_subscriber, "logical replication apply delay", "80000000");
+
+# Disable subscription and the worker should die immediately
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub DISABLE;"
+);
+
+# Wait until worker dies
+my $sub_query =
+ "SELECT count(1) = 0 FROM pg_stat_subscription WHERE subname = 'tap_sub' AND pid IS NOT NULL;";
+$node_subscriber->poll_query_until('postgres', $sub_query)
+ or die "Timed out while waiting for subscriber to die";
+
+# Confirm the suspended record doesn't get applied by the ALTER DISABLE
+# command.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab WHERE a = 0;");
+is($result, qq(0), "check if the delayed transaction doesn't get applied expectedly");
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
--
2.30.0
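The regression cases in the patch above pin down three rules for min_apply_delay: a bare number is taken as milliseconds, an interval string is converted to milliseconds, and a positive delay cannot be combined with streaming = parallel. The following is only a rough Python sketch of those rules, not the server's implementation. The function names and the small unit table are hypothetical; the real code relies on PostgreSQL's interval parser, which accepts many more spellings.

```python
import re

# Upper bound taken from the error message in the regression output above.
MAX_DELAY_MS = 2147483647

# Hypothetical, deliberately tiny unit table (an assumption of this sketch).
UNIT_MS = {"ms": 1, "s": 1_000, "min": 60_000, "h": 3_600_000, "day": 86_400_000}

TOKEN = re.compile(r"(\d+)\s*([a-z]+)")


def parse_min_apply_delay(value):
    """Return the delay in milliseconds, or raise ValueError."""
    text = str(value).strip()
    if re.fullmatch(r"-?\d+", text):
        # A bare number is taken as milliseconds.
        total = int(text)
    else:
        tokens = TOKEN.findall(text)
        # Anything left after removing "<number><unit>" pairs is a syntax error.
        leftover = re.sub(r"\s+", "", TOKEN.sub("", text))
        if not tokens or leftover or any(u not in UNIT_MS for _, u in tokens):
            raise ValueError(f"invalid input syntax for type interval: {text!r}")
        total = sum(int(n) * UNIT_MS[u] for n, u in tokens)
    if not 0 <= total <= MAX_DELAY_MS:
        raise ValueError(
            f"{total} ms is outside the valid range (0 .. {MAX_DELAY_MS})")
    return total


def check_subscription_options(streaming, min_apply_delay_ms):
    """Reject the combination that the regression test expects to fail."""
    if min_apply_delay_ms > 0 and streaming == "parallel":
        raise ValueError(
            "min_apply_delay > 0 and streaming = parallel "
            "are mutually exclusive options")


print(parse_min_apply_delay(123))             # 123
print(parse_min_apply_delay("4h 27min 35s"))  # 16055000
```

The second value matches the "Min apply delay (ms)" column in the expected output: 4 h = 14,400,000 ms, 27 min = 1,620,000 ms and 35 s = 35,000 ms, which add up to 16,055,000 ms.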
Dear hackers,
At present, in the latter case,
the apply worker exits without applying the suspended transaction
after ALTER SUBSCRIPTION DISABLE command for the subscription.
Meanwhile, there is no "disabling" command for physical replication,
but I checked the behavior about what happens for promoting a secondary
during the delay of recovery_min_apply_delay for physical replication as one
example.
The transaction became visible even though the promotion happened in the middle of the delay.
I'm not sure if I should make the time-delayed LR aligned with this behavior.
Does anyone have an opinion on this?
Here is my opinion: the current behavior is correct, and we should not follow
the physical replication behavior.
One motivation for this feature is to offer an opportunity to correct data loss
errors. When an accidental delete occurs, the DBA can stop it from propagating
to subscribers by disabling the subscription, as the patch behaves at present.
IIUC, when the subscription is disabled before a transaction has started,
the workers exit and stop applying changes. This feature delays the start of
transactions, so we should treat such an alteration as if it had been executed
before the transaction started.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
On Tue, Jan 17, 2023 at 4:30 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
On Saturday, January 14, 2023 3:27 PM vignesh C <vignesh21@gmail.com> wrote:
2) I'm not sure if this will add any extra coverage as the altering value of
min_apply_delay is already tested in the regression, if so this test can be
removed:
+# Test ALTER SUBSCRIPTION. Delay 86460 seconds (1 day 1 minute).
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86460000)"
+);
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (0, 'foobar')");
+
+check_apply_delay_log("logical replication apply delay", "80000000");
While addressing this point, I've noticed that there is a
behavior difference between physical replication's recovery_min_apply_delay
and this feature when stopping the replication during delays.
At present, in the latter case,
the apply worker exits without applying the suspended transaction
after ALTER SUBSCRIPTION DISABLE command for the subscription.
In the previous paragraph, you said the behavior difference while
stopping the replication but it is not clear from where this DISABLE
command comes in that scenario.
Meanwhile, there is no "disabling" command for physical replication,
but I checked the behavior about what happens for promoting a secondary
during the delay of recovery_min_apply_delay for physical replication as one example.
The transaction became visible even though the promotion happened in the middle of the delay.
What causes such a transaction to be visible after promotion? Ideally,
if the commit doesn't succeed, the transaction shouldn't be visible.
Do we allow the transaction waiting due to delay to get committed on
promotion?
I'm not sure if I should make the time-delayed LR aligned with this behavior.
Does anyone have an opinion on this?
Can you please explain a bit more as asked above to understand the difference?
--
With Regards,
Amit Kapila.
Hi,
On Tuesday, January 17, 2023 9:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jan 17, 2023 at 4:30 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
On Saturday, January 14, 2023 3:27 PM vignesh C <vignesh21@gmail.com> wrote:
2) I'm not sure if this will add any extra coverage as the altering value of
min_apply_delay is already tested in the regression, if so this test can be
removed:
+# Test ALTER SUBSCRIPTION. Delay 86460 seconds (1 day 1 minute).
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86460000)"
+);
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (0, 'foobar')");
+
+check_apply_delay_log("logical replication apply delay", "80000000");
While addressing this point, I've noticed that there is a behavior
difference between physical replication's recovery_min_apply_delay and
this feature when stopping the replication during delays.
At present, in the latter case,
the apply worker exits without applying the suspended transaction
after ALTER SUBSCRIPTION DISABLE command for the subscription.
In the previous paragraph, you said the behavior difference while stopping the
replication but it is not clear from where this DISABLE command comes in that
scenario.
Sorry for my unclear description. By "stopping the replication" I mean
disabling the subscription during the "min_apply_delay" wait time in a logical
replication setup.
I raised this discussion point to define how the time-delayed apply worker
should behave when a transaction is being delayed by the "min_apply_delay"
parameter and the user issues ALTER SUBSCRIPTION ... DISABLE during the delay.
For physical replication, it's hard to find an exact counterpart of LR's
ALTER SUBSCRIPTION DISABLE command, so for comparison I chose the scenario of
promoting a secondary during "recovery_min_apply_delay". After the promotion of
the secondary in physical replication, the transaction that was committed on
the primary but delayed on the secondary becomes visible.
This would be because CheckForStandbyTrigger in recoveryApplyDelay returns true
and we apply the record by breaking the wait.
When I tested this case, I got the LOG message "received promote request" in
the secondary's log.
Meanwhile, there is no "disabling" command for physical replication,
but I checked the behavior about what happens for promoting a
secondary during the delay of recovery_min_apply_delay for physical
replication as one example.
The transaction became visible even though the promotion happened in the
middle of the delay.
What causes such a transaction to be visible after promotion? Ideally, if the
commit doesn't succeed, the transaction shouldn't be visible.
Do we allow the transaction waiting due to delay to get committed on
promotion?
The commit succeeded on the primary, and then I promoted the secondary
during the "recovery_min_apply_delay" wait for that transaction. As a result,
the transaction became visible on the promoted secondary.
I'm not sure if I should make the time-delayed LR aligned with this behavior.
Does someone have an opinion on this?
Can you please explain a bit more as asked above to understand the
difference?
So, the current difference is that the time-delayed apply
worker of logical replication doesn't apply the delayed transaction on the subscriber
when the subscription has been disabled during the delay, while (in one example
of a promotion) the physical replication does the apply of the delayed transaction.
Best Regards,
Takamichi Osumi
On Wed, Jan 18, 2023 at 6:37 AM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
Can you please explain a bit more as asked above to understand the
difference?
So, the current difference is that the time-delayed apply
worker of logical replication doesn't apply the delayed transaction on the subscriber
when the subscription has been disabled during the delay, while (in one example
of a promotion) the physical replication does the apply of the delayed transaction.
I don't see any particular reason here to allow the transaction apply
to complete if the subscription is disabled. Note, that here we are
waiting at the beginning of the transaction and for large
transactions, it might cause a significant delay if we allow applying
the xact. OTOH, if someone comes up with a valid use case to allow the
transaction apply to get completed after the subscription is disabled
then we can anyway do it later as well.
--
With Regards,
Amit Kapila.
Hi,
On Wednesday, January 18, 2023 2:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Jan 18, 2023 at 6:37 AM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
Can you please explain a bit more as asked above to understand the
difference?
So, the current difference is that the time-delayed apply worker of
logical replication doesn't apply the delayed transaction on the
subscriber when the subscription has been disabled during the delay,
while (in one example of a promotion) the physical replication does the apply
of the delayed transaction.
I don't see any particular reason here to allow the transaction apply to complete
if the subscription is disabled. Note, that here we are waiting at the beginning
of the transaction and for large transactions, it might cause a significant delay if
we allow applying the xact. OTOH, if someone comes up with a valid use case
to allow the transaction apply to get completed after the subscription is
disabled then we can anyway do it later as well.
This makes sense. I agree with you. So, I'll keep the current behavior of
the patch.
Best Regards,
Takamichi Osumi
Here are my review comments for the latest patch v16-0001. (excluding
the test code)
======
General
1.
Since the value of min_apply_delay cannot be < 0, I was thinking
probably it should have been declared everywhere in this patch as a
uint64 instead of an int64, right?
======
Commit message
2.
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction commit for min_apply_delay
milliseconds.
~
IMO there should be another sentence before this just to say that a
new parameter is being added:
e.g.
This patch implements a new subscription parameter called 'min_apply_delay'.
======
doc/src/sgml/config.sgml
3.
+ <para>
+ For time-delayed logical replication, the apply worker sends a Standby
+ Status Update message to the corresponding publisher per the indicated
+ time of this parameter. Therefore, if this parameter is longer than
+ <literal>wal_sender_timeout</literal> on the publisher, then the
+ walsender doesn't get any update message during the delay and repeatedly
+ terminates due to the timeout errors. Hence, make sure this parameter is
+ shorter than the <literal>wal_sender_timeout</literal> of the publisher.
+ If this parameter is set to zero with time-delayed replication, the
+ apply worker doesn't send any feedback messages during the
+ <literal>min_apply_delay</literal>.
+ </para>
This paragraph seemed confusing. I think it needs to be reworded to
change all of the "this parameter" references because there are at
least 3 different parameters mentioned in this paragraph. e.g. maybe
just change them to explicitly name the parameter you are talking
about.
I also think it needs to mention the ‘min_apply_delay’ subscription
parameter up-front and then refer to it appropriately.
The end result might be something like I wrote below (this is just my
guess – probably you can word it better).
SUGGESTION
For time-delayed logical replication (i.e. when the subscription is
created with parameter min_apply_delay > 0), the apply worker sends a
Standby Status Update message to the publisher with a period of
wal_receiver_status_interval . Make sure to set
wal_receiver_status_interval less than the wal_sender_timeout on the
publisher, otherwise, the walsender will repeatedly terminate due to
the timeout errors. If wal_receiver_status_interval is set to zero,
the apply worker doesn't send any feedback messages during the
subscriber’s min_apply_delay period.
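The constraint described in the suggested wording can be sketched as a small check. This is only an illustration: the function name feedback_keeps_walsender_alive is hypothetical and not part of the patch. It assumes wal_receiver_status_interval is given in seconds and wal_sender_timeout in milliseconds (their base units), and that a wal_sender_timeout of zero disables the timeout.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): returns true when the
 * subscriber's periodic feedback arrives often enough that the publisher's
 * walsender would not time out during a min_apply_delay wait.
 *
 * status_interval_s : wal_receiver_status_interval on the subscriber (seconds)
 * sender_timeout_ms : wal_sender_timeout on the publisher (milliseconds)
 */
bool
feedback_keeps_walsender_alive(int status_interval_s, int sender_timeout_ms)
{
    /* A timeout of zero disables the walsender timeout entirely. */
    if (sender_timeout_ms == 0)
        return true;

    /* An interval of zero means no periodic feedback is sent at all. */
    if (status_interval_s == 0)
        return false;

    /* Compare in milliseconds; widen first to avoid int overflow. */
    return (int64_t) status_interval_s * 1000 < (int64_t) sender_timeout_ms;
}
```

With the default-like values of 10 seconds and 60000 ms the check passes; with an interval of 60 seconds against the same timeout it does not.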
======
doc/src/sgml/ref/create_subscription.sgml
4.
+ <para>
+ By default, the subscriber applies changes as soon as possible. As
+ with the physical replication feature
+ (<xref linkend="guc-recovery-min-apply-delay"/>), it can be useful to
+ have a time-delayed logical replica. This parameter lets the user to
+ delay the application of changes by a specified amount of time. If this
+ value is specified without units, it is taken as milliseconds. The
+ default is zero(no delay).
+ </para>
4a.
As with the physical replication feature (recovery_min_apply_delay),
it can be useful to have a time-delayed logical replica.
IMO not sure that the above sentence is necessary. It seems only to be
saying that this parameter can be useful. Why do we need to say that?
~
4b.
"This parameter lets the user to delay" -> "This parameter lets the user delay"
OR
"This parameter lets the user to delay" -> "This parameter allows the
user to delay"
~
4c.
"If this value is specified without units" -> "If the value is
specified without units"
~
4d.
"zero(no delay)." -> "zero (no delay)."
----
5.
+ <para>
+ The delay occurs only on WAL records for transaction begins and after
+ the initial table synchronization. It is possible that the
+ replication delay between publisher and subscriber exceeds the value
+ of this parameter, in which case no delay is added. Note that the
+ delay is calculated between the WAL time stamp as written on
+ publisher and the current time on the subscriber. Time spent in logical
+ decoding and in transferring the transaction may reduce the actual wait
+ time. If the system clocks on publisher and subscriber are not
+ synchronized, this may lead to apply changes earlier than expected,
+ but this is not a major issue because this parameter is typically much
+ larger than the time deviations between servers. Note that if this
+ parameter is set to a long delay, the replication will stop if the
+ replication slot falls behind the current LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
I think the first part can be reworded slightly. See what you think
about the suggestion below.
SUGGESTION
Any delay occurs only on WAL records for transaction begins after all
initial table synchronization has finished. The delay is calculated
between the WAL timestamp as written on the publisher and the current
time on the subscriber. Any overhead of time spent in logical decoding
and in transferring the transaction may reduce the actual wait time.
It is also possible that the overhead already exceeds the requested
'min_apply_delay' value, in which case no additional wait is
necessary. If the system clocks...
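The wait-time rule in the suggestion above can be written out as arithmetic. The sketch below is an assumption-laden illustration, not the patch's code: remaining_apply_delay_ms is a hypothetical name, and all times are plain milliseconds since an arbitrary common epoch rather than PostgreSQL timestamps.

```c
#include <stdint.h>

/*
 * Hypothetical helper: how long (in ms) the apply worker still has to wait.
 *
 * commit_ts_ms       : timestamp written by the publisher
 * now_ms             : current time on the subscriber
 * min_apply_delay_ms : requested delay; 0 means no delay
 *
 * Time already spent in decoding and transfer shows up as a larger now_ms,
 * so it shortens the remaining wait. If that overhead already exceeds the
 * requested delay, the result is 0 and no additional wait is needed.
 */
int64_t
remaining_apply_delay_ms(int64_t commit_ts_ms, int64_t now_ms,
                         int64_t min_apply_delay_ms)
{
    int64_t apply_at_ms = commit_ts_ms + min_apply_delay_ms;
    int64_t diff_ms = apply_at_ms - now_ms;

    return diff_ms > 0 ? diff_ms : 0;
}
```

For instance, a commit stamped at 1000 ms with a 3000 ms delay, seen by the subscriber at 1500 ms, leaves 2500 ms to wait. A subscriber clock that runs ahead makes now_ms larger and therefore shortens the wait, which is the "earlier than expected" case the documentation warns about.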
----
6.
+ <para>
+ Setting streaming to <literal>parallel</literal> mode and <literal>min_apply_delay</literal>
+ simultaneously is not supported.
+ </para>
SUGGESTION
A non-zero min_apply_delay parameter is not allowed when streaming in
parallel mode.
======
src/backend/commands/subscriptioncmds.c
7. parse_subscription_options
@@ -404,6 +445,17 @@ parse_subscription_options(ParseState *pstate,
List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /* Test the combination of streaming mode and min_apply_delay */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0)
+ {
+ if (opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
+ }
SUGGESTION (comment)
The combination of parallel streaming mode and min_apply_delay is not allowed.
~~~
8. AlterSubscription (general)
I observed during testing there are 3 different errors….
At subscription CREATE time you can get this error:
ERROR: min_apply_delay > 0 and streaming = parallel are mutually
exclusive options
If you try to ALTER the min_apply_delay when already streaming =
parallel you can get this error:
ERROR: cannot enable min_apply_delay for subscription in streaming =
parallel mode
If you try to ALTER the streaming to be parallel if there is already a
min_apply_delay > 0 then you can get this error:
ERROR: cannot enable streaming = parallel mode for subscription with
min_apply_delay
~
IMO there is no need to have 3 different error message texts. I think
all these cases are explained by just the first text (ERROR:
min_apply_delay > 0 and streaming = parallel are mutually exclusive
options)
~~~
9. AlterSubscription
@@ -1098,6 +1152,18 @@ AlterSubscription(ParseState *pstate,
AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * Test the combination of streaming mode and
+ * min_apply_delay
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
+ if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay > 0) ||
+ (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s mode for subscription with %s",
+ "streaming = parallel", "min_apply_delay"));
+
9a.
SUGGESTION (comment)
The combination of parallel streaming mode and min_apply_delay is not allowed.
~
9b.
(see AlterSubscription general review comment #8 above)
Here you can use the same comment error message that says
min_apply_delay > 0 and streaming = parallel are mutually exclusive
options.
~~~
10. AlterSubscription
@@ -1111,6 +1177,25 @@ AlterSubscription(ParseState *pstate,
AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * Test the combination of streaming mode and
+ * min_apply_delay
+ */
+ if (opts.min_apply_delay > 0)
+ if ((IsSet(opts.specified_opts, SUBOPT_STREAMING) && opts.streaming == LOGICALREP_STREAM_PARALLEL) ||
+ (!IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s for subscription in %s mode",
+ "min_apply_delay", "streaming = parallel"));
+
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int64GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
10a.
SUGGESTION (comment)
The combination of parallel streaming mode and min_apply_delay is not allowed.
~
10b.
(see AlterSubscription general review comment #8 above)
Here you can use the same comment error message that says
min_apply_delay > 0 and streaming = parallel are mutually exclusive
options.
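The two ALTER SUBSCRIPTION checks quoted in #9 and #10 both test the same thing: whether the values that will be in effect after the command combine parallel streaming with a non-zero delay. A minimal sketch of that idea follows. The function and parameter names are hypothetical, and plain booleans stand in for the patch's IsSet() tests and catalog fields.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical condensed form of the two ALTER SUBSCRIPTION checks.
 *
 * For each option, the value given in the command wins when the option is
 * specified; otherwise the value currently stored for the subscription is
 * used. The combination is rejected when the effective streaming mode is
 * parallel and the effective min_apply_delay is greater than zero.
 */
bool
delay_conflicts_with_parallel(bool streaming_specified, bool new_parallel,
                              bool cur_parallel,
                              bool delay_specified, int64_t new_delay_ms,
                              int64_t cur_delay_ms)
{
    bool    effective_parallel;
    int64_t effective_delay_ms;

    /* Like the quoted hunks, check only when one of the options is given. */
    if (!streaming_specified && !delay_specified)
        return false;

    effective_parallel = streaming_specified ? new_parallel : cur_parallel;
    effective_delay_ms = delay_specified ? new_delay_ms : cur_delay_ms;

    return effective_parallel && effective_delay_ms > 0;
}
```

Written this way, a command that changes both options at once (for example switching away from parallel while setting a delay) is judged on the new values only.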
======
.../replication/logical/applyparallelworker.c
11.
@@ -704,7 +704,8 @@ pa_process_spooled_messages_if_required(void)
{
apply_spooled_messages(&MyParallelShared->fileset,
MyParallelShared->xid,
- InvalidXLogRecPtr);
+ InvalidXLogRecPtr,
+ 0);
IMO this passing of 0 is a bit strange because it is currently acting
like a dummy value since the apply_spooled_messages will never make
use of the 'finish_ts' anyway (since this call is from a parallel
apply worker).
I think a better way to code this might be to pass the 0 (same as you
are doing here) but inside the apply_spooled_messages change the code:
FROM
if (!am_parallel_apply_worker())
maybe_delay_apply(finish_ts);
TO
if (finish_ts)
maybe_delay_apply(finish_ts);
That does 2 things.
- It makes the passed-in 0 have some meaning
- It simplifies the apply_spooled_messages code
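A stripped-down sketch of the suggested control flow is shown below. Everything here is a stand-in: TimestampTz is a plain typedef, maybe_delay_apply() is a stub that only counts calls, and apply_spooled_messages_sketch is a hypothetical name. One assumption worth stating: using 0 as "no timestamp" relies on real commit timestamps never being exactly 0.

```c
#include <stdint.h>

typedef int64_t TimestampTz;    /* stand-in for PostgreSQL's TimestampTz */

int delay_calls = 0;            /* counts calls, for illustration only */

/* Stub standing in for the patch's maybe_delay_apply(). */
static void
maybe_delay_apply(TimestampTz finish_ts)
{
    (void) finish_ts;
    delay_calls++;
}

/*
 * The caller passes 0 when no delay should be considered (e.g. a parallel
 * apply worker), so the function itself needs no worker-type check.
 */
void
apply_spooled_messages_sketch(TimestampTz finish_ts)
{
    if (finish_ts)
        maybe_delay_apply(finish_ts);

    /* ... replaying the spooled changes would follow here ... */
}
```

Calling the sketch with 0 skips the delay step, while any non-zero timestamp reaches it.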
======
src/backend/replication/logical/worker.c
12.
@@ -318,6 +318,17 @@ static List *on_commit_wakeup_workers_subids = NIL;
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
+/*
+ * In order to avoid walsender's timeout during time-delayed replication,
+ * it's necessary to keep sending feedback messages during the delay from the
+ * worker process. Meanwhile, the feature delays the apply before starting the
+ * transaction and thus we don't write WALs for the suspended changes during
+ * the wait. Hence, in the case the worker process sends a feedback message
+ * during the delay, we should not make positions of the flushed and apply LSN
+ * overwritten by the last received latest LSN. See send_feedback() for details.
+ */
+static XLogRecPtr last_received = InvalidXLogRecPtr;
12a.
Suggest a small change to the first sentence of the comment.
BEFORE
In order to avoid walsender's timeout during time-delayed replication,
it's necessary to keep sending feedback messages during the delay from
the worker process.
AFTER
In order to avoid walsender timeout for time-delayed replication the
worker process keeps sending feedback messages during the delay
period.
~
12b.
"Hence, in the case" -> "When"
~~~
13. forward declare
-static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply);
+static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply,
+ bool in_delaying_apply);
Change the param name:
"in_delaying_apply" -> "in_delayed_apply" (??)
~~~
14. maybe_delay_apply
+ /* Nothing to do if no delay set */
+ if (MySubscription->minapplydelay <= 0)
+ return;
IIUC min_apply_delay cannot be < 0 so this condition could simply be:
if (!MySubscription->minapplydelay)
return;
~~~
15. maybe_delay_apply
+ /*
+ * The min_apply_delay parameter is ignored until all tablesync workers
+ * have reached READY state. If we allow the delay during the catchup
+ * phase, once we reach the limit of tablesync workers, it will impose a
+ * delay for each subsequent worker. It means it will take a long time to
+ * finish the initial table synchronization.
+ */
+ if (!AllTablesyncsReady())
+ return;
SUGGESTION (slight rewording)
The min_apply_delay parameter is ignored until all tablesync workers
have reached READY state. This is because if we allowed the delay
during the catchup phase, then once we reached the limit of tablesync
workers it would impose a delay for each subsequent worker. That would
cause initial table synchronization completion to take a long time.
~~~
16. maybe_delay_apply
+ while (true)
+ {
+ long diffms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
IMO there should be some small explanatory comment here at the top of
the while loop.
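To illustrate what such a comment might describe, here is a self-contained simulation of a delay loop. The body of the patch's loop is not shown in the quoted hunk, so this is an assumed shape only: FakeClock, fake_wait and delay_loop_sketch are hypothetical, and the fake wait simply advances a counter instead of sleeping on a latch.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct FakeClock
{
    int64_t now_ms;
} FakeClock;

/* Stand-in for a latch wait: "sleeps" by advancing the fake clock. */
static void
fake_wait(FakeClock *clock, int64_t timeout_ms)
{
    clock->now_ms += timeout_ms;
}

/* Returns the number of wakeups that happened before the delay elapsed. */
int
delay_loop_sketch(FakeClock *clock, int64_t apply_at_ms,
                  int64_t status_interval_ms)
{
    int wakeups = 0;

    /*
     * Wait until the requested delay has elapsed. Wake up at least every
     * status_interval_ms so that interrupts can be processed and a feedback
     * message can be sent to the publisher.
     */
    while (true)
    {
        int64_t diffms = apply_at_ms - clock->now_ms;
        int64_t sleep_ms;

        /* Nothing (more) to wait for. */
        if (diffms <= 0)
            break;

        sleep_ms = (status_interval_ms > 0 && status_interval_ms < diffms)
            ? status_interval_ms : diffms;

        fake_wait(clock, sleep_ms);
        wakeups++;              /* feedback would be sent at this point */
    }

    return wakeups;
}
```

A 2500 ms wait with a 1000 ms interval wakes up three times (1000, 1000, then the remaining 500), and a deadline already in the past causes no wait at all.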
~~~
17. apply_spooled_messages
@@ -2024,6 +2141,21 @@ apply_spooled_messages(FileSet *stream_fileset,
TransactionId xid,
int fileno;
off_t offset;
+ /*
+ * Should we delay the current transaction?
+ *
+ * Unlike the regular (non-streamed) cases, the delay is applied in a
+ * STREAM COMMIT/STREAM PREPARE message for streamed transactions. The
+ * STREAM START message does not contain a commit/prepare time (it will be
+ * available when the in-progress transaction finishes). Hence, it's not
+ * appropriate to apply a delay at that time.
+ *
+ * It's not allowed to execute time-delayed replication with parallel
+ * apply feature.
+ */
+ if (!am_parallel_apply_worker())
+ maybe_delay_apply(finish_ts);
That whole comment part "Unlike the regular (non-streamed) cases"
seems misplaced here. Perhaps this part of the comment is better put
into the function header where the meaning of 'finish_ts' is
explained?
~~~
18. apply_spooled_messages
+ * It's not allowed to execute time-delayed replication with parallel
+ * apply feature.
+ */
+ if (!am_parallel_apply_worker())
+ maybe_delay_apply(finish_ts);
As was mentioned in comment #11 above this code could be changed like
if (finish_ts)
maybe_delay_apply(finish_ts);
then you don't even need to make mention of "parallel apply" at all here.
OTOH if you want to still have the parallel apply comment then maybe
reword it like this:
"It is not allowed to combine time-delayed replication with the
parallel apply feature."
~~~
19. apply_spooled_messages
If you chose not to do my suggestion from comment #11, then there are
2 identical conditions (!am_parallel_apply_worker()); In this case, I
was wondering if it would be better to refactor to use a single
condition instead.
~~~
20. send_feedback
(same as comment #13)
Maybe change the new param name to “in_delayed_apply”?
~~~
21.
@@ -3737,8 +3869,15 @@ send_feedback(XLogRecPtr recvpos, bool force,
bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * During the delay of time-delayed replication, do not tell the publisher
+ * that the received latest LSN is already applied and flushed at this
+ * stage, since we don't apply the transaction yet. If we do so, it leads
+ * to a wrong assumption of logical replication progress on the publisher
+ * side. Here, we just send a feedback message to avoid publisher's
+ * timeout during the delay.
*/
Minor rewording of the comment
SUGGESTION
If the subscriber side apply is delayed (because of time-delayed
replication) then do not tell the publisher that the received latest
LSN is already applied and flushed, otherwise, it leads to the
publisher side making a wrong assumption of logical replication
progress. Instead, we just send a feedback message to avoid a
publisher timeout during the delay.
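The rule in the reworded comment can be sketched as a choice of which positions to report. This is a simplified model, not the actual send_feedback() logic: FeedbackPositions and choose_feedback_positions are hypothetical names, XLogRecPtr is a plain 64-bit stand-in, and only the flush and apply positions are modeled.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;    /* stand-in for PostgreSQL's XLogRecPtr */

typedef struct FeedbackPositions
{
    XLogRecPtr flush;           /* position reported as flushed */
    XLogRecPtr apply;           /* position reported as applied */
} FeedbackPositions;

/*
 * With no outstanding transactions, the latest received position may be
 * reported as flushed and applied. While an apply is being delayed, the
 * received changes have not been applied yet, so the previously confirmed
 * positions are reported instead; the message then only serves to keep the
 * publisher from timing out.
 */
FeedbackPositions
choose_feedback_positions(XLogRecPtr recvpos,
                          XLogRecPtr last_flushed, XLogRecPtr last_applied,
                          bool have_pending_txes, bool in_delayed_apply)
{
    FeedbackPositions pos;

    pos.flush = last_flushed;
    pos.apply = last_applied;

    if (!have_pending_txes && !in_delayed_apply)
    {
        pos.flush = recvpos;
        pos.apply = recvpos;
    }

    return pos;
}
```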
======
src/bin/pg_dump/pg_dump.c
22.
@@ -4546,9 +4547,14 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ {
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBufferStr(query, " 0 AS subminapplydelay\n");
+ }
Can't those appends in the else part be combined into a single
appendPQExpBuffer call?
appendPQExpBuffer(query,
" '%s' AS suborigin,\n"
" 0 AS subminapplydelay\n",
LOGICALREP_ORIGIN_ANY);
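The combined call relies on C joining adjacent string literals into one format string, with the argument following after a comma. The standalone sketch below shows the same pattern with snprintf instead of appendPQExpBuffer; ORIGIN_ANY is a stand-in macro assumed to expand to "any", and build_fallback_columns is a hypothetical name.

```c
#include <stdio.h>

#define ORIGIN_ANY "any"        /* assumed stand-in for LOGICALREP_ORIGIN_ANY */

/*
 * Formats both fallback columns with a single call. The two adjacent string
 * literals are concatenated by the compiler into one format string; the
 * comma after the second literal separates it from the %s argument.
 */
int
build_fallback_columns(char *buf, size_t buflen)
{
    return snprintf(buf, buflen,
                    " '%s' AS suborigin,\n"
                    " 0 AS subminapplydelay\n",
                    ORIGIN_ANY);
}
```

Without that comma, a string macro would be concatenated into the format itself and the %s would have no matching argument.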
======
src/include/catalog/pg_subscription.h
23.
@@ -70,6 +70,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId)
BKI_SHARED_RELATION BKI_ROW
XLogRecPtr subskiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 subminapplydelay; /* Replication apply delay */
+
NameData subname; /* Name of the subscription */
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
SUGGESTION (for comment)
Replication apply delay (ms)
~~
24.
@@ -120,6 +122,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 minapplydelay; /* Replication apply delay */
SUGGESTION (for comment)
Replication apply delay (ms)
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Wed, Jan 18, 2023 at 6:06 PM Peter Smith <smithpb2250@gmail.com> wrote:
Here are my review comments for the latest patch v16-0001. (excluding
the test code)
...
8. AlterSubscription (general)
I observed during testing there are 3 different errors….
At subscription CREATE time you can get this error:
ERROR: min_apply_delay > 0 and streaming = parallel are mutually
exclusive options
If you try to ALTER the min_apply_delay when already streaming =
parallel you can get this error:
ERROR: cannot enable min_apply_delay for subscription in streaming =
parallel mode
If you try to ALTER the streaming to be parallel if there is already a
min_apply_delay > 0 then you can get this error:
ERROR: cannot enable streaming = parallel mode for subscription with
min_apply_delay
~
IMO there is no need to have 3 different error message texts. I think
all these cases are explained by just the first text (ERROR:
min_apply_delay > 0 and streaming = parallel are mutually exclusive
options)
After checking the regression test output I can see the merit of your
separate error messages like this, even if they are maybe not strictly
necessary. So feel free to ignore my previous review comment.
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Wed, Jan 18, 2023 at 6:06 PM Peter Smith <smithpb2250@gmail.com> wrote:
Here are my review comments for the latest patch v16-0001. (excluding
the test code)
And here are some review comments for the v16-0001 test code.
======
src/test/regress/sql/subscription.sql
1. General
For all comments
"time delayed replication" -> "time-delayed replication" maybe is better?
~~~
2.
-- fail - utilizing streaming = parallel with time delayed replication
is not supported.
For readability please put a blank line before this test.
~~~
3.
-- success -- value without unit is taken as milliseconds
"value" -> "min_apply_delay value"
~~~
4.
-- success -- interval is converted into ms and stored as integer
"interval" -> "min_apply_delay interval"
"integer" -> "an integer"
~~~
5.
You could also add another test where min_apply_delay is 0
Then the following combination can be confirmed OK -- success create
subscription with (streaming=parallel, min_apply_delay=0)
~~
6.
-- fail - alter subscription with min_apply_delay should fail when
streaming = parallel is set.
CREATE SUBSCRIPTION regress_testsub CONNECTION
'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect =
false, streaming = parallel);
There is another way to do this test without creating a brand-new
subscription. You could just alter the existing subscription like:
ALTER ... SET (min_apply_delay = 0)
then ALTER ... SET (streaming = parallel)
then ALTER ... SET (min_apply_delay = 123)
======
src/test/subscription/t/032_apply_delay.pl
7. sub check_apply_delay_log
my ($node_subscriber, $message, $expected) = @_;
Why pass in the message text? It is always the same so can be hardwired
in this function, right?
~~~
8.
# Get the delay time in the server log
"in the server log" -> "from the server log" (?)
~~~
9.
qr/$message: (\d+) ms/
or die "could not get delayed time";
my $logged_delay = $1;
# Is it larger than expected?
cmp_ok($logged_delay, '>', $expected,
"The wait time of the apply worker is long enough expectedly"
);
9a.
"could not get delayed time" -> "could not get the apply worker wait time"
9b.
"The wait time of the apply worker is long enough expectedly" -> "The
apply worker wait time has expected duration"
~~~
10.
sub check_apply_delay_time
Maybe a brief explanatory comment for this function is needed to
explain the unreplicated column c.
~~~
11.
$node_subscriber->safe_psql('postgres',
"CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr
application_name=$appname' PUBLICATION tap_pub WITH (streaming = on,
min_apply_delay = '3s')"
I think there should be a comment here highlighting that you are
setting up a subscriber time delay of 3 seconds, and then later you
can better describe the parameters for the checking functions...
e.g. (add this comment)
# verifies that the subscriber lags the publisher by at least 3 seconds
check_apply_delay_time($node_publisher, $node_subscriber, '5', '3');
e.g.
# verifies that the subscriber lags the publisher by at least 3 seconds
check_apply_delay_time($node_publisher, $node_subscriber, '8', '3');
~~~
12.
# Test whether ALTER SUBSCRIPTION changes the delayed time of the apply worker
# (1 day 1 minute).
$node_subscriber->safe_psql('postgres',
"ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86460000)"
);
Update the comment with another note.
# Note - The extra 1 min is to account for any decoding/network overhead.
~~~
13.
# Make sure we have long enough min_apply_delay after the ALTER command
check_apply_delay_log($node_subscriber, "logical replication apply
delay", "80000000");
IMO the expectation of 1 day (86400000 ms) wait time might be a better
number for your "expected" value.
So update the comment/call like this:
# Make sure the apply worker knows to wait for more than 1 day (86400000 ms)
check_apply_delay_log($node_subscriber, "logical replication apply
delay", "86400000");
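The millisecond figures in #12 and #13 are easy to mix up, so the arithmetic is spelled out below. interval_to_ms is a hypothetical helper written only for this illustration; it is not how the patch converts interval values.

```c
#include <stdint.h>

#define MS_PER_SEC   INT64_C(1000)
#define MS_PER_MIN   (60 * MS_PER_SEC)
#define MS_PER_HOUR  (60 * MS_PER_MIN)
#define MS_PER_DAY   (24 * MS_PER_HOUR)

/* Hypothetical helper: converts a simple day/hour/min/sec span to ms. */
int64_t
interval_to_ms(int64_t days, int64_t hours, int64_t minutes, int64_t seconds)
{
    return days * MS_PER_DAY +
           hours * MS_PER_HOUR +
           minutes * MS_PER_MIN +
           seconds * MS_PER_SEC;
}
```

One day is 86400000 ms and one day plus one minute is 86460000 ms, so a logged wait just under the configured 86460000 ms still exceeds an expected value of 86400000 as long as less than a minute has elapsed.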
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Thursday, January 19, 2023 10:49 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Wed, Jan 18, 2023 at 6:06 PM Peter Smith <smithpb2250@gmail.com>
wrote:
Here are my review comments for the latest patch v16-0001. (excluding
the test code)
And here are some review comments for the v16-0001 test code.
Hi, thanks for your review !
======
src/test/regress/sql/subscription.sql
1. General
For all comments
"time delayed replication" -> "time-delayed replication" maybe is better?
Fixed.
~~~
2.
-- fail - utilizing streaming = parallel with time delayed replication is not
supported.
For readability please put a blank line before this test.
Fixed.
~~~
3.
-- success -- value without unit is taken as milliseconds
"value" -> "min_apply_delay value"
Fixed.
~~~
4.
-- success -- interval is converted into ms and stored as integer
"interval" -> "min_apply_delay interval"
"integer" -> "an integer"
Both are fixed.
~~~
5.
You could also add another test where min_apply_delay is 0
Then the following combination can be confirmed OK -- success create
subscription with (streaming=parallel, min_apply_delay=0)
This combination is added with the modification for #6.
~~
6.
-- fail - alter subscription with min_apply_delay should fail when streaming =
parallel is set.
CREATE SUBSCRIPTION regress_testsub CONNECTION
'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false,
streaming = parallel);
There is another way to do this test without creating a brand-new subscription.
You could just alter the existing subscription like:
ALTER ... SET (min_apply_delay = 0)
then ALTER ... SET (streaming = parallel)
then ALTER ... SET (min_apply_delay = 123)
Fixed.
======
src/test/subscription/t/032_apply_delay.pl
7. sub check_apply_delay_log
my ($node_subscriber, $message, $expected) = @_;
Why pass in the message text? It is always the same so can be hardwired in this
function, right?
Fixed.
~~~
8.
# Get the delay time in the server log
"in the server log" -> "from the server log" (?)
Fixed.
~~~
9.
qr/$message: (\d+) ms/
or die "could not get delayed time";
my $logged_delay = $1;
# Is it larger than expected?
cmp_ok($logged_delay, '>', $expected,
"The wait time of the apply worker is long enough expectedly"
);
9a.
"could not get delayed time" -> "could not get the apply worker wait time"
9b.
"The wait time of the apply worker is long enough expectedly" -> "The apply
worker wait time has expected duration"
Both are fixed.
~~~
10.
sub check_apply_delay_time
Maybe a brief explanatory comment for this function is needed to explain the
unreplicated column c.
Added.
~~~
11.
$node_subscriber->safe_psql('postgres',
"CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr
application_name=$appname' PUBLICATION tap_pub WITH (streaming = on,
min_apply_delay = '3s')"
I think there should be a comment here highlighting that you are setting up a
subscriber time delay of 3 seconds, and then later you can better describe the
parameters for the checking functions...
Added a comment for CREATE SUBSCRIPTION command.
e.g. (add this comment)
# verifies that the subscriber lags the publisher by at least 3 seconds
check_apply_delay_time($node_publisher, $node_subscriber, '5', '3');
e.g.
# verifies that the subscriber lags the publisher by at least 3 seconds
check_apply_delay_time($node_publisher, $node_subscriber, '8', '3');
Added.
~~~
12.
# Test whether ALTER SUBSCRIPTION changes the delayed time of the apply worker
# (1 day 1 minute).
$node_subscriber->safe_psql('postgres',
"ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86460000)"
);
Update the comment with another note.
# Note - The extra 1 min is to account for any decoding/network overhead.
Okay, added the comment. In general, TAP tests fail if we wait for more
than 3 minutes, so I think an extra margin larger than 3 minutes is safe.
For example, if more than 1 minute were consumed between this
ALTER SUBSCRIPTION SET and the check_apply_delay_log() call below
(which should not usually happen), the test would fail.
So I made the extra time bigger.
~~~
13.
# Make sure we have long enough min_apply_delay after the ALTER command
check_apply_delay_log($node_subscriber, "logical replication apply delay",
"80000000");
IMO the expectation of 1 day (86400000 ms) wait time might be a better number
for your "expected" value.
So update the comment/call like this:
# Make sure the apply worker knows to wait for more than 1 day (86400000 ms)
check_apply_delay_log($node_subscriber, "logical replication apply delay",
"86400000");
Updated the comment and the function call.
Kindly have a look at the updated patch v17.
Best Regards,
Takamichi Osumi
Attachments:
v17-0001-Time-delayed-logical-replication-subscriber.patch
From 71c11435538e5cf6fab7e507f85d91de602907a1 Mon Sep 17 00:00:00 2001
From: Takamichi Osumi <osumi.takamichi@fujitsu.com>
Date: Thu, 19 Jan 2023 06:09:50 +0000
Subject: [PATCH v17] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_apply_delay'.
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction commit for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs only on WAL records for transaction begins. The main
reason is to avoid keeping a transaction open for a long time. Regular
and prepared transactions are covered. Streamed transactions are also
covered.
Prohibit the combination of this feature and parallel streaming mode.
Author: Euler Taveira
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/config.sgml | 13 ++
doc/src/sgml/logical-replication.sgml | 7 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 56 ++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 92 ++++++++-
.../replication/logical/applyparallelworker.c | 3 +-
src/backend/replication/logical/worker.c | 161 +++++++++++++--
src/backend/utils/adt/timestamp.c | 29 +++
src/bin/pg_dump/pg_dump.c | 15 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/datatype/timestamp.h | 2 +
src/include/replication/worker_internal.h | 2 +-
src/include/utils/timestamp.h | 2 +
src/test/regress/expected/subscription.out | 181 +++++++++-------
src/test/regress/sql/subscription.sql | 24 +++
src/test/subscription/meson.build | 1 +
src/test/subscription/t/032_apply_delay.pl | 195 ++++++++++++++++++
23 files changed, 720 insertions(+), 102 deletions(-)
create mode 100644 src/test/subscription/t/032_apply_delay.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..bf3c05241c 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7873,6 +7873,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int8</type>
+ </para>
+ <para>
+ The length of time (ms) to delay the application of changes.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 89d53f2a64..13dd422c25 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4753,6 +4753,19 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
the <filename>postgresql.conf</filename> file or on the server
command line.
</para>
+ <para>
+ For time-delayed logical replication (i.e. when the subscription is
+ created with parameter min_apply_delay > 0), the apply worker sends a
+ Standby Status Update message to the publisher with a period of
+ <literal>wal_receiver_status_interval</literal>. Make sure to set
+ <literal>wal_receiver_status_interval</literal> less than the
+ <literal>wal_sender_timeout</literal> on the publisher; otherwise, the
+ walsender will repeatedly terminate due to timeout errors. If
+ <literal>wal_receiver_status_interval</literal> is set to zero, the apply
+ worker doesn't send any feedback messages during the subscriber's
+ <literal>min_apply_delay</literal> period. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index f4b4e641be..9bfbb3b61d 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,13 @@
target table.
</para>
+ <para>
+ A subscription can be instructed to lag behind changes made on the
+ publisher by specifying the <literal>min_apply_delay</literal>
+ subscription parameter. See <xref linkend="sql-createsubscription"/> for
+ details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index ad93553a1d..1c6e9dd2d1 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index eba72c6af6..ac0d477974 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,44 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. This
+ parameter allows the user to delay the application of changes by a
+ specified amount of time. If the value is specified without units, it
+ is taken as milliseconds. The default is zero (no delay).
+ </para>
+ <para>
+ Any delay occurs only on WAL records for transaction begins after all
+ initial table synchronization has finished. The delay is calculated
+ between the WAL timestamp as written on the publisher and the current
+ time on the subscriber. Any overhead of time spent in logical decoding
+ and in transferring the transaction may reduce the actual wait time.
+ It is also possible that the overhead already exceeds the requested
+ <literal>min_apply_delay</literal> value, in which case no additional
+ wait is necessary. If the system clocks on publisher and subscriber
+ are not synchronized, changes may be applied earlier than
+ expected, but this is not a major issue because this parameter is
+ typically much larger than the time deviations between servers. Note
+ that if this parameter is set to a long delay, the replication will
+ stop if the replication slot falls behind the current LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication can mean there is a much longer time between making
+ a change on the publisher and that change being committed on the subscriber.
+ This can have a big impact on synchronous replication.
+ See <xref linkend="guc-synchronous-commit"/>.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -413,6 +450,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_apply_delay</literal> parameter is not allowed when streaming
+ in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
@@ -472,6 +514,18 @@ CREATE SUBSCRIPTION mysub
PUBLICATION insert_only
WITH (enabled = false);
</programlisting></para>
+
+ <para>
+ Create a subscription to a remote server that replicates tables in
+ the <literal>mypub</literal> publication and starts replicating immediately
+ on commit. Pre-existing data is not copied. The application of changes is
+ delayed by 4 hours.
+<programlisting>
+CREATE SUBSCRIPTION mysub
+ CONNECTION 'host=192.0.2.4 port=5432 user=foo dbname=foodb'
+ PUBLICATION mypub
+ WITH (copy_data = false, min_apply_delay = '4h');
+</programlisting></para>
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..c767cc1c3a 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -64,6 +64,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->oid = subid;
sub->dbid = subform->subdbid;
sub->skiplsn = subform->subskiplsn;
+ sub->minapplydelay = subform->subminapplydelay;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
sub->enabled = subform->subenabled;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 8608e3fa5b..317c2010cb 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1299,9 +1299,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index baff00dd74..118842f9ff 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,43 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ char *val,
+ *tmp;
+ Interval *interval;
+ int64 ms;
+
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ tmp = defGetString(defel);
+
+ /*
+ * If no unit was specified, then explicitly add 'ms' otherwise
+ * the interval_in function would assume 'seconds'.
+ */
+ if (strspn(tmp, "-0123456789 ") == strlen(tmp))
+ val = psprintf("%sms", tmp);
+ else
+ val = tmp;
+
+ interval = DatumGetIntervalP(DirectFunctionCall3(interval_in,
+ CStringGetDatum(val),
+ ObjectIdGetDatum(InvalidOid),
+ Int32GetDatum(-1)));
+
+ ms = interval2ms(interval);
+ if (ms < 0 || ms > INT_MAX)
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("%lld ms is outside the valid range for parameter \"%s\" (0 .. %d)",
+ (long long) ms, "min_apply_delay", INT_MAX));
+
+ opts->min_apply_delay = ms;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +445,20 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0)
+ {
+ if (opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
+ }
}
/*
@@ -560,7 +615,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -625,6 +681,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int64GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1054,7 +1111,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1155,18 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
+ if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay > 0) ||
+ (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s mode for subscription with %s",
+ "streaming = parallel", "min_apply_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1180,25 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed.
+ */
+ if (opts.min_apply_delay > 0)
+ if ((IsSet(opts.specified_opts, SUBOPT_STREAMING) && opts.streaming == LOGICALREP_STREAM_PARALLEL) ||
+ (!IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s for subscription in %s mode",
+ "min_apply_delay", "streaming = parallel"));
+
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int64GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
diff --git a/src/backend/replication/logical/applyparallelworker.c b/src/backend/replication/logical/applyparallelworker.c
index 3579e704fe..7302bce7a0 100644
--- a/src/backend/replication/logical/applyparallelworker.c
+++ b/src/backend/replication/logical/applyparallelworker.c
@@ -704,7 +704,8 @@ pa_process_spooled_messages_if_required(void)
{
apply_spooled_messages(&MyParallelShared->fileset,
MyParallelShared->xid,
- InvalidXLogRecPtr);
+ InvalidXLogRecPtr,
+ 0);
pa_set_fileset_state(MyParallelShared, FS_EMPTY);
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index a0084c7ef6..7512ae5b7d 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -318,6 +318,17 @@ static List *on_commit_wakeup_workers_subids = NIL;
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
+/*
+ * In order to avoid walsender timeout for time-delayed replication the worker
+ * process keeps sending feedback messages during the delay period.
+ * Meanwhile, the feature delays the apply before starting the
+ * transaction and thus we don't write WALs for the suspended changes during
+ * the wait. When the worker process sends a feedback message
+ * during the delay, the flushed and applied LSN positions must not be
+ * overwritten by the latest received LSN. See send_feedback() for details.
+ */
+static XLogRecPtr last_received = InvalidXLogRecPtr;
+
/* fields valid only when processing streamed transaction */
static bool in_streamed_transaction = false;
@@ -388,10 +399,13 @@ static void stream_write_change(char action, StringInfo s);
static void stream_open_and_write_change(TransactionId xid, char action, StringInfo s);
static void stream_close_file(void);
-static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply);
+static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply,
+ bool in_delayed_apply);
static void DisableSubscriptionAndExit(void);
+static void maybe_delay_apply(TimestampTz finish_ts);
+
static void apply_handle_commit_internal(LogicalRepCommitData *commit_data);
static void apply_handle_insert_internal(ApplyExecutionData *edata,
ResultRelInfo *relinfo,
@@ -998,6 +1012,105 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When min_apply_delay parameter is set on the subscriber, we wait long enough
+ * to make sure a transaction is applied at least that interval behind the
+ * publisher.
+ *
+ * While the physical replication applies the delay at commit time, this
+ * feature applies the delay for the next transaction but before starting the
+ * transaction. This is mainly because keeping a transaction that conducted
+ * write operations open for a long time results in some issues such as bloat
+ * and locks.
+ *
+ * The min_apply_delay parameter will take effect only after all tables are in
+ * READY state.
+ *
+ * finish_ts is the commit/prepare time of both regular (non-streamed) and
+ * streamed transactions. Unlike the regular (non-streamed) cases, the delay
+ * is applied in a STREAM COMMIT/STREAM PREPARE message for streamed
+ * transactions. The STREAM START message does not contain a commit/prepare
+ * time (it will be available when the in-progress transaction finishes).
+ * Hence, it's not appropriate to apply a delay at that time.
+ */
+static void
+maybe_delay_apply(TimestampTz finish_ts)
+{
+ Assert(finish_ts > 0);
+
+ /* Nothing to do if no delay set */
+ if (!MySubscription->minapplydelay)
+ return;
+
+ /*
+ * The min_apply_delay parameter is ignored until all tablesync workers
+ * have reached READY state. This is because if we allowed the delay
+ * during the catchup phase, then once we reached the limit of tablesync
+ * workers it would impose a delay for each subsequent worker. That would
+ * cause initial table synchronization completion to take a long time.
+ */
+ if (!AllTablesyncsReady())
+ return;
+
+ /* Apply the delay by the latch mechanism */
+ while (true)
+ {
+ long diffms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* This might change wal_receiver_status_interval */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /*
+ * Before calculating the time duration, reload the catalog if needed.
+ */
+ if (!in_remote_transaction && !in_streamed_transaction)
+ {
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+ }
+
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(),
+ TimestampTzPlusMilliseconds(finish_ts, MySubscription->minapplydelay));
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "logical replication apply delay: %ld ms", diffms);
+
+ /*
+ * Call send_feedback() to prevent the publisher from exiting by
+ * timeout during the delay, when wal_receiver_status_interval is
+ * available.
+ */
+ if (wal_receiver_status_interval > 0
+ && diffms > wal_receiver_status_interval * 1000)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ (long) wal_receiver_status_interval * 1000,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(last_received, true, false, true);
+ }
+ else
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ }
+}
+
/*
* Handle BEGIN message.
*/
@@ -1012,6 +1125,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ maybe_delay_apply(begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -1069,6 +1185,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ maybe_delay_apply(begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1316,7 +1435,8 @@ apply_handle_stream_prepare(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset,
- prepare_data.xid, prepare_data.prepare_lsn);
+ prepare_data.xid, prepare_data.prepare_lsn,
+ prepare_data.prepare_time);
/* Mark the transaction as prepared. */
apply_handle_prepare_internal(&prepare_data);
@@ -2010,10 +2130,13 @@ ensure_last_message(FileSet *stream_fileset, TransactionId xid, int fileno,
/*
* Common spoolfile processing.
+ *
+ * The commit/prepare time for streaming transaction is required to achieve
+ * time-delayed replication.
*/
void
apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn)
+ XLogRecPtr lsn, TimestampTz finish_ts)
{
StringInfoData s2;
int nchanges;
@@ -2024,6 +2147,10 @@ apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
int fileno;
off_t offset;
+ /* Should we delay the current transaction? */
+ if (finish_ts)
+ maybe_delay_apply(finish_ts);
+
if (!am_parallel_apply_worker())
maybe_start_skipping_changes(lsn);
@@ -2173,7 +2300,7 @@ apply_handle_stream_commit(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset, xid,
- commit_data.commit_lsn);
+ commit_data.commit_lsn, commit_data.committime);
apply_handle_commit_internal(&commit_data);
@@ -3446,7 +3573,7 @@ UpdateWorkerStats(XLogRecPtr last_lsn, TimestampTz send_time, bool reply)
* Apply main loop.
*/
static void
-LogicalRepApplyLoop(XLogRecPtr last_received)
+LogicalRepApplyLoop(void)
{
TimestampTz last_recv_timestamp = GetCurrentTimestamp();
bool ping_sent = false;
@@ -3567,7 +3694,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
if (last_received < end_lsn)
last_received = end_lsn;
- send_feedback(last_received, reply_requested, false);
+ send_feedback(last_received, reply_requested, false, false);
UpdateWorkerStats(last_received, timestamp, true);
}
/* other message types are purposefully ignored */
@@ -3580,7 +3707,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
/* confirm all writes so far */
- send_feedback(last_received, false, false);
+ send_feedback(last_received, false, false, false);
if (!in_remote_transaction && !in_streamed_transaction)
{
@@ -3677,7 +3804,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
}
- send_feedback(last_received, requestReply, requestReply);
+ send_feedback(last_received, requestReply, requestReply, false);
/*
* Force reporting to ensure long idle periods don't lead to
@@ -3707,7 +3834,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
* to send a response to avoid timeouts.
*/
static void
-send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
+send_feedback(XLogRecPtr recvpos, bool force, bool requestReply, bool in_delayed_apply)
{
static StringInfo reply_message = NULL;
static TimestampTz send_time = 0;
@@ -3737,8 +3864,15 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * If the subscriber side apply is delayed (because of time-delayed
+ * replication) then do not tell the publisher that the latest received
+ * LSN is already applied and flushed; otherwise, the publisher would
+ * make a wrong assumption about logical replication progress.
+ * Instead, we just send a feedback message to avoid a publisher
+ * timeout during the delay.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && !in_delayed_apply)
flushpos = writepos = recvpos;
if (writepos < last_writepos)
@@ -4354,11 +4488,11 @@ start_table_sync(XLogRecPtr *origin_startpos, char **myslotname)
* of system resource error and are not repeatable.
*/
static void
-start_apply(XLogRecPtr origin_startpos)
+start_apply(void)
{
PG_TRY();
{
- LogicalRepApplyLoop(origin_startpos);
+ LogicalRepApplyLoop();
}
PG_CATCH();
{
@@ -4645,7 +4779,8 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- start_apply(origin_startpos);
+ last_received = origin_startpos;
+ start_apply();
proc_exit(0);
}
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index 928c330897..422e6ad0fa 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -2431,6 +2431,35 @@ interval_cmp_internal(const Interval *interval1, const Interval *interval2)
return int128_compare(span1, span2);
}
+/*
+ * Returns the number of milliseconds in the specified Interval.
+ */
+int64
+interval2ms(const Interval *interval)
+{
+ int64 days;
+ int64 ms;
+ int64 result;
+
+ days = interval->month * INT64CONST(30);
+ days += interval->day;
+
+ /* Detect whether the value of interval can cause an overflow */
+ if (pg_mul_s64_overflow(days, MSECS_PER_DAY, &result))
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range"));
+
+ /* Add the time portion (converted from microseconds to ms) to the previous result */
+ ms = interval->time / INT64CONST(1000);
+ if (pg_add_s64_overflow(result, ms, &result))
+ ereport(ERROR,
+ errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("bigint out of range"));
+
+ return result;
+}
+
Datum
interval_eq(PG_FUNCTION_ARGS)
{
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..c0f69cb43b 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4546,9 +4547,13 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n"
+ " 0 AS subminapplydelay\n",
+ LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4581,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4612,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminapplydelay =
+ strtoi64(PQgetvalue(res, i, i_subminapplydelay), NULL, 10);
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4687,6 +4695,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '" INT64_FORMAT " ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e7cbd8d7ed..e2525f70ab 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int64 subminapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..8a27063bed 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min apply delay (ms)"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e1882eaea..e8b9a43a47 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..e06f35c037 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -70,6 +70,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
XLogRecPtr subskiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 subminapplydelay; /* Replication apply delay (ms) */
+
NameData subname; /* Name of the subscription */
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
@@ -120,6 +122,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 minapplydelay; /* Replication apply delay (ms) */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
bool enabled; /* Indicates if the subscription is enabled */
diff --git a/src/include/datatype/timestamp.h b/src/include/datatype/timestamp.h
index 21a37e21e9..8b368af299 100644
--- a/src/include/datatype/timestamp.h
+++ b/src/include/datatype/timestamp.h
@@ -127,6 +127,8 @@ struct pg_itm_in
#define SECS_PER_MINUTE 60
#define MINS_PER_HOUR 60
+#define MSECS_PER_DAY INT64CONST(86400000)
+
#define USECS_PER_DAY INT64CONST(86400000000)
#define USECS_PER_HOUR INT64CONST(3600000000)
#define USECS_PER_MINUTE INT64CONST(60000000)
diff --git a/src/include/replication/worker_internal.h b/src/include/replication/worker_internal.h
index dc87a4edd1..3dc09d1a4c 100644
--- a/src/include/replication/worker_internal.h
+++ b/src/include/replication/worker_internal.h
@@ -255,7 +255,7 @@ extern void stream_stop_internal(TransactionId xid);
/* Common streaming function to apply all the spooled messages */
extern void apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn);
+ XLogRecPtr lsn, TimestampTz finish_ts);
extern void apply_dispatch(StringInfo s);
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index 42f802bb9d..534051fe13 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -102,6 +102,8 @@ extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern int64 interval2ms(const Interval *interval);
+
extern TimestampTz time_t_to_timestamptz(pg_time_t tm);
extern pg_time_t timestamptz_to_time_t(TimestampTz t);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 4e5cb0d3a9..eb25c286a2 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -135,10 +135,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -155,10 +155,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -167,10 +167,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -202,10 +202,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -239,19 +239,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -263,27 +263,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -298,10 +298,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -316,10 +316,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -355,10 +355,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -367,10 +367,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -380,10 +380,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -396,20 +396,57 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid input syntax for type interval: "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_apply_delay" (0 .. 2147483647)
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported.
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = '1 day');
+ERROR: min_apply_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- success -- min_apply_delay interval is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '4h 27min 35s');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 16055000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set.
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot enable streaming = parallel mode for subscription with min_apply_delay
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set.
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ERROR: cannot enable min_apply_delay for subscription in streaming = parallel mode
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 5f27b7d776..d4c2a1987e 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -279,6 +279,30 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported.
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = '1 day');
+
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+\dRs+
+
+-- success -- min_apply_delay interval is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '4h 27min 35s');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set.
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set.
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/meson.build b/src/test/subscription/meson.build
index 3db0fdfd96..a186876eb4 100644
--- a/src/test/subscription/meson.build
+++ b/src/test/subscription/meson.build
@@ -38,6 +38,7 @@ tests += {
't/029_on_error.pl',
't/030_origin.pl',
't/031_column_list.pl',
+ 't/032_apply_delay.pl',
't/100_bugs.pl',
],
},
diff --git a/src/test/subscription/t/032_apply_delay.pl b/src/test/subscription/t/032_apply_delay.pl
new file mode 100644
index 0000000000..7a84a3f0e2
--- /dev/null
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,195 @@
+
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test replication apply delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $log_location = 0;
+
+# Confirm from the server log that time-delayed replication took effect, using
+# the message the apply worker emits when it delays applying. When $expected is
+# given, also verify that the logged delay is larger than that value, in order
+# to check that an update of min_apply_delay has taken effect.
+sub check_apply_delay_log
+{
+ my ($node_subscriber, $expected) = @_;
+ $expected = 0 unless defined $expected;
+
+ my $old_log_location = $log_location;
+
+ $log_location = $node_subscriber->wait_for_log(qr/logical replication apply delay/, $log_location);
+
+ cmp_ok($log_location, '>', $old_log_location,
+ "logfile contains triggered logical replication apply delay"
+ );
+
+ if ($expected > 0)
+ {
+ # Get the delay time from the server log
+ my $contents = slurp_file($node_subscriber->logfile, $old_log_location);
+ $contents =~
+ qr/logical replication apply delay: (\d+) ms/
+ or die "could not get the apply worker wait time";
+ my $logged_delay = $1;
+
+ # Is it larger than expected?
+ cmp_ok($logged_delay, '>', $expected,
+ "The apply worker wait time has expected duration"
+ );
+ }
+}
+
+# Compare the insertion time on the publisher with the apply time on the
+# subscriber to confirm that the change was applied after the expected delay.
+sub check_apply_delay_time
+{
+ my ($node_publisher, $node_subscriber, $primary_key, $expected_diffs) = @_;
+
+ my $inserted_time_on_pub = $node_publisher->safe_psql('postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ my $inserted_time_on_sub = $node_subscriber->safe_psql('postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ cmp_ok($inserted_time_on_sub - $inserted_time_on_pub, '>', $expected_diffs,
+ "The tuple on the subscriber was modified later than the publisher");
+}
+
+# Create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf('postgresql.conf',
+ "log_min_messages = debug2");
+$node_subscriber->start;
+
+# Create some preexisting content on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b varchar, c timestamptz DEFAULT now())");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (1, 'foo'), (2, 'bar')");
+
+# Setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b text, c timestamptz DEFAULT now(), d bigint DEFAULT 999)"
+);
+
+# Setup logical replication
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+
+# The column c must not be published because we want to compare the time
+# difference.
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE test_tab (a, b)");
+
+my $appname = 'tap_sub';
+
+# Create a subscription that applies the transaction after a 3-second delay
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = on, min_apply_delay = '3s')"
+);
+
+# Wait for initial table sync to finish
+$node_subscriber->wait_for_subscription_sync($node_publisher, $appname);
+
+# Check log starting now for logical replication apply delay
+$log_location = -s $node_subscriber->logfile;
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(2|1|2), 'check initial data was copied to subscriber');
+
+# New row to trigger apply delay
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (3, 'baz')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (4, 'abc')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (5, 'def')");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(5|1|5), 'check if the new rows were applied to subscriber');
+
+# Verify that the apply worker emits the apply delay log
+check_apply_delay_log($node_subscriber);
+
+# Verify that the subscriber lags the publisher by at least 3 seconds
+check_apply_delay_time($node_publisher, $node_subscriber, '5', '3');
+
+# Reduce the amount of writes to the spool file
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_mode = immediate');
+$node_publisher->reload;
+
+# Run a query to make sure that the reload has taken effect.
+$node_publisher->safe_psql('postgres', q{SELECT 1});
+
+# Test streamed transaction by insert, update and delete
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(6, 8) s(i);");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(8|1|8), 'check if the new rows were applied to subscriber');
+
+# Verify that the apply worker emits the apply delay log
+check_apply_delay_log($node_subscriber);
+
+# Verify that the subscriber lags the publisher by at least 3 seconds
+check_apply_delay_time($node_publisher, $node_subscriber, '8', '3');
+
+# Test whether ALTER SUBSCRIPTION changes the delayed time of the apply worker
+# (1 day 5 minutes). Note that the extra 5 minutes are to account for any
+# decoding/network overhead.
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86700000)"
+);
+
+# New row to trigger apply delay
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (0, 'foobar')");
+
+# Make sure the apply worker knows to wait for more than 1 day
+check_apply_delay_log($node_subscriber, "86400000");
+
+# Disable subscription and the worker should die immediately
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub DISABLE;"
+);
+
+# Wait until worker dies
+my $sub_query =
+ "SELECT count(1) = 0 FROM pg_stat_subscription WHERE subname = 'tap_sub' AND pid IS NOT NULL;";
+$node_subscriber->poll_query_until('postgres', $sub_query)
+ or die "Timed out while waiting for subscriber to die";
+
+# Confirm that, as expected, the suspended record is not applied after the
+# ALTER SUBSCRIPTION ... DISABLE command.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab WHERE a = 0;");
+is($result, qq(0), "check that the delayed transaction is not applied, as expected");
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
--
2.30.0
On Wednesday, January 18, 2023 4:06 PM Peter Smith <smithpb2250@gmail.com> wrote:
Here are my review comments for the latest patch v16-0001. (excluding the
test code)
Hi, thank you for your review !
======
General
1.
Since the value of min_apply_delay cannot be < 0, I was thinking probably it
should have been declared everywhere in this patch as a
uint64 instead of an int64, right?
No, we can't adopt this idea.
It seems that unsigned integer types cannot be used for catalog columns.
So we can't apply it to the pg_subscription.h definitions,
nor, similarly, to the variables stored in the catalog via Int64GetDatum
and the argument variable of Int64GetDatum.
In addition, a value of type Interval can be negative,
so we can't change the type of the int64 variable that receives
the return value of interval2ms().
======
Commit message
2.
If the subscription sets min_apply_delay parameter, the logical replication
worker will delay the transaction commit for min_apply_delay milliseconds.
~
IMO there should be another sentence before this just to say that a new
parameter is being added:
e.g.
This patch implements a new subscription parameter called
'min_apply_delay'.
Added.
======
doc/src/sgml/config.sgml
3.
+ <para>
+ For time-delayed logical replication, the apply worker sends a Standby
+ Status Update message to the corresponding publisher per the indicated
+ time of this parameter. Therefore, if this parameter is longer than
+ <literal>wal_sender_timeout</literal> on the publisher, then the
+ walsender doesn't get any update message during the delay and repeatedly
+ terminates due to the timeout errors. Hence, make sure this parameter is
+ shorter than the <literal>wal_sender_timeout</literal> of the publisher.
+ If this parameter is set to zero with time-delayed replication, the
+ apply worker doesn't send any feedback messages during the
+ <literal>min_apply_delay</literal>.
+ </para>
This paragraph seemed confusing. I think it needs to be reworded to change all
of the "this parameter" references because there are at least 3 different
parameters mentioned in this paragraph. e.g. maybe just change them to
explicitly name the parameter you are talking about.
I also think it needs to mention the ‘min_apply_delay’ subscription parameter
up-front and then refer to it appropriately.
The end result might be something like I wrote below (this is just my guess -
probably you can word it better).
SUGGESTION
For time-delayed logical replication (i.e. when the subscription is created with
parameter min_apply_delay > 0), the apply worker sends a Standby Status
Update message to the publisher with a period of wal_receiver_status_interval.
Make sure to set wal_receiver_status_interval less than the
wal_sender_timeout on the publisher, otherwise, the walsender will repeatedly
terminate due to the timeout errors. If wal_receiver_status_interval is set to zero,
the apply worker doesn't send any feedback messages during the subscriber’s
min_apply_delay period.
Applied. Also, I added one reference for min_apply_delay parameter
at the end of this description.
======
doc/src/sgml/ref/create_subscription.sgml
4.
+ <para>
+ By default, the subscriber applies changes as soon as possible. As
+ with the physical replication feature
+ (<xref linkend="guc-recovery-min-apply-delay"/>), it can be useful to
+ have a time-delayed logical replica. This parameter lets the user to
+ delay the application of changes by a specified amount of time. If this
+ value is specified without units, it is taken as milliseconds. The
+ default is zero(no delay).
+ </para>
4a.
As with the physical replication feature (recovery_min_apply_delay), it can be
useful to have a time-delayed logical replica.
IMO not sure that the above sentence is necessary. It seems only to be saying
that this parameter can be useful. Why do we need to say that?
Removed the sentence.
~
4b.
"This parameter lets the user to delay" -> "This parameter lets the user delay"
OR
"This parameter lets the user to delay" -> "This parameter allows the user to
delay"
Fixed.
~
4c.
"If this value is specified without units" -> "If the value is specified without
units"
Fixed.
~
4d.
"zero(no delay)." -> "zero (no delay)."
Fixed.
----
5.
+ <para>
+ The delay occurs only on WAL records for transaction begins and after
+ the initial table synchronization. It is possible that the
+ replication delay between publisher and subscriber exceeds the value
+ of this parameter, in which case no delay is added. Note that the
+ delay is calculated between the WAL time stamp as written on
+ publisher and the current time on the subscriber. Time spent in logical
+ decoding and in transferring the transaction may reduce the actual wait
+ time. If the system clocks on publisher and subscriber are not
+ synchronized, this may lead to apply changes earlier than expected,
+ but this is not a major issue because this parameter is typically much
+ larger than the time deviations between servers. Note that if this
+ parameter is set to a long delay, the replication will stop if the
+ replication slot falls behind the current LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
I think the first part can be reworded slightly. See what you think about the
suggestion below.
SUGGESTION
Any delay occurs only on WAL records for transaction begins after all initial
table synchronization has finished. The delay is calculated between the WAL
timestamp as written on the publisher and the current time on the subscriber.
Any overhead of time spent in logical decoding and in transferring the
transaction may reduce the actual wait time.
It is also possible that the overhead already exceeds the requested
'min_apply_delay' value, in which case no additional wait is necessary. If the
system clocks...
Addressed.
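The wait-time rule described in comment 5 (the delay is measured from the publisher-side timestamp to the subscriber's current time, and no extra wait is added once the overhead already exceeds min_apply_delay) can be sketched roughly as below. This is only an illustration under stated assumptions: the function name and the millisecond integer timestamps are hypothetical and are not taken from the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): how many milliseconds the apply
 * worker would still have to wait before applying a transaction.
 *
 * commit_ts_ms:       timestamp written on the publisher, in ms
 * now_ms:             current time on the subscriber, in ms
 * min_apply_delay_ms: configured delay in ms (0 means no delay)
 */
static int64_t
remaining_apply_delay_ms(int64_t commit_ts_ms, int64_t now_ms,
                         int64_t min_apply_delay_ms)
{
    int64_t delay_until;
    int64_t remaining;

    /* Nothing to wait for if no delay is configured. */
    if (min_apply_delay_ms <= 0)
        return 0;

    /* Earliest subscriber-side time at which the change may be applied. */
    delay_until = commit_ts_ms + min_apply_delay_ms;
    remaining = delay_until - now_ms;

    /*
     * Decoding and transfer overhead is already included in now_ms, so it
     * shortens the wait; if it exceeds the delay, no wait is added.
     */
    return remaining > 0 ? remaining : 0;
}
```

Because the two timestamps come from different machines, a subscriber clock that runs ahead of the publisher makes the computed wait shorter, which is the clock-skew caveat mentioned in the documentation text.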
----
6.
+ <para>
+ Setting streaming to <literal>parallel</literal> mode and <literal>min_apply_delay</literal>
+ simultaneously is not supported.
+ </para>
SUGGESTION
A non-zero min_apply_delay parameter is not allowed when streaming in
parallel mode.
Applied.
======
src/backend/commands/subscriptioncmds.c
7. parse_subscription_options
@@ -404,6 +445,17 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
 "slot_name = NONE", "create_slot = false")));
 }
 }
+
+ /* Test the combination of streaming mode and min_apply_delay */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0)
+ {
+ if (opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
+ }
SUGGESTION (comment)
The combination of parallel streaming mode and min_apply_delay is not
allowed.
Fixed.
~~~
8. AlterSubscription (general)
I observed during testing there are 3 different errors….
At subscription CREATE time you can get this error:
ERROR: min_apply_delay > 0 and streaming = parallel are mutually exclusive options
If you try to ALTER the min_apply_delay when already streaming = parallel you
can get this error:
ERROR: cannot enable min_apply_delay for subscription in streaming = parallel mode
If you try to ALTER the streaming to be parallel if there is already a
min_apply_delay > 0 then you can get this error:
ERROR: cannot enable streaming = parallel mode for subscription with min_apply_delay
Yes. This is because of the existing error message styles
in parse_subscription_options and AlterSubscription.
The former uses "mutually exclusive" messages consistently,
while the latter uses "cannot enable ..." ones.
~
IMO there is no need to have 3 different error message texts. I think all these
cases are explained by just the first text (ERROR:
min_apply_delay > 0 and streaming = parallel are mutually exclusive
options)
So, we followed those existing formats.
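All three error paths discussed in comment 8 guard the same condition: the effective min_apply_delay is greater than zero while the effective streaming mode is parallel. A minimal sketch of that shared predicate follows. The enum, the function name and the parameter layout are hypothetical stand-ins for illustration; the patch itself keeps separate checks in parse_subscription_options and AlterSubscription.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the subscription streaming modes. */
typedef enum
{
    STREAM_OFF,
    STREAM_ON,
    STREAM_PARALLEL
} StreamMode;

/*
 * Hypothetical predicate (not from the patch). For each option, use the
 * value given in the command if it was specified, otherwise fall back to
 * the value already stored for the subscription. For CREATE SUBSCRIPTION
 * the "current" values would simply be the defaults.
 */
static bool
delay_conflicts_with_streaming(bool delay_specified, int64_t new_delay_ms,
                               int64_t cur_delay_ms,
                               bool stream_specified, StreamMode new_stream,
                               StreamMode cur_stream)
{
    int64_t     delay = delay_specified ? new_delay_ms : cur_delay_ms;
    StreamMode  stream = stream_specified ? new_stream : cur_stream;

    return delay > 0 && stream == STREAM_PARALLEL;
}
```

Which error text to report for a conflict is a separate question; the thread keeps distinct messages so that each one matches the existing style of the function that raises it.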
~~~
9. AlterSubscription
@@ -1098,6 +1152,18 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
 if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
 {
+ /*
+ * Test the combination of streaming mode and
+ * min_apply_delay
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
+ if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay > 0) ||
+ (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s mode for subscription with %s",
+ "streaming = parallel", "min_apply_delay"));
+
9a.
SUGGESTION (comment)
The combination of parallel streaming mode and min_apply_delay is not
allowed.
Fixed.
~
9b.
(see AlterSubscription general review comment #8 above) Here you can use the
same comment error message that says min_apply_delay > 0 and streaming =
parallel are mutually exclusive options.
As described above, we followed the current style in the existing functions.
~~~
10. AlterSubscription
@@ -1111,6 +1177,25 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
 = true;
 }
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * Test the combination of streaming mode and
+ * min_apply_delay
+ */
+ if (opts.min_apply_delay > 0)
+ if ((IsSet(opts.specified_opts, SUBOPT_STREAMING) && opts.streaming == LOGICALREP_STREAM_PARALLEL) ||
+ (!IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s for subscription in %s mode",
+ "min_apply_delay", "streaming = parallel"));
+
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int64GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
10a.
SUGGESTION (comment)
The combination of parallel streaming mode and min_apply_delay is not
allowed.
Fixed.
~
10b.
(see AlterSubscription general review comment #8 above) Here you can use the
same comment error message that says min_apply_delay > 0 and streaming =
parallel are mutually exclusive options.
Same as 9b.
======
.../replication/logical/applyparallelworker.c
11.
@@ -704,7 +704,8 @@ pa_process_spooled_messages_if_required(void)
 {
 apply_spooled_messages(&MyParallelShared->fileset,
 MyParallelShared->xid,
- InvalidXLogRecPtr);
+ InvalidXLogRecPtr,
+ 0);
IMO this passing of 0 is a bit strange because it is currently acting like a dummy
value since the apply_spooled_messages will never make use of the 'finish_ts'
anyway (since this call is from a parallel apply worker).
I think a better way to code this might be to pass the 0 (same as you are doing
here) but inside the apply_spooled_messages change the code:
FROM
if (!am_parallel_apply_worker())
maybe_delay_apply(finish_ts);
TO
if (finish_ts)
maybe_delay_apply(finish_ts);
That does 2 things.
- It makes the passed-in 0 have some meaning
- It simplifies the apply_spooled_messages code
Adopted.
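The change adopted in comment 11 turns the zero timestamp into a sentinel meaning "the caller has no finish time, so do not delay". A small sketch of that pattern, assuming simplified stand-in types and function names (the `_sketch` names, the integer timestamp type and the call counter are hypothetical and exist only for illustration):

```c
#include <stdint.h>

/* Hypothetical stand-in for the timestamp type used by the apply worker. */
typedef int64_t TimestampSketch;

/* Counts how often the delay hook ran, so the behaviour can be observed. */
static int delay_calls = 0;

/* Stand-in for maybe_delay_apply(); the real function would wait here. */
static void
maybe_delay_apply_sketch(TimestampSketch finish_ts)
{
    (void) finish_ts;
    delay_calls++;
}

/* Stand-in for apply_spooled_messages(), reduced to the delay decision. */
static void
apply_spooled_messages_sketch(TimestampSketch finish_ts)
{
    /* A zero finish_ts means the caller does not want any delay. */
    if (finish_ts)
        maybe_delay_apply_sketch(finish_ts);

    /* ... replaying the spooled changes would follow here ... */
}
```

With this shape the callee no longer needs to know which kind of worker called it; a parallel apply worker simply passes 0.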
======
src/backend/replication/logical/worker.c
12.
@@ -318,6 +318,17 @@ static List *on_commit_wakeup_workers_subids = NIL;
 bool in_remote_transaction = false;
 static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
+/*
+ * In order to avoid walsender's timeout during time-delayed replication,
+ * it's necessary to keep sending feedback messages during the delay from the
+ * worker process. Meanwhile, the feature delays the apply before starting the
+ * transaction and thus we don't write WALs for the suspended changes during
+ * the wait. Hence, in the case the worker process sends a feedback message
+ * during the delay, we should not make positions of the flushed and apply LSN
+ * overwritten by the last received latest LSN. See send_feedback() for details.
+ */
+static XLogRecPtr last_received = InvalidXLogRecPtr;
12a.
Suggest a small change to the first sentence of the comment.
BEFORE
In order to avoid walsender's timeout during time-delayed replication, it's
necessary to keep sending feedback messages during the delay from the
worker process.
AFTER
In order to avoid walsender timeout for time-delayed replication the worker
process keeps sending feedback messages during the delay period.
Fixed.
~
12b.
"Hence, in the case" -> "When"
Fixed.
~~~
13. forward declare
-static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply);
+static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply,
+ bool in_delaying_apply);
Change the param name:
"in_delaying_apply" -> "in_delayed_apply" (??)
Changed. The original intention in adding the "in_"
prefix was to align the variable name with
other variables such as "in_remote_transaction" and
"in_streamed_transaction", which describe the current status
of the transaction. So, until a better name is proposed,
we can keep it.
~~~
14. maybe_delay_apply
+ /* Nothing to do if no delay set */
+ if (MySubscription->minapplydelay <= 0)
+ return;
IIUC min_apply_delay cannot be < 0 so this condition could simply be:
if (!MySubscription->minapplydelay)
return;
Fixed.
~~~
15. maybe_delay_apply
+ /*
+ * The min_apply_delay parameter is ignored until all tablesync workers
+ * have reached READY state. If we allow the delay during the catchup
+ * phase, once we reach the limit of tablesync workers, it will impose a
+ * delay for each subsequent worker. It means it will take a long time to
+ * finish the initial table synchronization.
+ */
+ if (!AllTablesyncsReady())
+ return;
SUGGESTION (slight rewording)
The min_apply_delay parameter is ignored until all tablesync workers have
reached READY state. This is because if we allowed the delay during the
catchup phase, then once we reached the limit of tablesync workers it would
impose a delay for each subsequent worker. That would cause initial table
synchronization completion to take a long time.
Fixed.
~~~
16. maybe_delay_apply
+ while (true)
+ {
+ long diffms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
IMO there should be some small explanatory comment here at the top of the
while loop.
Added.
~~~
17. apply_spooled_messages
@@ -2024,6 +2141,21 @@ apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
 int fileno;
 off_t offset;
+ /*
+ * Should we delay the current transaction?
+ *
+ * Unlike the regular (non-streamed) cases, the delay is applied in a
+ * STREAM COMMIT/STREAM PREPARE message for streamed transactions. The
+ * STREAM START message does not contain a commit/prepare time (it will be
+ * available when the in-progress transaction finishes). Hence, it's not
+ * appropriate to apply a delay at that time.
+ *
+ * It's not allowed to execute time-delayed replication with parallel
+ * apply feature.
+ */
+ if (!am_parallel_apply_worker())
+ maybe_delay_apply(finish_ts);
That whole comment part "Unlike the regular (non-streamed) cases"
seems misplaced here. Perhaps this part of the comment is better put into
the function header where the meaning of 'finish_ts' is explained?
Moved it to the header comment for maybe_delay_apply.
~~~
18. apply_spooled_messages
+ * It's not allowed to execute time-delayed replication with parallel
+ * apply feature.
+ */
+ if (!am_parallel_apply_worker())
+ maybe_delay_apply(finish_ts);
As was mentioned in comment #11 above this code could be changed like
if (finish_ts)
maybe_delay_apply(finish_ts);
then you don't even need to make mention of "parallel apply" at all here.
OTOH if you want to still have the parallel apply comment then maybe reword it
like this:
"It is not allowed to combine time-delayed replication with the parallel apply
feature."
Changed and now I don't mention the parallel apply feature.
~~~
19. apply_spooled_messages
If you chose not to do my suggestion from comment #11, then there are
2 identical conditions (!am_parallel_apply_worker()); In this case, I was
wondering if it would be better to refactor to use a single condition instead.
I applied #11 comment. Now, the conditions are not identical.
~~~
20. send_feedback
(same as comment #13)
Maybe change the new param name to "in_delayed_apply"?
Changed.
~~~
21.
@@ -3737,8 +3869,15 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
 /*
 * No outstanding transactions to flush, we can report the latest received
 * position. This is important for synchronous replication.
+ *
+ * During the delay of time-delayed replication, do not tell the publisher
+ * that the received latest LSN is already applied and flushed at this
+ * stage, since we don't apply the transaction yet. If we do so, it leads
+ * to a wrong assumption of logical replication progress on the publisher
+ * side. Here, we just send a feedback message to avoid publisher's
+ * timeout during the delay.
 */
Minor rewording of the comment
SUGGESTION
If the subscriber side apply is delayed (because of time-delayed
replication) then do not tell the publisher that the received latest LSN is already
applied and flushed, otherwise, it leads to the publisher side making a wrong
assumption of logical replication progress. Instead, we just send a feedback
message to avoid a publisher timeout during the delay.
Adopted.
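The rule in comment 21 can be pictured as follows: while the apply is being delayed, a feedback message still goes out (to keep the walsender from timing out), but the flushed and applied positions are not advanced to the latest received LSN. The sketch below is a simplified model under stated assumptions; the struct, the function name and the plain integer LSN type are hypothetical and do not mirror the real send_feedback() internals.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an LSN. */
typedef uint64_t LsnSketch;

/* Hypothetical model of the three positions reported to the publisher. */
typedef struct
{
    LsnSketch   received;   /* latest LSN received from the publisher */
    LsnSketch   flushed;    /* LSN reported as flushed */
    LsnSketch   applied;    /* LSN reported as applied */
} FeedbackSketch;

static FeedbackSketch
build_feedback_sketch(LsnSketch recvpos, LsnSketch last_flushed,
                      LsnSketch last_applied, bool have_pending_txes,
                      bool in_delayed_apply)
{
    FeedbackSketch fb = {recvpos, last_flushed, last_applied};

    /*
     * Only when nothing is outstanding AND we are not waiting in a delayed
     * apply may the received position be reported as flushed and applied.
     */
    if (!have_pending_txes && !in_delayed_apply)
    {
        fb.flushed = recvpos;
        fb.applied = recvpos;
    }

    return fb;
}
```

In this model the message sent during the delay acts purely as a keepalive: the publisher sees that the subscriber is alive, but not that the delayed transaction has been applied.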
======
src/bin/pg_dump/pg_dump.c
22.
@@ -4546,9 +4547,14 @@ getSubscriptions(Archive *fout)
 LOGICALREP_TWOPHASE_STATE_DISABLED);
 if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
 else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ {
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n",
+ LOGICALREP_ORIGIN_ANY);
+ appendPQExpBufferStr(query, " 0 AS subminapplydelay\n");
+ }
Can’t those appends in the else part be combined into a single
appendPQExpBuffer call?
appendPQExpBuffer(query,
" '%s' AS suborigin,\n"
" 0 AS subminapplydelay\n",
LOGICALREP_ORIGIN_ANY);
Adopted.
======
src/include/catalog/pg_subscription.h
23.
@@ -70,6 +70,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
 XLogRecPtr subskiplsn; /* All changes finished at this LSN are
 * skipped */
+ int64 subminapplydelay; /* Replication apply delay */
+
 NameData subname; /* Name of the subscription */
 Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
SUGGESTION (for comment)
Replication apply delay (ms)
Fixed.
~~
24.
@@ -120,6 +122,7 @@ typedef struct Subscription
 * in */
 XLogRecPtr skiplsn; /* All changes finished at this LSN are
 * skipped */
+ int64 minapplydelay; /* Replication apply delay */
SUGGESTION (for comment)
Replication apply delay (ms)
Fixed.
Kindly have a look at the latest v17 patch in [1].
[1]: /messages/by-id/TYCPR01MB8373F5162C7A0E6224670CF0EDC49@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
On Thursday, January 19, 2023 10:42 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Wed, Jan 18, 2023 at 6:06 PM Peter Smith <smithpb2250@gmail.com> wrote:
Here are my review comments for the latest patch v16-0001. (excluding
the test code)
...
8. AlterSubscription (general)
I observed during testing there are 3 different errors….
At subscription CREATE time you can get this error:
ERROR: min_apply_delay > 0 and streaming = parallel are mutually exclusive options
If you try to ALTER the min_apply_delay when already streaming =
parallel you can get this error:
ERROR: cannot enable min_apply_delay for subscription in streaming = parallel mode
If you try to ALTER the streaming to be parallel if there is already a
min_apply_delay > 0 then you can get this error:
ERROR: cannot enable streaming = parallel mode for subscription with min_apply_delay
~
IMO there is no need to have 3 different error message texts. I think
all these cases are explained by just the first text (ERROR:
min_apply_delay > 0 and streaming = parallel are mutually exclusive
options)
After checking the regression test output I can see the merit of your separate
error messages like this, even if they are maybe not strictly necessary. So feel
free to ignore my previous review comment.
Thank you for your notification.
I wrote another reason why we wrote those messages in [1].
So, please have a look at it.
[1]: /messages/by-id/TYCPR01MB8373447440202B248BB63805EDC49@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
On Thu, 19 Jan 2023 at 12:06, Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
Updated the comment and the function call.
Kindly have a look at the updated patch v17.
Thanks for the updated patch, few comments:
1) min_apply_delay was accepting values like '600 m s h', I was not
sure if we should allow this:
alter subscription sub1 set (min_apply_delay = ' 600 m s h');
+ /*
+ * If no unit was specified, then explicitly add 'ms' otherwise
+ * the interval_in function would assume 'seconds'.
+ */
+ if (strspn(tmp, "-0123456789 ") == strlen(tmp))
+ val = psprintf("%sms", tmp);
+ else
+ val = tmp;
+
+ interval = DatumGetIntervalP(DirectFunctionCall3(interval_in,
+ CStringGetDatum(val),
+ ObjectIdGetDatum(InvalidOid),
+ Int32GetDatum(-1)));
2) How about adding current_txn_wait_time in
pg_stat_subscription_stats, we can update the current_txn_wait_time
periodically, this will help the user to check approximately how much
time is left(min_apply_delay - stat value) before this transaction
will be applied in the subscription. If you agree this can be 0002
patch.
3) There is one check at parse_subscription_options and another check
in AlterSubscription, this looks like a redundant check in case of
alter subscription, can we try to merge and keep in one place:
/*
* The combination of parallel streaming mode and min_apply_delay is not
* allowed.
*/
if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
opts->min_apply_delay > 0)
{
if (opts->streaming == LOGICALREP_STREAM_PARALLEL)
ereport(ERROR,
errcode(ERRCODE_SYNTAX_ERROR),
errmsg("%s and %s are mutually exclusive options",
"min_apply_delay > 0", "streaming = parallel"));
}
if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
{
/*
* The combination of parallel streaming mode and
* min_apply_delay is not allowed.
*/
if (opts.min_apply_delay > 0)
if ((IsSet(opts.specified_opts, SUBOPT_STREAMING) && opts.streaming == LOGICALREP_STREAM_PARALLEL) ||
(!IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL))
ereport(ERROR,
errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("cannot enable %s for subscription in %s mode",
"min_apply_delay", "streaming = parallel"));
values[Anum_pg_subscription_subminapplydelay - 1] =
Int64GetDatum(opts.min_apply_delay);
replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
}
4) typo "execeeds" should be "exceeds"
+ time on the subscriber. Any overhead of time spent in logical decoding
+ and in transferring the transaction may reduce the actual wait time.
+ It is also possible that the overhead already execeeds the requested
+ <literal>min_apply_delay</literal> value, in which case no additional
+ wait is necessary. If the system clocks on publisher and subscriber
+ are not synchronized, this may lead to apply changes earlier than
Regards,
Vignesh
On Thu, Jan 19, 2023 at 4:25 PM vignesh C <vignesh21@gmail.com> wrote:
On Thu, 19 Jan 2023 at 12:06, Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
Updated the comment and the function call.
Kindly have a look at the updated patch v17.
Thanks for the updated patch, few comments:
1) min_apply_delay was accepting values like '600 m s h', I was not
sure if we should allow this:
alter subscription sub1 set (min_apply_delay = ' 600 m s h');
I think here we should have specs similar to recovery_min_apply_delay.
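The unit check quoted in comment 1) only decides whether "ms" is appended; it does not validate the string itself. A value such as ' 600 m s h' contains letters, so it fails the "digits, minus signs and spaces only" test and is handed to interval_in unchanged. The sketch below isolates just that test; the function name is hypothetical, and only the strspn expression is taken from the patch excerpt.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical wrapper around the check from the patch excerpt: returns true
 * when the input consists solely of '-', digits and spaces, i.e. when the
 * patch would append "ms" before calling interval_in.
 */
static bool
needs_ms_suffix(const char *tmp)
{
    return strspn(tmp, "-0123456789 ") == strlen(tmp);
}
```

Note that the test also succeeds for an empty string or for input made only of spaces, so any stricter validation would have to come from the interval parser or from an additional check.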
2) How about adding current_txn_wait_time in
pg_stat_subscription_stats, we can update the current_txn_wait_time
periodically, this will help the user to check approximately how much
time is left(min_apply_delay - stat value) before this transaction
will be applied in the subscription. If you agree this can be 0002
patch.
Do we have any similar stats for recovery_min_apply_delay? If not, I
suggest let's postpone this to see if users really need such a
parameter.
--
With Regards,
Amit Kapila.
On Thu, Jan 19, 2023 at 12:06 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
Kindly have a look at the updated patch v17.
Can we try to optimize the test time for this test? On my machine, it
is the second highest time-consuming test in src/test/subscription. It
seems you are waiting twice for apply_delay and both are for streaming
cases by varying the number of changes. I think it should be just once
and that too for the non-streaming case. I think it would be good to
test streaming code path interaction but not sure if it is important
enough to have two test cases for apply_delay.
One minor comment that I observed while going through the patch.
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0)
I think it would be good if you can specify the reason for not
allowing this combination in the comments.
--
With Regards,
Amit Kapila.
On Thu, 19 Jan 2023 at 18:29, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Jan 19, 2023 at 4:25 PM vignesh C <vignesh21@gmail.com> wrote:
On Thu, 19 Jan 2023 at 12:06, Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
Updated the comment and the function call.
Kindly have a look at the updated patch v17.
Thanks for the updated patch, few comments:
1) min_apply_delay was accepting values like '600 m s h', I was not
sure if we should allow this:
alter subscription sub1 set (min_apply_delay = ' 600 m s h');
I think here we should have specs similar to recovery_min_apply_delay.
2) How about adding current_txn_wait_time in
pg_stat_subscription_stats, we can update the current_txn_wait_time
periodically, this will help the user to check approximately how much
time is left(min_apply_delay - stat value) before this transaction
will be applied in the subscription. If you agree this can be 0002
patch.
Do we have any similar stats for recovery_min_apply_delay? If not, I
suggest let's postpone this to see if users really need such a
parameter.
I did not find any statistics for recovery_min_apply_delay, so OK, this can
be postponed to a later time.
Regards,
Vignesh
On Thu, Jan 19, 2023 at 12:42 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
On Wednesday, January 18, 2023 4:06 PM Peter Smith <smithpb2250@gmail.com> wrote:
Here are my review comments for the latest patch v16-0001. (excluding the
test code)
Hi, thank you for your review !
======
General
1.
Since the value of min_apply_delay cannot be < 0, I was thinking probably it
should have been declared everywhere in this patch as a
uint64 instead of an int64, right?
No, we won't be able to adopt this idea.
It seems that we are not able to use uint for catalog type.
So, can't applying it to the pg_subscription.h definitions
and then similarly Int64GetDatum to store catalog variables
and the argument variable of Int64GetDatum.
Plus, there is a possibility that type Interval becomes negative value,
then we are not able to change the int64 variable to get
the return value of interval2ms().
======
Commit message
2.
If the subscription sets min_apply_delay parameter, the logical replication
worker will delay the transaction commit for min_apply_delay milliseconds.
~
IMO there should be another sentence before this just to say that a new
parameter is being added:
e.g.
This patch implements a new subscription parameter called
'min_apply_delay'.
Added.
======
doc/src/sgml/config.sgml
3.
+ <para>
+ For time-delayed logical replication, the apply worker sends a Standby
+ Status Update message to the corresponding publisher per the indicated
+ time of this parameter. Therefore, if this parameter is longer than
+ <literal>wal_sender_timeout</literal> on the publisher, then the
+ walsender doesn't get any update message during the delay and repeatedly
+ terminates due to the timeout errors. Hence, make sure this parameter is
+ shorter than the <literal>wal_sender_timeout</literal> of the publisher.
+ If this parameter is set to zero with time-delayed replication, the
+ apply worker doesn't send any feedback messages during the
+ <literal>min_apply_delay</literal>.
+ </para>
This paragraph seemed confusing. I think it needs to be reworded to change all
of the "this parameter" references because there are at least 3 different
parameters mentioned in this paragraph. e.g. maybe just change them to
explicitly name the parameter you are talking about.
I also think it needs to mention the ‘min_apply_delay’ subscription parameter
up-front and then refer to it appropriately.
The end result might be something like I wrote below (this is just my guess;
probably you can word it better).
SUGGESTION
For time-delayed logical replication (i.e. when the subscription is created with
parameter min_apply_delay > 0), the apply worker sends a Standby Status
Update message to the publisher with a period of wal_receiver_status_interval.
Make sure to set wal_receiver_status_interval less than the
wal_sender_timeout on the publisher, otherwise, the walsender will repeatedly
terminate due to the timeout errors. If wal_receiver_status_interval is set to zero,
the apply worker doesn't send any feedback messages during the subscriber’s
min_apply_delay period.
Applied. Also, I added one reference for min_apply_delay parameter
at the end of this description.
======
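The relationship between the two settings described in the suggestion above can be sketched as a small check. This is only an illustration, not code from the patch: the function and enum names are hypothetical, both values are assumed to be already converted to milliseconds, and a wal_sender_timeout of zero is treated as "timeout disabled".

```c
#include <assert.h>

/* Hypothetical classification of the subscriber/publisher settings. */
typedef enum
{
    FEEDBACK_OK,        /* feedback arrives before the walsender times out */
    FEEDBACK_DISABLED,  /* wal_receiver_status_interval = 0: no feedback sent */
    FEEDBACK_TOO_SLOW   /* interval >= timeout: walsender keeps timing out */
} FeedbackCheck;

/*
 * Classify a pair of settings, both expressed in milliseconds
 * (an assumption of this sketch).
 */
FeedbackCheck
check_feedback_settings(int status_interval_ms, int sender_timeout_ms)
{
    assert(status_interval_ms >= 0 && sender_timeout_ms >= 0);

    /* Zero interval: the apply worker sends no feedback during the delay. */
    if (status_interval_ms == 0)
        return FEEDBACK_DISABLED;

    /* A zero wal_sender_timeout disables the timeout on the publisher. */
    if (sender_timeout_ms > 0 && status_interval_ms >= sender_timeout_ms)
        return FEEDBACK_TOO_SLOW;

    return FEEDBACK_OK;
}
```

For example, an interval of 10000 ms against a timeout of 60000 ms is fine, while an interval equal to the timeout is flagged.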
doc/src/sgml/ref/create_subscription.sgml
4.
+ <para>
+ By default, the subscriber applies changes as soon as possible. As
+ with the physical replication feature
+ (<xref linkend="guc-recovery-min-apply-delay"/>), it can be useful to
+ have a time-delayed logical replica. This parameter lets the user to
+ delay the application of changes by a specified amount of time. If this
+ value is specified without units, it is taken as milliseconds. The
+ default is zero(no delay).
+ </para>
4a.
As with the physical replication feature (recovery_min_apply_delay), it can be
useful to have a time-delayed logical replica.
IMO not sure that the above sentence is necessary. It seems only to be saying
that this parameter can be useful. Why do we need to say that?
Removed the sentence.
~
4b.
"This parameter lets the user to delay" -> "This parameter lets the user delay"
OR
"This parameter lets the user to delay" -> "This parameter allows the user to
delay"
Fixed.
~
4c.
"If this value is specified without units" -> "If the value is specified without
units"
Fixed.
~
4d.
"zero(no delay)." -> "zero (no delay)."
Fixed.
----
5.
+ <para>
+ The delay occurs only on WAL records for transaction begins and after
+ the initial table synchronization. It is possible that the
+ replication delay between publisher and subscriber exceeds the value
+ of this parameter, in which case no delay is added. Note that the
+ delay is calculated between the WAL time stamp as written on
+ publisher and the current time on the subscriber. Time spent in logical
+ decoding and in transferring the transaction may reduce the actual wait
+ time. If the system clocks on publisher and subscriber are not
+ synchronized, this may lead to apply changes earlier than expected,
+ but this is not a major issue because this parameter is typically much
+ larger than the time deviations between servers. Note that if this
+ parameter is set to a long delay, the replication will stop if the
+ replication slot falls behind the current LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
I think the first part can be reworded slightly. See what you think about the
suggestion below.
SUGGESTION
Any delay occurs only on WAL records for transaction begins after all initial
table synchronization has finished. The delay is calculated between the WAL
timestamp as written on the publisher and the current time on the subscriber.
Any overhead of time spent in logical decoding and in transferring the
transaction may reduce the actual wait time.
It is also possible that the overhead already exceeds the requested
'min_apply_delay' value, in which case no additional wait is necessary. If the
system clocks...
Addressed.
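The delay calculation described in this paragraph can be sketched as follows. This is a minimal illustration, not the patch's maybe_delay_apply(): the function name is hypothetical, and all times are assumed to be plain milliseconds rather than TimestampTz values.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Remaining wait before a transaction may be applied.
 *
 * commit_ts_ms: WAL timestamp written on the publisher
 * now_ms:       current time on the subscriber
 * Both are milliseconds on what is assumed to be a common clock.
 */
int64_t
remaining_apply_delay_ms(int64_t commit_ts_ms, int64_t now_ms,
                         int64_t min_apply_delay_ms)
{
    int64_t elapsed;
    int64_t remaining;

    assert(min_apply_delay_ms >= 0);

    /* Decoding and transfer overhead is already part of the elapsed time. */
    elapsed = now_ms - commit_ts_ms;
    remaining = min_apply_delay_ms - elapsed;

    /* If the overhead already exceeds the delay, no additional wait. */
    return remaining > 0 ? remaining : 0;
}
```

With a 2000 ms delay, a transaction that took 500 ms to arrive waits another 1500 ms, and one that took 3000 ms is applied immediately. If the subscriber's clock runs ahead of the publisher's, elapsed is overestimated and the change is applied earlier than expected, which is the clock-skew case the paragraph mentions.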
----
6.
+ <para>
+ Setting streaming to <literal>parallel</literal> mode and <literal>min_apply_delay</literal>
+ simultaneously is not supported.
+ </para>
SUGGESTION
A non-zero min_apply_delay parameter is not allowed when streaming in
parallel mode.
Applied.
======
src/backend/commands/subscriptioncmds.c
7. parse_subscription_options
@@ -404,6 +445,17 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /* Test the combination of streaming mode and min_apply_delay */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0)
+ {
+ if (opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
+ }
SUGGESTION (comment)
The combination of parallel streaming mode and min_apply_delay is not
allowed.
Fixed.
~~~
8. AlterSubscription (general)
I observed during testing there are 3 different errors.
At subscription CREATE time you can get this error:
ERROR: min_apply_delay > 0 and streaming = parallel are mutually exclusive
options
If you try to ALTER the min_apply_delay when already streaming = parallel you
can get this error:
ERROR: cannot enable min_apply_delay for subscription in streaming =
parallel mode
If you try to ALTER the streaming to be parallel if there is already a
min_apply_delay > 0 then you can get this error:
ERROR: cannot enable streaming = parallel mode for subscription with
min_apply_delay
Yes. This follows the existing error message styles
in parse_subscription_options and AlterSubscription.
The former uses "mutually exclusive" messages consistently,
while the latter uses "cannot enable ..." ones.
~
IMO there is no need to have 3 different error message texts. I think all these
cases are explained by just the first text (ERROR:
min_apply_delay > 0 and streaming = parallel are mutually exclusive
options)
So, we followed these existing formats.
~~~
9. AlterSubscription
@@ -1098,6 +1152,18 @@ AlterSubscription(ParseState *pstate,
AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * Test the combination of streaming mode and
+ * min_apply_delay
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
+ if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay > 0) ||
+ (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s mode for subscription with %s",
+ "streaming = parallel", "min_apply_delay"));
+
9a.
SUGGESTION (comment)
The combination of parallel streaming mode and min_apply_delay is not
allowed.
Fixed.
~
9b.
(see AlterSubscription general review comment #8 above) Here you can use the
same comment error message that says min_apply_delay > 0 and streaming =
parallel are mutually exclusive options.
As described above, we followed the current style in the existing functions.
~~~
10. AlterSubscription
@@ -1111,6 +1177,25 @@ AlterSubscription(ParseState *pstate,
AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * Test the combination of streaming mode and
+ * min_apply_delay
+ */
+ if (opts.min_apply_delay > 0)
+ if ((IsSet(opts.specified_opts, SUBOPT_STREAMING) && opts.streaming == LOGICALREP_STREAM_PARALLEL) ||
+ (!IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s for subscription in %s mode",
+ "min_apply_delay", "streaming = parallel"));
+
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int64GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
10a.
SUGGESTION (comment)
The combination of parallel streaming mode and min_apply_delay is not
allowed.
Fixed.
~
10b.
(see AlterSubscription general review comment #8 above) Here you can use the
same comment error message that says min_apply_delay > 0 and streaming =
parallel are mutually exclusive options.
Same as 9b.
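The two AlterSubscription checks quoted in #9 and #10 both test the values that would be in effect after the ALTER: the newly specified value if the option was given, otherwise the one already stored for the subscription. A standalone sketch of that idea, with hypothetical types and names standing in for the patch's SubOpts and Subscription structures:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the LOGICALREP_STREAM_* values. */
typedef enum { STREAM_OFF, STREAM_ON, STREAM_PARALLEL } StreamMode;

/* Hypothetical subset of the options given in an ALTER SUBSCRIPTION. */
typedef struct
{
    bool        delay_specified;
    int64_t     delay_ms;
    bool        streaming_specified;
    StreamMode  streaming;
} AlterOpts;

/*
 * Return true if the ALTER would leave the subscription with both a
 * non-zero min_apply_delay and streaming = parallel.
 */
bool
delay_conflicts_with_parallel(const AlterOpts *opts,
                              int64_t cur_delay_ms, StreamMode cur_streaming)
{
    int64_t     delay;
    StreamMode  streaming;

    assert(opts != NULL);

    /* Use the new value when specified, else the existing one. */
    delay = opts->delay_specified ? opts->delay_ms : cur_delay_ms;
    streaming = opts->streaming_specified ? opts->streaming : cur_streaming;

    return delay > 0 && streaming == STREAM_PARALLEL;
}
```

Written this way, one condition covers both directions of the ALTER. The patch keeps the checks separate so that each branch can report its own "cannot enable ..." message.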
======
.../replication/logical/applyparallelworker.c
11.
@@ -704,7 +704,8 @@ pa_process_spooled_messages_if_required(void)
{
apply_spooled_messages(&MyParallelShared->fileset,
MyParallelShared->xid,
- InvalidXLogRecPtr);
+ InvalidXLogRecPtr,
+ 0);
IMO this passing of 0 is a bit strange because it is currently acting like a dummy
value since the apply_spooled_messages will never make use of the 'finish_ts'
anyway (since this call is from a parallel apply worker).
I think a better way to code this might be to pass the 0 (same as you are doing
here) but inside the apply_spooled_messages change the code:
FROM
if (!am_parallel_apply_worker())
maybe_delay_apply(finish_ts);
TO
if (finish_ts)
maybe_delay_apply(finish_ts);
That does 2 things.
- It makes the passed-in 0 have some meaning
- It simplifies the apply_spooled_messages code
Adopted.
======
src/backend/replication/logical/worker.c
12.
@@ -318,6 +318,17 @@ static List *on_commit_wakeup_workers_subids = NIL;
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
+/*
+ * In order to avoid walsender's timeout during time-delayed replication,
+ * it's necessary to keep sending feedback messages during the delay from the
+ * worker process. Meanwhile, the feature delays the apply before starting the
+ * transaction and thus we don't write WALs for the suspended changes during
+ * the wait. Hence, in the case the worker process sends a feedback message
+ * during the delay, we should not make positions of the flushed and apply LSN
+ * overwritten by the last received latest LSN. See send_feedback() for details.
+ */
+static XLogRecPtr last_received = InvalidXLogRecPtr;
12a.
Suggest a small change to the first sentence of the comment.
BEFORE
In order to avoid walsender's timeout during time-delayed replication, it's
necessary to keep sending feedback messages during the delay from the
worker process.
AFTER
In order to avoid walsender timeout for time-delayed replication the worker
process keeps sending feedback messages during the delay period.
Fixed.
~
12b.
"Hence, in the case" -> "When"
Fixed.
~~~
13. forward declare
-static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply);
+static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply,
+ bool in_delaying_apply);
Change the param name:
"in_delaying_apply" -> "in_delayed_apply" (??)
Changed. The initial intention of adding the "in_"
prefix was to align the variable name with
some other variables such as "in_remote_transaction" and
"in_streamed_transaction" that indicate the current status
of the transaction. So, until a better name is proposed,
we can keep it.
~~~
14. maybe_delay_apply
+ /* Nothing to do if no delay set */
+ if (MySubscription->minapplydelay <= 0)
+ return;
IIUC min_apply_delay cannot be < 0 so this condition could simply be:
if (!MySubscription->minapplydelay)
return;
Fixed.
~~~
15. maybe_delay_apply
+ /*
+ * The min_apply_delay parameter is ignored until all tablesync workers
+ * have reached READY state. If we allow the delay during the catchup
+ * phase, once we reach the limit of tablesync workers, it will impose a
+ * delay for each subsequent worker. It means it will take a long time to
+ * finish the initial table synchronization.
+ */
+ if (!AllTablesyncsReady())
+ return;
SUGGESTION (slight rewording)
The min_apply_delay parameter is ignored until all tablesync workers have
reached READY state. This is because if we allowed the delay during the
catchup phase, then once we reached the limit of tablesync workers it would
impose a delay for each subsequent worker. That would cause initial table
synchronization completion to take a long time.
Fixed.
~~~
16. maybe_delay_apply
+ while (true)
+ {
+ long diffms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
IMO there should be some small explanatory comment here at the top of the
while loop.
Added.
~~~
17. apply_spooled_messages
@@ -2024,6 +2141,21 @@ apply_spooled_messages(FileSet *stream_fileset,
TransactionId xid,
int fileno;
off_t offset;
+ /*
+ * Should we delay the current transaction?
+ *
+ * Unlike the regular (non-streamed) cases, the delay is applied in a
+ * STREAM COMMIT/STREAM PREPARE message for streamed transactions. The
+ * STREAM START message does not contain a commit/prepare time (it will be
+ * available when the in-progress transaction finishes). Hence, it's not
+ * appropriate to apply a delay at that time.
+ *
+ * It's not allowed to execute time-delayed replication with parallel
+ * apply feature.
+ */
+ if (!am_parallel_apply_worker())
+ maybe_delay_apply(finish_ts);
That whole comment part "Unlike the regular (non-streamed) cases"
seems misplaced here. Perhaps this part of the comment is better put into
the function header where the meaning of 'finish_ts' is explained?
Moved it to the header comment for maybe_delay_apply.
~~~
18. apply_spooled_messages
+ * It's not allowed to execute time-delayed replication with parallel
+ * apply feature.
+ */
+ if (!am_parallel_apply_worker())
+ maybe_delay_apply(finish_ts);
As was mentioned in comment #11 above this code could be changed like
if (finish_ts)
maybe_delay_apply(finish_ts);
then you don't even need to make mention of "parallel apply" at all here.
OTOH if you want to still have the parallel apply comment then maybe reword it
like this:
"It is not allowed to combine time-delayed replication with the parallel apply
feature."
Changed and now I don't mention the parallel apply feature.
~~~
19. apply_spooled_messages
If you chose not to do my suggestion from comment #11, then there are
2 identical conditions (!am_parallel_apply_worker()); In this case, I was
wondering if it would be better to refactor to use a single condition instead.
I applied #11 comment. Now, the conditions are not identical.
~~~
20. send_feedback
(same as comment #13)
Maybe change the new param name to "in_delayed_apply"?
Changed.
~~~
21.
@@ -3737,8 +3869,15 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * During the delay of time-delayed replication, do not tell the publisher
+ * that the received latest LSN is already applied and flushed at this
+ * stage, since we don't apply the transaction yet. If we do so, it leads
+ * to a wrong assumption of logical replication progress on the publisher
+ * side. Here, we just send a feedback message to avoid publisher's
+ * timeout during the delay.
*/
Minor rewording of the comment
SUGGESTION
If the subscriber side apply is delayed (because of time-delayed
replication) then do not tell the publisher that the received latest LSN is already
applied and flushed, otherwise, it leads to the publisher side making a wrong
assumption of logical replication progress. Instead, we just send a feedback
message to avoid a publisher timeout during the delay.
Adopted.
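The rule in this comment can be sketched as a small decision function. This is an assumption about how the condition might be expressed, not the patch's send_feedback() body: the function name, the Lsn typedef (standing in for XLogRecPtr), and the reduction to a single flush position are all simplifications made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t Lsn;   /* stand-in for XLogRecPtr in this sketch */

/*
 * Choose the flush position to report to the publisher.
 *
 * Without pending transactions the latest received position can normally
 * be reported as flushed. During a delayed apply the held-back transaction
 * has not been applied yet, so the last genuinely flushed position is
 * reported instead; the message then only serves as a keepalive.
 */
Lsn
flush_position_to_report(Lsn received, Lsn last_flushed,
                         bool have_pending_txes, bool in_delayed_apply)
{
    /* Invariant assumed by this sketch. */
    assert(last_flushed <= received);

    if (!have_pending_txes && !in_delayed_apply)
        return received;

    return last_flushed;
}
```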
======
src/bin/pg_dump/pg_dump.c
22.
@@ -4546,9 +4547,14 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ {
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n",
+ LOGICALREP_ORIGIN_ANY);
+ appendPQExpBufferStr(query, " 0 AS subminapplydelay\n");
+ }
Can't those appends in the else part be combined into a single
appendPQExpBuffer call?
appendPQExpBuffer(query,
" '%s' AS suborigin,\n"
" 0 AS subminapplydelay\n",
LOGICALREP_ORIGIN_ANY);
Adopted.
======
src/include/catalog/pg_subscription.h
23.
@@ -70,6 +70,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId)
BKI_SHARED_RELATION BKI_ROW
XLogRecPtr subskiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 subminapplydelay; /* Replication apply delay */
+
NameData subname; /* Name of the subscription */
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
SUGGESTION (for comment)
Replication apply delay (ms)
Fixed.
~~
24.
@@ -120,6 +122,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 minapplydelay; /* Replication apply delay */
SUGGESTION (for comment)
Replication apply delay (ms)
Fixed.
Kindly have a look at the latest v17 patch in [1].
[1] - /messages/by-id/TYCPR01MB8373F5162C7A0E6224670CF0EDC49@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
1)
Tried different variations of altering 'min_apply_delay'. All passed
except one below:
postgres=# alter subscription mysubnew set (min_apply_delay = '10.9min 1ms');
ALTER SUBSCRIPTION
postgres=# alter subscription mysubnew set (min_apply_delay = '10.9min 2s 1ms');
ALTER SUBSCRIPTION
--very similar to above but fails,
postgres=# alter subscription mysubnew set (min_apply_delay = '10.9s 1ms');
ERROR: invalid input syntax for type interval: "10.9s 1ms"
2)
Logging:
2023-01-19 17:33:16.202 IST [404797] DEBUG: logical replication apply
delay: 19979 ms
2023-01-19 17:33:26.212 IST [404797] DEBUG: logical replication apply
delay: 9969 ms
2023-01-19 17:34:25.730 IST [404962] DEBUG: logical replication apply
delay: 179988 ms-->previous wait over, started for next txn
2023-01-19 17:34:35.737 IST [404962] DEBUG: logical replication apply
delay: 169981 ms
2023-01-19 17:34:45.746 IST [404962] DEBUG: logical replication apply
delay: 159972 ms
Is there a way to distinguish between these logs? Maybe dumping xids along-with?
thanks
Shveta
Hi, Horiguchi-san and Amit-san
On Wednesday, November 9, 2022 3:41 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
Using interval is not standard as this kind of parameters but it seems
convenient. On the other hand, it's not great that the unit month introduces
some subtle ambiguity. This patch translates a month to 30 days but I'm not
sure it's the right thing to do. Perhaps we shouldn't allow the units upper than
days.
In the past discussion, we talked about the merits of using the interval type.
On the other hand, we are now facing some parsing incompatibilities
between this time-delayed feature and physical replication's recovery_min_apply_delay.
For instance, the interval type accepts '600 m s h', '1d 10min' and '1m',
but with recovery_min_apply_delay the server fails to start for all of those.
Since this would confuse users, I'm going to make the feature's input
compatible with recovery_min_apply_delay in the next version.
Best Regards,
Takamichi Osumi
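To illustrate why the three inputs mentioned above are rejected once the value has to be a single number with at most one unit, here is a deliberately simplified parser. It is a sketch under stated assumptions, not PostgreSQL's GUC parsing code: the function name is hypothetical, only the units ms, s, min, h and d are handled, fractional values are not accepted, and the multiplication is not checked for overflow.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse "<integer> [unit]" with at most one unit. A value without a unit
 * is taken as milliseconds. Returns false on any other input.
 */
bool
parse_delay_ms(const char *s, int64_t *result_ms)
{
    static const struct { const char *name; int64_t ms; } units[] = {
        {"ms", 1}, {"s", 1000}, {"min", 60000}, {"h", 3600000}, {"d", 86400000},
    };
    char       *end;
    long long   value;
    size_t      i;

    assert(s != NULL && result_ms != NULL);

    while (isspace((unsigned char) *s))
        s++;

    errno = 0;
    value = strtoll(s, &end, 10);
    if (end == s || errno == ERANGE || value < 0)
        return false;

    while (isspace((unsigned char) *end))
        end++;

    /* No unit: the number is already in milliseconds. */
    if (*end == '\0')
    {
        *result_ms = value;
        return true;
    }

    for (i = 0; i < sizeof(units) / sizeof(units[0]); i++)
    {
        size_t      len = strlen(units[i].name);
        const char *rest;

        if (strncmp(end, units[i].name, len) != 0)
            continue;

        /* Only trailing whitespace may follow the single unit. */
        rest = end + len;
        while (isspace((unsigned char) *rest))
            rest++;
        if (*rest == '\0')
        {
            *result_ms = value * units[i].ms;
            return true;
        }
    }

    return false;
}
```

Under these rules '10min' and '1 h' are accepted, while '600 m s h', '1d 10min' and '1m' are all rejected, because 'm' is not a known unit and nothing may follow the first unit.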
Hi Osumi-san, here are my review comments for the latest patch v17-0001.
======
Commit Message
1.
Prohibit the combination of this feature and parallel streaming mode.
SUGGESTION (using the same wording as in the code comments)
The combination of parallel streaming mode and min_apply_delay is not allowed.
======
doc/src/sgml/ref/create_subscription.sgml
2.
+ <para>
+ By default, the subscriber applies changes as soon as possible. This
+ parameter allows the user to delay the application of changes by a
+ specified amount of time. If the value is specified without units, it
+ is taken as milliseconds. The default is zero (no delay).
+ </para>
Looking at this again, it seemed a bit strange to repeat "specified"
twice in 2 sentences. Maybe change one of them.
I’ve also suggested using the word "interval" because I don’t think
docs yet mentioned anywhere (except in the example) that using
intervals is possible.
SUGGESTION (for the 2nd sentence)
This parameter allows the user to delay the application of changes by
a given time interval.
~~~
3.
+ <para>
+ Any delay occurs only on WAL records for transaction begins after all
+ initial table synchronization has finished. The delay is calculated
+ between the WAL timestamp as written on the publisher and the current
+ time on the subscriber. Any overhead of time spent in logical decoding
+ and in transferring the transaction may reduce the actual wait time.
+ It is also possible that the overhead already execeeds the requested
+ <literal>min_apply_delay</literal> value, in which case no additional
+ wait is necessary. If the system clocks on publisher and subscriber
+ are not synchronized, this may lead to apply changes earlier than
+ expected, but this is not a major issue because this parameter is
+ typically much larger than the time deviations between servers. Note
+ that if this parameter is set to a long delay, the replication will
+ stop if the replication slot falls behind the current LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
3a.
Typo "execeeds" (I think Vignesh reported this already)
~
3b.
SUGGESTION (for the 2nd sentence)
BEFORE
The delay is calculated between the WAL timestamp...
AFTER
The delay is calculated as the difference between the WAL timestamp...
~~~
4.
+ <warning>
+ <para>
+ Delaying the replication can mean there is a much longer time between making
+ a change on the publisher, and that change being committed on the subscriber.
+ This can have a big impact on synchronous replication.
+ See <xref linkend="guc-synchronous-commit"/>.
+ </para>
+ </warning>
IMO maybe there is a better way to express the 2nd sentence:
BEFORE
This can have a big impact on synchronous replication.
AFTER
This can impact the performance of synchronous replication.
======
src/backend/commands/subscriptioncmds.c
5. parse_subscription_options
@@ -324,6 +328,43 @@ parse_subscription_options(ParseState *pstate,
List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ char *val,
+ *tmp;
+ Interval *interval;
+ int64 ms;
IMO 'delay_ms' (or similar) would be a friendlier variable name than just 'ms'
~~~
6.
@@ -404,6 +445,20 @@ parse_subscription_options(ParseState *pstate,
List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0)
+ {
+ if (opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
+ }
This could be expressed as a single condition using &&, maybe also
with the brackets eliminated. (Unless you feel the current code is
more readable)
~~~
7.
+ if (opts.min_apply_delay > 0)
+ if ((IsSet(opts.specified_opts, SUBOPT_STREAMING) && opts.streaming == LOGICALREP_STREAM_PARALLEL) ||
+ (!IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s for subscription in %s mode",
+ "min_apply_delay", "streaming = parallel"));
These nested ifs could instead be a single "if" with && condition.
(Unless you feel the current code is more readable)
======
src/backend/replication/logical/worker.c
8. maybe_delay_apply
+ * Hence, it's not appropriate to apply a delay at the time.
+ */
+static void
+maybe_delay_apply(TimestampTz finish_ts)
That last sentence "Hence,... delay at the time" does not sound
correct. Is there a typo or missing words here?
Maybe it meant to say "... at the STREAM START time."?
~~~
9.
+ /* This might change wal_receiver_status_interval */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
I was unsure why you made a special mention of
'wal_receiver_status_interval' here. I mean, aren't there also other
GUCs that might change and affect something here? Was there some
special reason only this one was mentioned?
======
src/test/subscription/t/032_apply_delay.pl
10.
+
+# Compare inserted time on the publisher with applied time on the subscriber to
+# confirm the latter is applied after expected time.
+sub check_apply_delay_time
Maybe the comment could also mention that the time is automatically
stored in the table column 'c'.
~~~
11.
+# Confirm the suspended record doesn't get applied expectedly by the ALTER
+# DISABLE command.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab WHERE a = 0;");
+is($result, qq(0), "check if the delayed transaction doesn't get applied expectedly");
The use of "doesn't get applied expectedly" (in 2 places here) seemed
strange. Maybe it's better to say like
SUGGESTION
# Confirm disabling the subscription by ALTER DISABLE did not cause
# the delayed transaction to be applied.
$result = $node_subscriber->safe_psql('postgres',
"SELECT count(a) FROM test_tab WHERE a = 0;");
is($result, qq(0), "check the delayed transaction was not applied");
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Fri, Jan 20, 2023 at 2:47 PM shveta malik <shveta.malik@gmail.com> wrote:
...
2)
Logging:
2023-01-19 17:33:16.202 IST [404797] DEBUG: logical replication apply
delay: 19979 ms
2023-01-19 17:33:26.212 IST [404797] DEBUG: logical replication apply
delay: 9969 ms
2023-01-19 17:34:25.730 IST [404962] DEBUG: logical replication apply
delay: 179988 ms-->previous wait over, started for next txn
2023-01-19 17:34:35.737 IST [404962] DEBUG: logical replication apply
delay: 169981 ms
2023-01-19 17:34:45.746 IST [404962] DEBUG: logical replication apply
delay: 159972 ms
Is there a way to distinguish between these logs? Maybe dumping xids along-with?
+1
Also, I was thinking of some other logging enhancements
a) the message should say that this is the *remaining* time left to wait.
b) it might be convenient to know from the log what was the original
min_apply_delay value in the 1st place.
For example, the logs might look something like this:
DEBUG: time-delayed replication for txid 1234, min_apply_delay =
160000 ms. Remaining wait time: 159972 ms
DEBUG: time-delayed replication for txid 1234, min_apply_delay =
160000 ms. Remaining wait time: 142828 ms
DEBUG: time-delayed replication for txid 1234, min_apply_delay =
160000 ms. Remaining wait time: 129994 ms
DEBUG: time-delayed replication for txid 1234, min_apply_delay =
160000 ms. Remaining wait time: 110001 ms
...
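A message in the proposed shape could be produced with a helper like the one below. This is only a sketch of the formatting, under the assumption that the transaction id and both delay values are available at the point of logging. The function name is hypothetical and snprintf stands in for the server's elog machinery.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format one progress line for a delayed transaction into buf.
 * Returns the snprintf() result (length the full message needs).
 */
int
format_delay_log(char *buf, size_t buflen, uint32_t xid,
                 int64_t min_apply_delay_ms, int64_t remaining_ms)
{
    assert(buf != NULL && buflen > 0);

    return snprintf(buf, buflen,
                    "time-delayed replication for txid %" PRIu32
                    ", min_apply_delay = %" PRId64
                    " ms. Remaining wait time: %" PRId64 " ms",
                    xid, min_apply_delay_ms, remaining_ms);
}
```

Because the configured min_apply_delay is printed on every line, a change to the setting in the middle of a wait would show up directly in the log.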
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Fri, Jan 20, 2023 at 1:08 PM Peter Smith <smithpb2250@gmail.com> wrote:
a) the message should say that this is the *remaining* time left to wait.
b) it might be convenient to know from the log what was the original
min_apply_delay value in the 1st place.
For example, the logs might look something like this:
DEBUG: time-delayed replication for txid 1234, min_apply_delay =
160000 ms. Remaining wait time: 159972 ms
DEBUG: time-delayed replication for txid 1234, min_apply_delay =
160000 ms. Remaining wait time: 142828 ms
DEBUG: time-delayed replication for txid 1234, min_apply_delay =
160000 ms. Remaining wait time: 129994 ms
DEBUG: time-delayed replication for txid 1234, min_apply_delay =
160000 ms. Remaining wait time: 110001 ms
...
+1
This will also help when min_apply_delay is set to a new value in
between the current wait. Lets say, I started with min_apply_delay=5
min, when the worker was half way through this, I changed
min_apply_delay to 3 min or say 10min, I see the impact of that change
i.e. new wait-time is adjusted, but log becomes confusing. So, please
keep this scenario as well in mind while improving logging.
thanks
Shveta
On Fri, Jan 20, 2023 at 2:23 PM shveta malik <shveta.malik@gmail.com> wrote:
On Fri, Jan 20, 2023 at 1:08 PM Peter Smith <smithpb2250@gmail.com> wrote:
a) the message should say that this is the *remaining* time left to wait.
b) it might be convenient to know from the log what was the original
min_apply_delay value in the 1st place.
For example, the logs might look something like this:
DEBUG: time-delayed replication for txid 1234, min_apply_delay =
160000 ms. Remaining wait time: 159972 ms
DEBUG: time-delayed replication for txid 1234, min_apply_delay =
160000 ms. Remaining wait time: 142828 ms
DEBUG: time-delayed replication for txid 1234, min_apply_delay =
160000 ms. Remaining wait time: 129994 ms
DEBUG: time-delayed replication for txid 1234, min_apply_delay =
160000 ms. Remaining wait time: 110001 ms
...
+1
This will also help when min_apply_delay is set to a new value in
between the current wait. Lets say, I started with min_apply_delay=5
min, when the worker was half way through this, I changed
min_apply_delay to 3 min or say 10min, I see the impact of that change
i.e. new wait-time is adjusted, but log becomes confusing. So, please
keep this scenario as well in mind while improving logging.
When we send feedback during the apply delay after every
wal_receiver_status_interval, the log comes as:
2023-01-19 17:12:56.000 IST [404795] DEBUG: sending feedback (force 1)
to recv 0/1570840, write 0/1570840, flush 0/1570840
Shall we have some info here to indicate that it is sent while waiting
for apply_delay, to distinguish it from other such send-feedback logs?
It will make the apply_delay flow clear in logs.
thanks
Shveta
On Friday, January 20, 2023 3:56 PM Peter Smith <smithpb2250@gmail.com> wrote:
Hi Osumi-san, here are my review comments for the latest patch v17-0001.
Thanks for your review !
======
Commit Message
1.
Prohibit the combination of this feature and parallel streaming mode.
SUGGESTION (using the same wording as in the code comments)
The combination of parallel streaming mode and min_apply_delay is not allowed.
Okay. Fixed.
======
doc/src/sgml/ref/create_subscription.sgml
2.
+ <para>
+ By default, the subscriber applies changes as soon as possible. This
+ parameter allows the user to delay the application of changes by a
+ specified amount of time. If the value is specified without units, it
+ is taken as milliseconds. The default is zero (no delay).
+ </para>
Looking at this again, it seemed a bit strange to repeat "specified"
twice in 2 sentences. Maybe change one of them.
I’ve also suggested using the word "interval" because I don’t think docs yet
mentioned anywhere (except in the example) that using intervals is possible.
SUGGESTION (for the 2nd sentence)
This parameter allows the user to delay the application of changes by a given
time interval.
Adopted.
~~~
3.
+ <para>
+ Any delay occurs only on WAL records for transaction begins after all
+ initial table synchronization has finished. The delay is calculated
+ between the WAL timestamp as written on the publisher and the current
+ time on the subscriber. Any overhead of time spent in logical decoding
+ and in transferring the transaction may reduce the actual wait time.
+ It is also possible that the overhead already execeeds the requested
+ <literal>min_apply_delay</literal> value, in which case no additional
+ wait is necessary. If the system clocks on publisher and subscriber
+ are not synchronized, this may lead to apply changes earlier than
+ expected, but this is not a major issue because this parameter is
+ typically much larger than the time deviations between servers. Note
+ that if this parameter is set to a long delay, the replication will
+ stop if the replication slot falls behind the current LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
3a.
Typo "execeeds" (I think Vignesh reported this already)
Fixed.
~
3b.
SUGGESTION (for the 2nd sentence)
BEFORE
The delay is calculated between the WAL timestamp...
AFTER
The delay is calculated as the difference between the WAL timestamp...
Fixed.
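To make the "difference" wording concrete, here is a minimal sketch of the calculation. It assumes all times are plain millisecond counts on a shared epoch. remaining_delay_ms() is a hypothetical helper, not a function from the patch, which works on TimestampTz values instead.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: how long the subscriber still has to wait before
 * applying a transaction.
 *
 * commit_ts_ms        WAL (commit/prepare) timestamp written on the publisher
 * now_ms              current time on the subscriber
 * min_apply_delay_ms  requested minimum delay
 *
 * Time already spent in decoding and transfer is part of (now - commit), so
 * it shortens the wait. If it already exceeds the requested delay, the result
 * is zero and no additional wait is needed.
 */
int64_t
remaining_delay_ms(int64_t commit_ts_ms, int64_t now_ms,
                   int64_t min_apply_delay_ms)
{
    int64_t     apply_at_ms;

    assert(min_apply_delay_ms >= 0);

    apply_at_ms = commit_ts_ms + min_apply_delay_ms;
    return (apply_at_ms > now_ms) ? apply_at_ms - now_ms : 0;
}
```

For example, with a commit at t = 1000 ms and a 2000 ms delay, a subscriber whose clock reads 1500 ms waits another 1500 ms. A subscriber clock running ahead of the publisher shortens the wait in the same way, which is the clock-skew case the paragraph mentions.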
~~~
4.
+ <warning>
+ <para>
+ Delaying the replication can mean there is a much longer time between making
+ a change on the publisher, and that change being committed on the subscriber.
+ This can have a big impact on synchronous replication.
+ See <xref linkend="guc-synchronous-commit"/>.
+ </para>
+ </warning>
IMO maybe there is a better way to express the 2nd sentence:
BEFORE
This can have a big impact on synchronous replication.
AFTER
This can impact the performance of synchronous replication.
Fixed.
======
src/backend/commands/subscriptioncmds.c
5. parse_subscription_options
@@ -324,6 +328,43 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ char *val,
+ *tmp;
+ Interval *interval;
+ int64 ms;
IMO 'delay_ms' (or similar) would be a friendlier variable name than just 'ms'
The variable name has been changed to one that describes the feature more clearly.
~~~
6.
@@ -404,6 +445,20 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0)
+ {
+ if (opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
+ }
This could be expressed as a single condition using &&, maybe also with the
brackets eliminated. (Unless you feel the current code is more readable)
The current style is intentional. We feel the code is more readable.
~~~
7.
+ if (opts.min_apply_delay > 0)
+ if ((IsSet(opts.specified_opts, SUBOPT_STREAMING) && opts.streaming == LOGICALREP_STREAM_PARALLEL) ||
+ (!IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s for subscription in %s mode",
+ "min_apply_delay", "streaming = parallel"));
These nested ifs could instead be a single "if" with && condition.
(Unless you feel the current code is more readable)
Same as #6.
======
src/backend/replication/logical/worker.c
8. maybe_delay_apply
+ * Hence, it's not appropriate to apply a delay at the time.
+ */
+static void
+maybe_delay_apply(TimestampTz finish_ts)
That last sentence "Hence,... delay at the time" does not sound correct. Is there
a typo or missing words here?
Maybe it meant to say "... at the STREAM START time."?
Yes. Fixed.
~~~
9.
+ /* This might change wal_receiver_status_interval */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
I was unsure why you made a special mention of
'wal_receiver_status_interval' here. Aren't there also other GUCs that
might change and affect something here? Was there some special reason only
this one was mentioned?
This should be similar to the recoveryApplyDelay for physical replication.
It mentions the GUC used in the same function.
======
src/test/subscription/t/032_apply_delay.pl
10.
+
+# Compare inserted time on the publisher with applied time on the subscriber to
+# confirm the latter is applied after expected time.
+sub check_apply_delay_time
Maybe the comment could also mention that the time is automatically stored in
the table column 'c'.
Added.
~~~
11.
+# Confirm the suspended record doesn't get applied expectedly by the ALTER
+# DISABLE command.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab WHERE a = 0;");
+is($result, qq(0), "check if the delayed transaction doesn't get applied expectedly");
The use of "doesn't get applied expectedly" (in 2 places here) seemed strange.
Maybe it's better to say something like:
SUGGESTION
# Confirm disabling the subscription by ALTER DISABLE did not cause the
# delayed transaction to be applied.
$result = $node_subscriber->safe_psql('postgres',
"SELECT count(a) FROM test_tab WHERE a = 0;");
is($result, qq(0), "check the delayed transaction was not applied");
Fixed.
Kindly have a look at the patch v18.
Best Regards,
Takamichi Osumi
Attachments:
v18-0001-Time-delayed-logical-replication-subscriber.patch (application/octet-stream)
From 737372d78eef4d2aa7ef8407ab0b6c8135fe55e1 Mon Sep 17 00:00:00 2001
From: Takamichi Osumi <osumi.takamichi@fujitsu.com>
Date: Fri, 20 Jan 2023 17:50:47 +0000
Subject: [PATCH v18] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_apply_delay'.
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction commit for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs only on WAL records for transaction begins. The main
reason is to avoid keeping a transaction open for a long time. Regular
and prepared transactions are covered. Streamed transactions are also
covered.
The combination of parallel streaming mode and min_apply_delay is not
allowed. The subscriber in the parallel streaming mode applies each
stream on arrival without the time of commit/prepare. So, the subscriber
needs to depend on the arrival time of the stream in this case. Therefore,
if we apply the time-delayed feature for such transactions, then there is
a possibility where some unnecessary delay will be added on the subscriber
by network communication break between two nodes or other heavy work load
on the publisher. On the other hand, applying the delay at the end of
transaction with parallel apply also can cause issues of used resource
bloat and locks being held open for a long time. Thus, these features cannot
work together.
Author: Euler Taveira
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/config.sgml | 13 ++
doc/src/sgml/logical-replication.sgml | 7 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 57 +++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 125 +++++++++++-
.../replication/logical/applyparallelworker.c | 3 +-
src/backend/replication/logical/worker.c | 169 ++++++++++++++--
src/bin/pg_dump/pg_dump.c | 15 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/datatype/timestamp.h | 2 +
src/include/replication/worker_internal.h | 2 +-
src/test/regress/expected/subscription.out | 181 +++++++++++-------
src/test/regress/sql/subscription.sql | 24 +++
src/test/subscription/meson.build | 1 +
src/test/subscription/t/032_apply_delay.pl | 175 +++++++++++++++++
21 files changed, 708 insertions(+), 105 deletions(-)
create mode 100644 src/test/subscription/t/032_apply_delay.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..bf3c05241c 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7873,6 +7873,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int8</type>
+ </para>
+ <para>
+ The length of time (ms) to delay the application of changes.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 89d53f2a64..13dd422c25 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4753,6 +4753,19 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
the <filename>postgresql.conf</filename> file or on the server
command line.
</para>
+ <para>
+ For time-delayed logical replication (i.e. when the subscription is
+ created with parameter min_apply_delay > 0), the apply worker sends a
+ Standby Status Update message to the publisher with a period of
+ <literal>wal_receiver_status_interval</literal>. Make sure to set
+ <literal>wal_receiver_status_interval</literal> less than the
+ <literal>wal_sender_timeout</literal> on the publisher, otherwise, the
+ walsender will repeatedly terminate due to the timeout errors. If
+ <literal>wal_receiver_status_interval</literal> is set to zero, the apply
+ worker doesn't send any feedback messages during the subscriber's
+ <literal>min_apply_delay</literal> period. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 1bd5660c87..863af11a47 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,13 @@
target table.
</para>
+ <para>
+ The subscriber replication can be instructed to lag behind the publisher
+ side changes by specifying the <literal>min_apply_delay</literal>
+ subscription parameter. See <xref linkend="sql-createsubscription"/> for
+ details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index ad93553a1d..1c6e9dd2d1 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index eba72c6af6..76ee9c0b3d 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,45 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. This
+ parameter allows the user to delay the application of changes by a
+ given time interval. If the value is specified without units, it is
+ taken as milliseconds. The default is zero (no delay).
+ </para>
+ <para>
+ Any delay occurs only on WAL records for transaction begins after all
+ initial table synchronization has finished. The delay is calculated
+ as the difference between the WAL timestamp as written on the
+ publisher and the current time on the subscriber. Any overhead of
+ time spent in logical decoding and in transferring the transaction
+ may reduce the actual wait time. It is also possible that the overhead
+ already exceeds the requested <literal>min_apply_delay</literal> value,
+ in which case no additional wait is necessary. If the system clocks
+ on publisher and subscriber are not synchronized, this may lead to
+ apply changes earlier than expected, but this is not a major issue
+ because this parameter is typically much larger than the time
+ deviations between servers. Note that if this parameter is set to a
+ long delay, the replication will stop if the replication slot falls
+ behind the current LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication can mean there is a much longer time between making
+ a change on the publisher, and that change being committed on the subscriber.
+ This can impact the performance of synchronous replication.
+ See <xref linkend="guc-synchronous-commit"/>.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -413,6 +451,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_apply_delay</literal> parameter is not allowed when streaming
+ in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
@@ -472,6 +515,18 @@ CREATE SUBSCRIPTION mysub
PUBLICATION insert_only
WITH (enabled = false);
</programlisting></para>
+
+ <para>
+ Create a subscription to a remote server that replicates tables in
+ the <literal>mypub</literal> publication and starts replicating immediately
+ on commit. Pre-existing data is not copied. The application of changes is
+ delayed by 4 hours.
+<programlisting>
+CREATE SUBSCRIPTION mysub
+ CONNECTION 'host=192.0.2.4 port=5432 user=foo dbname=foodb'
+ PUBLICATION mypub
+ WITH (copy_data = false, min_apply_delay = '4h');
+</programlisting></para>
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..c767cc1c3a 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -64,6 +64,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->oid = subid;
sub->dbid = subform->subdbid;
sub->skiplsn = subform->subskiplsn;
+ sub->minapplydelay = subform->subminapplydelay;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
sub->enabled = subform->subenabled;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 8608e3fa5b..317c2010cb 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1299,9 +1299,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index baff00dd74..68f6f76102 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -100,7 +102,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int defGetMinApplyDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,16 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ int min_apply_delay;
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ min_apply_delay = defGetMinApplyDelay(defel);
+
+ opts->min_apply_delay = min_apply_delay;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +418,29 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed. The subscriber in the parallel streaming mode applies each
+ * stream on arrival without the time of commit/prepare. So, the
+ * subscriber needs to depend on the arrival time of the stream in this
+ * case. Therefore, if we apply the time-delayed feature for such
+ * transactions, then there is a possibility where some unnecessary delay
+ * will be added on the subscriber by network communication break between
+ * nodes or other heavy work load on the publisher. On the other hand,
+ * applying the delay at the end of transaction with parallel apply also
+ * can cause issues of used resource bloat and locks kept in open for a
+ * long time. Thus, those feature can't be work together.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0)
+ {
+ if (opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
+ }
}
/*
@@ -560,7 +597,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -625,6 +663,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int64GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1054,7 +1093,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1137,18 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
+ if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay > 0) ||
+ (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s mode for subscription with %s",
+ "streaming = parallel", "min_apply_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1162,25 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed.
+ */
+ if (opts.min_apply_delay > 0)
+ if ((IsSet(opts.specified_opts, SUBOPT_STREAMING) && opts.streaming == LOGICALREP_STREAM_PARALLEL) ||
+ (!IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s for subscription in %s mode",
+ "min_apply_delay", "streaming = parallel"));
+
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int64GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2185,3 +2255,52 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+
+/*
+ * Extract the min_apply_delay mode value from a DefElem. This is very similar
+ * to PGC_INT case of parse_and_validate_value(), because min_apply_delay
+ * accepts the same string as recovery_min_apply_delay,
+ */
+int
+defGetMinApplyDelay(DefElem *def)
+{
+ char *value;
+ int result;
+ const char *hintmsg;
+
+ /*
+ * Raise an ERROR if no parameter value given
+ */
+ if (def->arg == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("%s requires an integer value",
+ def->defname)));
+
+ value = defGetString(def);
+
+ /*
+ * Parse given string as parameter which has millisecond unit
+ */
+ if (!parse_int(value, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_apply_delay", value),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check lower bound. parse_int() has been already confirmed that result
+ * is equal to or smaller than INT_MAX.
+ */
+ if (result < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_apply_delay",
+ 0, INT_MAX)));
+
+ return result;
+}
diff --git a/src/backend/replication/logical/applyparallelworker.c b/src/backend/replication/logical/applyparallelworker.c
index 3579e704fe..7302bce7a0 100644
--- a/src/backend/replication/logical/applyparallelworker.c
+++ b/src/backend/replication/logical/applyparallelworker.c
@@ -704,7 +704,8 @@ pa_process_spooled_messages_if_required(void)
{
apply_spooled_messages(&MyParallelShared->fileset,
MyParallelShared->xid,
- InvalidXLogRecPtr);
+ InvalidXLogRecPtr,
+ 0);
pa_set_fileset_state(MyParallelShared, FS_EMPTY);
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index a0084c7ef6..31719db030 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -318,6 +318,17 @@ static List *on_commit_wakeup_workers_subids = NIL;
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
+/*
+ * In order to avoid walsender timeout for time-delayed replication the worker
+ * process keeps sending feedback messages during the delay period.
+ * Meanwhile, the feature delays the apply before starting the
+ * transaction and thus we don't write WALs for the suspended changes during
+ * the wait. When the worker process sends a feedback message
+ * during the delay, we should not make positions of the flushed and apply LSN
+ * overwritten by the last received latest LSN. See send_feedback() for details.
+ */
+static XLogRecPtr last_received = InvalidXLogRecPtr;
+
/* fields valid only when processing streamed transaction */
static bool in_streamed_transaction = false;
@@ -388,10 +399,13 @@ static void stream_write_change(char action, StringInfo s);
static void stream_open_and_write_change(TransactionId xid, char action, StringInfo s);
static void stream_close_file(void);
-static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply);
+static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply,
+ bool in_delayed_apply);
static void DisableSubscriptionAndExit(void);
+static void maybe_delay_apply(TransactionId xid, TimestampTz finish_ts);
+
static void apply_handle_commit_internal(LogicalRepCommitData *commit_data);
static void apply_handle_insert_internal(ApplyExecutionData *edata,
ResultRelInfo *relinfo,
@@ -998,6 +1012,108 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When min_apply_delay parameter is set on the subscriber, we wait long enough
+ * to make sure a transaction is applied at least that interval behind the
+ * publisher.
+ *
+ * While the physical replication applies the delay at commit time, this
+ * feature applies the delay for the next transaction but before starting the
+ * transaction. This is mainly because keeping a transaction that conducted
+ * write operations open for a long time results in some issues such as bloat
+ * and locks.
+ *
+ * The min_apply_delay parameter will take effect only after all tables are in
+ * READY state.
+ *
+ * xid is the transaction id where we apply the delay.
+ *
+ * finish_ts is the commit/prepare time of both regular (non-streamed) and
+ * streamed transactions. Unlike the regular (non-streamed) cases, the delay
+ * is applied in a STREAM COMMIT/STREAM PREPARE message for streamed
+ * transactions. The STREAM START message does not contain a commit/prepare
+ * time (it will be available when the in-progress transaction finishes).
+ * Hence, it's not appropriate to apply a delay at the STREAM START time.
+ */
+static void
+maybe_delay_apply(TransactionId xid, TimestampTz finish_ts)
+{
+ Assert(finish_ts > 0);
+
+ /* Nothing to do if no delay set */
+ if (!MySubscription->minapplydelay)
+ return;
+
+ /*
+ * The min_apply_delay parameter is ignored until all tablesync workers
+ * have reached READY state. This is because if we allowed the delay
+ * during the catchup phase, then once we reached the limit of tablesync
+ * workers it would impose a delay for each subsequent worker. That would
+ * cause initial table synchronization completion to take a long time.
+ */
+ if (!AllTablesyncsReady())
+ return;
+
+ /* Apply the delay by the latch mechanism */
+ while (true)
+ {
+ long diffms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* This might change wal_receiver_status_interval */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /*
+ * Before calculating the time duration, reload the catalog if needed.
+ */
+ if (!in_remote_transaction && !in_streamed_transaction)
+ {
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+ }
+
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(),
+ TimestampTzPlusMilliseconds(finish_ts, MySubscription->minapplydelay));
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "time-delayed replication for txid %u, min_apply_delay = %lld ms, Remaining wait time: %ld ms",
+ xid, (long long) MySubscription->minapplydelay, diffms);
+
+ /*
+ * Call send_feedback() to prevent the publisher from exiting by
+ * timeout during the delay, when wal_receiver_status_interval is
+ * available.
+ */
+ if (wal_receiver_status_interval > 0
+ && diffms > wal_receiver_status_interval * 1000)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ (long) wal_receiver_status_interval * 1000,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(last_received, true, false, true);
+ }
+ else
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ }
+}
+
/*
* Handle BEGIN message.
*/
@@ -1012,6 +1128,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ maybe_delay_apply(begin_data.xid, begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -1069,6 +1188,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ maybe_delay_apply(begin_data.xid, begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1316,7 +1438,8 @@ apply_handle_stream_prepare(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset,
- prepare_data.xid, prepare_data.prepare_lsn);
+ prepare_data.xid, prepare_data.prepare_lsn,
+ prepare_data.prepare_time);
/* Mark the transaction as prepared. */
apply_handle_prepare_internal(&prepare_data);
@@ -2010,10 +2133,13 @@ ensure_last_message(FileSet *stream_fileset, TransactionId xid, int fileno,
/*
* Common spoolfile processing.
+ *
+ * The commit/prepare time for streaming transaction is required to achieve
+ * time-delayed replication.
*/
void
apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn)
+ XLogRecPtr lsn, TimestampTz finish_ts)
{
StringInfoData s2;
int nchanges;
@@ -2024,6 +2150,10 @@ apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
int fileno;
off_t offset;
+ /* Should we delay the current transaction? */
+ if (finish_ts)
+ maybe_delay_apply(xid, finish_ts);
+
if (!am_parallel_apply_worker())
maybe_start_skipping_changes(lsn);
@@ -2173,7 +2303,7 @@ apply_handle_stream_commit(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset, xid,
- commit_data.commit_lsn);
+ commit_data.commit_lsn, commit_data.committime);
apply_handle_commit_internal(&commit_data);
@@ -3446,7 +3576,7 @@ UpdateWorkerStats(XLogRecPtr last_lsn, TimestampTz send_time, bool reply)
* Apply main loop.
*/
static void
-LogicalRepApplyLoop(XLogRecPtr last_received)
+LogicalRepApplyLoop(void)
{
TimestampTz last_recv_timestamp = GetCurrentTimestamp();
bool ping_sent = false;
@@ -3567,7 +3697,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
if (last_received < end_lsn)
last_received = end_lsn;
- send_feedback(last_received, reply_requested, false);
+ send_feedback(last_received, reply_requested, false, false);
UpdateWorkerStats(last_received, timestamp, true);
}
/* other message types are purposefully ignored */
@@ -3580,7 +3710,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
/* confirm all writes so far */
- send_feedback(last_received, false, false);
+ send_feedback(last_received, false, false, false);
if (!in_remote_transaction && !in_streamed_transaction)
{
@@ -3677,7 +3807,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
}
- send_feedback(last_received, requestReply, requestReply);
+ send_feedback(last_received, requestReply, requestReply, false);
/*
* Force reporting to ensure long idle periods don't lead to
@@ -3707,7 +3837,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
* to send a response to avoid timeouts.
*/
static void
-send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
+send_feedback(XLogRecPtr recvpos, bool force, bool requestReply, bool in_delayed_apply)
{
static StringInfo reply_message = NULL;
static TimestampTz send_time = 0;
@@ -3737,8 +3867,15 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * If apply on the subscriber is being delayed (because of time-delayed
+ * replication), do not tell the publisher that the latest received LSN
+ * is already applied and flushed; otherwise the publisher would make a
+ * wrong assumption about logical replication progress. Instead, just
+ * send a feedback message to avoid a publisher timeout during the
+ * delay.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && !in_delayed_apply)
flushpos = writepos = recvpos;
if (writepos < last_writepos)
@@ -3775,11 +3912,12 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
pq_sendint64(reply_message, now); /* sendTime */
pq_sendbyte(reply_message, requestReply); /* replyRequested */
- elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X",
+ elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X in-delayed: %d",
force,
LSN_FORMAT_ARGS(recvpos),
LSN_FORMAT_ARGS(writepos),
- LSN_FORMAT_ARGS(flushpos));
+ LSN_FORMAT_ARGS(flushpos),
+ in_delayed_apply);
walrcv_send(LogRepWorkerWalRcvConn,
reply_message->data, reply_message->len);
@@ -4354,11 +4492,11 @@ start_table_sync(XLogRecPtr *origin_startpos, char **myslotname)
* of system resource error and are not repeatable.
*/
static void
-start_apply(XLogRecPtr origin_startpos)
+start_apply(void)
{
PG_TRY();
{
- LogicalRepApplyLoop(origin_startpos);
+ LogicalRepApplyLoop();
}
PG_CATCH();
{
@@ -4645,7 +4783,8 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- start_apply(origin_startpos);
+ last_received = origin_startpos;
+ start_apply();
proc_exit(0);
}
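
The send_feedback() change in the hunks above comes down to one condition: the write and flush positions catch up to the received position only when nothing is pending and the worker is not waiting out a delay. Below is a minimal standalone sketch of that decision, not the patch's code. The lsn_t typedef, the FeedbackPositions struct and compute_feedback() are names made up for illustration. The real function takes its write and flush positions from the worker's own bookkeeping rather than from arguments.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's XLogRecPtr; illustration only. */
typedef uint64_t lsn_t;

/* Hypothetical container for the positions reported to the publisher. */
typedef struct FeedbackPositions
{
    lsn_t recv;   /* latest LSN received */
    lsn_t write;  /* latest LSN reported as written */
    lsn_t flush;  /* latest LSN reported as flushed */
} FeedbackPositions;

/*
 * Mirror the patched condition: write/flush advance to recvpos only when
 * there are no pending transactions AND apply is not currently delayed.
 */
static FeedbackPositions
compute_feedback(lsn_t recvpos, lsn_t writepos, lsn_t flushpos,
                 bool have_pending_txes, bool in_delayed_apply)
{
    FeedbackPositions p = {recvpos, writepos, flushpos};

    if (!have_pending_txes && !in_delayed_apply)
        p.flush = p.write = recvpos;

    return p;
}
```

With in_delayed_apply set, the reply still carries the received position, so the publisher sees activity and does not time out, but it does not see the flush position move.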
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..c0f69cb43b 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4546,9 +4547,13 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n"
+ " 0 AS subminapplydelay\n",
+ LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4581,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4612,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminapplydelay =
+ strtoi64(PQgetvalue(res, i, i_subminapplydelay), NULL, 10);
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4687,6 +4695,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '" INT64_FORMAT " ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
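
The dumpSubscription() hunk above emits the option only for a positive delay and always writes the value with an explicit "ms" unit. A small standalone sketch of that formatting rule follows. format_min_apply_delay() is a hypothetical helper, and it uses snprintf with PRId64 in place of pg_dump's PQExpBuffer and INT64_FORMAT.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: write the ", min_apply_delay = 'N ms'" fragment into
 * buf when delay_ms is positive, or leave buf empty otherwise. Returns the
 * number of characters snprintf reports, or 0 when nothing is emitted.
 */
static int
format_min_apply_delay(char *buf, size_t buflen, int64_t delay_ms)
{
    if (buflen > 0)
        buf[0] = '\0';

    /* A zero delay is the default, so it is not dumped. */
    if (delay_ms <= 0)
        return 0;

    return snprintf(buf, buflen,
                    ", min_apply_delay = '%" PRId64 " ms'", delay_ms);
}
```

Writing the unit explicitly keeps the dumped statement unambiguous even if the default unit for a bare number were to change.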
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e7cbd8d7ed..e2525f70ab 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int64 subminapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..8a27063bed 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min apply delay (ms)"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e1882eaea..e8b9a43a47 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..e06f35c037 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -70,6 +70,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
XLogRecPtr subskiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 subminapplydelay; /* Replication apply delay (ms) */
+
NameData subname; /* Name of the subscription */
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
@@ -120,6 +122,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 minapplydelay; /* Replication apply delay (ms) */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
bool enabled; /* Indicates if the subscription is enabled */
diff --git a/src/include/datatype/timestamp.h b/src/include/datatype/timestamp.h
index 21a37e21e9..8b368af299 100644
--- a/src/include/datatype/timestamp.h
+++ b/src/include/datatype/timestamp.h
@@ -127,6 +127,8 @@ struct pg_itm_in
#define SECS_PER_MINUTE 60
#define MINS_PER_HOUR 60
+#define MSECS_PER_DAY INT64CONST(86400000)
+
#define USECS_PER_DAY INT64CONST(86400000000)
#define USECS_PER_HOUR INT64CONST(3600000000)
#define USECS_PER_MINUTE INT64CONST(60000000)
diff --git a/src/include/replication/worker_internal.h b/src/include/replication/worker_internal.h
index dc87a4edd1..3dc09d1a4c 100644
--- a/src/include/replication/worker_internal.h
+++ b/src/include/replication/worker_internal.h
@@ -255,7 +255,7 @@ extern void stream_stop_internal(TransactionId xid);
/* Common streaming function to apply all the spooled messages */
extern void apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn);
+ XLogRecPtr lsn, TimestampTz finish_ts);
extern void apply_dispatch(StringInfo s);
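
The new finish_ts argument gives the apply worker the transaction's finish time, which it needs to decide whether to wait before replaying spooled changes. The body of maybe_delay_apply() is not shown in this part of the patch, so the following is only a rough sketch of the arithmetic such a check might use. It assumes timestamps in microseconds and a delay in milliseconds. ts_usec_t and remaining_delay_ms() are hypothetical names.

```c
#include <stdint.h>

/* Stand-in for TimestampTz: microseconds since some epoch (assumption). */
typedef int64_t ts_usec_t;

/*
 * Hypothetical helper: milliseconds the worker still has to wait before
 * applying a transaction that finished at finish_ts, given the
 * subscription's delay in milliseconds. Returns 0 when no wait is needed.
 */
static int64_t
remaining_delay_ms(ts_usec_t finish_ts, ts_usec_t now,
                   int64_t min_apply_delay_ms)
{
    ts_usec_t apply_at = finish_ts + min_apply_delay_ms * 1000;

    if (now >= apply_at)
        return 0;

    /* Round up so the worker never wakes before the deadline. */
    return (apply_at - now + 999) / 1000;
}
```

Basing the deadline on the publisher-side finish time rather than on the time of receipt means time spent in transit counts toward the delay.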
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 4e5cb0d3a9..89ca5505f8 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -135,10 +135,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -155,10 +155,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -167,10 +167,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -202,10 +202,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -239,19 +239,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -263,27 +263,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -298,10 +298,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -316,10 +316,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -355,10 +355,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -367,10 +367,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -380,10 +380,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -396,20 +396,57 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid value for parameter "min_apply_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_apply_delay" (0 .. 2147483647)
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported.
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+ERROR: min_apply_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set.
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot enable streaming = parallel mode for subscription with min_apply_delay
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set.
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ERROR: cannot enable min_apply_delay for subscription in streaming = parallel mode
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 5f27b7d776..d6b893a3d0 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -279,6 +279,30 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported.
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+\dRs+
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set.
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set.
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/meson.build b/src/test/subscription/meson.build
index 3db0fdfd96..a186876eb4 100644
--- a/src/test/subscription/meson.build
+++ b/src/test/subscription/meson.build
@@ -38,6 +38,7 @@ tests += {
't/029_on_error.pl',
't/030_origin.pl',
't/031_column_list.pl',
+ 't/032_apply_delay.pl',
't/100_bugs.pl',
],
},
diff --git a/src/test/subscription/t/032_apply_delay.pl b/src/test/subscription/t/032_apply_delay.pl
new file mode 100644
index 0000000000..51dde83cec
--- /dev/null
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,175 @@
+
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test replication apply delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Confirm the time-delayed replication has been effective from the server log
+# message where the apply worker emits for applying delay. Moreover, verifies
+# that the current worker's delayed time is sufficiently bigger than the
+# expected value, in order to check any update of the min_apply_delay.
+sub check_apply_delay_log
+{
+ my ($node_subscriber, $offset, $expected) = @_;
+
+ my $log_location = $node_subscriber->wait_for_log(qr/time-delayed replication for txid (\d+), min_apply_delay = (\d+) ms, Remaining wait time: (\d+) ms/, $offset);
+
+ cmp_ok($log_location, '>', $offset,
+ "logfile contains triggered logical replication apply delay"
+ );
+
+ # Get the delay time from the server log
+ my $contents = slurp_file($node_subscriber->logfile, $offset);
+ $contents =~
+ qr/time-delayed replication for txid (\d+), min_apply_delay = (\d+) ms, Remaining wait time: (\d+) ms/
+ or die "could not get the apply worker wait time";
+ my $logged_delay = $3;
+
+ # Is it larger than expected?
+ cmp_ok($logged_delay, '>', $expected,
+ "The apply worker wait time has expected duration"
+ );
+}
+
+# Compare inserted time on the publisher with applied time on the subscriber to
+# confirm the latter is applied after expected time. The time is automatically
+# generated and stored in the table column 'c'.
+sub check_apply_delay_time
+{
+ my ($node_publisher, $node_subscriber, $primary_key, $expected_diffs) = @_;
+
+ my $inserted_time_on_pub = $node_publisher->safe_psql('postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ my $inserted_time_on_sub = $node_subscriber->safe_psql('postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ cmp_ok($inserted_time_on_sub - $inserted_time_on_pub, '>', $expected_diffs,
+ "The tuple on the subscriber was modified later than the publisher");
+}
+
+# Create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf('postgresql.conf',
+ "log_min_messages = debug2");
+$node_subscriber->start;
+
+# Setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b varchar, c timestamptz (6) DEFAULT now())");
+
+# Setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b text, c timestamptz (6) DEFAULT now())"
+);
+
+# Setup logical replication
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+
+# The column c must not be published because we want to compare the time
+# difference.
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE test_tab (a, b)");
+
+my $appname = 'tap_sub';
+
+# Create a subscription that applies the transaction after 50 milliseconds delay
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (copy_data = off, min_apply_delay = '50ms', streaming = 'on')"
+);
+
+# New row to trigger apply delay
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (1, 'foo')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (2, 'bar')");
+
+$node_publisher->wait_for_catchup($appname);
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(2|1|2), 'check if the new rows were applied to subscriber');
+
+# Note that we cannot call check_apply_delay_log() here because there is a
+# possibility that the delay is skipped. The event happens when the WAL
+# replication between publisher and subscriber is delayed due to a mechanical
+# problem. The log output will be checked later - substantial delay-time case.
+
+# Verify that the subscriber lags the publisher by at least 50 milliseconds
+check_apply_delay_time($node_publisher, $node_subscriber, '2', '0.05');
+
+# Setup for streaming case
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_mode = immediate');
+$node_publisher->reload;
+
+# Run a query to make sure that the reload has taken effect.
+$node_publisher->safe_psql('postgres', q{SELECT 1});
+
+# Test streamed transaction by insert
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(3, 5) s(i);");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(5|1|5), 'check if the new rows were applied to subscriber');
+
+# Verify that the subscriber lags the publisher by at least 50 milliseconds
+check_apply_delay_time($node_publisher, $node_subscriber, '5', '0.05');
+
+# Test whether ALTER SUBSCRIPTION changes the delayed time of the apply worker
+# (1 day 5 minutes). Note that the extra 5 minute is to account for any
+# decoding/network overhead.
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86700000)"
+);
+
+# Check log starting now for logical replication apply delay
+my $offset = -s $node_subscriber->logfile;
+
+# New row to trigger apply delay
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (0, 'foobar')");
+
+# Make sure the apply worker knows to wait for more than 1 day
+check_apply_delay_log($node_subscriber, $offset, "86400000");
+
+# Disable subscription and the worker should die immediately
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub DISABLE;"
+);
+
+# Wait until worker dies
+my $sub_query =
+ "SELECT count(1) = 0 FROM pg_stat_subscription WHERE subname = 'tap_sub' AND pid IS NOT NULL;";
+$node_subscriber->poll_query_until('postgres', $sub_query)
+ or die "Timed out while waiting for subscriber to die";
+
+# Confirm disabling the subscription by ALTER SUBSCRIPTION DISABLE did not cause
+# the delayed transaction to be applied.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab WHERE a = 0;");
+is($result, qq(0), "check the delayed transaction was not applied");
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
--
2.30.0
Hi,
On Friday, January 20, 2023 6:13 PM shveta malik <shveta.malik@gmail.com> wrote:
On Fri, Jan 20, 2023 at 2:23 PM shveta malik <shveta.malik@gmail.com> wrote:
On Fri, Jan 20, 2023 at 1:08 PM Peter Smith <smithpb2250@gmail.com>
wrote:
a) the message should say that this is the *remaining* time left to wait.
b) it might be convenient to know from the log what was the original
min_apply_delay value in the 1st place.

For example, the logs might look something like this:
DEBUG: time-delayed replication for txid 1234, min_apply_delay = 160000 ms. Remaining wait time: 159972 ms
DEBUG: time-delayed replication for txid 1234, min_apply_delay = 160000 ms. Remaining wait time: 142828 ms
DEBUG: time-delayed replication for txid 1234, min_apply_delay = 160000 ms. Remaining wait time: 129994 ms
DEBUG: time-delayed replication for txid 1234, min_apply_delay = 160000 ms. Remaining wait time: 110001 ms
...

+1
This will also help when min_apply_delay is set to a new value in
between the current wait. Lets say, I started with min_apply_delay=5
min, when the worker was half way through this, I changed
min_apply_delay to 3 min or say 10min, I see the impact of that change
i.e. new wait-time is adjusted, but log becomes confusing. So, please
keep this scenario as well in mind while improving logging.

When we send feedback during apply-delay after every
wal_receiver_status_interval, the log comes as:
023-01-19 17:12:56.000 IST [404795] DEBUG: sending feedback (force 1) to recv 0/1570840, write 0/1570840, flush 0/1570840

Shall we have some info here to indicate that it is sent while waiting for
apply_delay to distinguish it from other such send-feedback logs? It will
make apply_delay flow clear in logs.
This additional tip of log information has been added in the latest v18.
Kindly have a look at it in [1].
[1]: /messages/by-id/TYCPR01MB8373BED9E390C4839AF56685EDC59@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
On Friday, January 20, 2023 5:54 PM shveta malik <shveta.malik@gmail.com> wrote:
On Fri, Jan 20, 2023 at 1:08 PM Peter Smith <smithpb2250@gmail.com> wrote:
a) the message should say that this is the *remaining* time left to wait.
b) it might be convenient to know from the log what was the original
min_apply_delay value in the 1st place.

For example, the logs might look something like this:
DEBUG: time-delayed replication for txid 1234, min_apply_delay = 160000 ms. Remaining wait time: 159972 ms
DEBUG: time-delayed replication for txid 1234, min_apply_delay = 160000 ms. Remaining wait time: 142828 ms
DEBUG: time-delayed replication for txid 1234, min_apply_delay = 160000 ms. Remaining wait time: 129994 ms
DEBUG: time-delayed replication for txid 1234, min_apply_delay = 160000 ms. Remaining wait time: 110001 ms
...

+1
This will also help when min_apply_delay is set to a new value in between the
current wait. Lets say, I started with min_apply_delay=5 min, when the worker
was half way through this, I changed min_apply_delay to 3 min or say 10min, I
see the impact of that change i.e. new wait-time is adjusted, but log becomes
confusing. So, please keep this scenario as well in mind while improving
logging.
Yes, now the change of min_apply_delay value can be detected
since I followed the format provided above. So, this scenario is also covered.
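
As a rough illustration of why this log format makes a mid-wait change of min_apply_delay visible, here is a minimal standalone sketch in C. It is not code from the patch: the helper names remaining_wait_ms and format_delay_log are hypothetical, timestamps are plain millisecond counters, and the message text follows the example quoted above.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: how long the apply worker still has to wait.
 * commit_ts_ms is the publisher-side timestamp of the transaction and
 * now_ms is the subscriber's current time, both in milliseconds.
 * The result is clamped at zero when the delay has already elapsed.
 */
static int64_t
remaining_wait_ms(int64_t commit_ts_ms, int64_t now_ms,
                  int64_t min_apply_delay_ms)
{
    int64_t elapsed = now_ms - commit_ts_ms;
    int64_t remaining = min_apply_delay_ms - elapsed;

    return remaining > 0 ? remaining : 0;
}

/*
 * Hypothetical helper: build the debug line. Printing both the configured
 * delay and the remaining time means that a changed min_apply_delay shows
 * up in the very next message.
 */
static int
format_delay_log(char *buf, size_t len, uint32_t txid,
                 int64_t min_apply_delay_ms, int64_t remaining_ms)
{
    return snprintf(buf, len,
                    "time-delayed replication for txid %" PRIu32
                    ", min_apply_delay = %" PRId64
                    " ms. Remaining wait time: %" PRId64 " ms",
                    txid, min_apply_delay_ms, remaining_ms);
}
```

Because the remaining time is recomputed from the configured delay on every iteration, lowering or raising min_apply_delay during a wait changes both numbers in the next log line.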
Best Regards,
Takamichi Osumi
On Friday, January 20, 2023 12:47 PM shveta malik <shveta.malik@gmail.com> wrote:
1)
Tried different variations of altering 'min_apply_delay'. All passed except one
below:

postgres=# alter subscription mysubnew set (min_apply_delay = '10.9min 1ms');
ALTER SUBSCRIPTION
postgres=# alter subscription mysubnew set (min_apply_delay = '10.9min 2s 1ms');
ALTER SUBSCRIPTION
--very similar to above but fails,
postgres=# alter subscription mysubnew set (min_apply_delay = '10.9s 1ms');
ERROR: invalid input syntax for type interval: "10.9s 1ms"
FYI, this was because the interval type couldn't accept this format.
But now we have changed the input type from interval to integer, aligned
with recovery_min_apply_delay. Thus, we don't face this issue now.
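
To make the integer-with-unit behaviour concrete, the following is a simplified standalone sketch in C. It is not the patch's parser: parse_delay_ms is a hypothetical name, and the unit list (ms, s, min, h, d) and the 0 .. 2147483647 range are assumptions taken from the regression test output earlier in the thread.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical parser: convert "<integer>[ ][unit]" into milliseconds.
 * A value without a unit is taken as milliseconds. Returns false for
 * non-numeric input, unknown units, negative values, or overflow.
 */
static bool
parse_delay_ms(const char *input, int *result_ms)
{
    char       *end;
    long long   value;
    long long   mult;

    errno = 0;
    value = strtoll(input, &end, 10);
    if (end == input || errno == ERANGE)
        return false;           /* no digits at all, e.g. "foo" */

    /* allow whitespace between the number and the unit, e.g. "1 d" */
    while (isspace((unsigned char) *end))
        end++;

    if (*end == '\0' || strcmp(end, "ms") == 0)
        mult = 1;
    else if (strcmp(end, "s") == 0)
        mult = 1000;
    else if (strcmp(end, "min") == 0)
        mult = 60 * 1000;
    else if (strcmp(end, "h") == 0)
        mult = 60 * 60 * 1000;
    else if (strcmp(end, "d") == 0)
        mult = 24 * 60 * 60 * 1000;
    else
        return false;           /* fractional or compound input, e.g. "10.9s 1ms" */

    /* the result must fit the non-negative int range */
    if (value < 0 || value > INT_MAX / mult)
        return false;

    *result_ms = (int) (value * mult);
    return true;
}
```

With this sketch, '123' stays 123 ms and '1 d' becomes 86400000 ms, matching the values shown by \dRs+ in the regression output, while '10.9s 1ms' is rejected outright instead of depending on interval parsing rules.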
Best Regards,
Takamichi Osumi
Hi,
On Thursday, January 19, 2023 10:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Jan 19, 2023 at 12:06 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
Kindly have a look at the updated patch v17.
Can we try to optimize the test time for this test? On my machine, it is the
second highest time-consuming test in src/test/subscription. It seems you are
waiting twice for apply_delay and both are for streaming cases by varying the
number of changes. I think it should be just once and that too for the
non-streaming case. I think it would be good to test streaming code path
interaction but not sure if it is important enough to have two test cases for
apply_delay.
The first insert test is for the non-streaming case, and we need both cases
for coverage. Regarding the test time, I made some optimizations
such as turning off the initial table sync, shortening the wait time, and so on.
One minor comment that I observed while going through the patch.
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0)

I think it would be good if you can specify the reason for not allowing this
combination in the comments.
Added.
Please have a look at the latest v18 patch in [1].
[1]: /messages/by-id/TYCPR01MB8373BED9E390C4839AF56685EDC59@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
Hi,
On Thursday, January 19, 2023 7:55 PM vignesh C <vignesh21@gmail.com> wrote:
On Thu, 19 Jan 2023 at 12:06, Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
Updated the comment and the function call.
Kindly have a look at the updated patch v17.
Thanks for the updated patch, few comments:
1) min_apply_delay was accepting values like '600 m s h', I was not sure if we
should allow this:
alter subscription sub1 set (min_apply_delay = ' 600 m s h');

+ /*
+ * If no unit was specified, then explicitly add 'ms' otherwise
+ * the interval_in function would assume 'seconds'.
+ */
+ if (strspn(tmp, "-0123456789 ") == strlen(tmp))
+ val = psprintf("%sms", tmp);
+ else
+ val = tmp;
+
+ interval = DatumGetIntervalP(DirectFunctionCall3(interval_in,
+ CStringGetDatum(val),
+ ObjectIdGetDatum(InvalidOid),
+ Int32GetDatum(-1)));
FYI, that input is accepted by the interval type.
We have now changed the type from interval to integer,
although a unit can still be added, as with recovery_min_apply_delay.
Please check.
3) There is one check at parse_subscription_options and another check in
AlterSubscription, this looks like a redundant check in case of alter
subscription, can we try to merge and keep in one place:
/*
 * The combination of parallel streaming mode and min_apply_delay is not
 * allowed.
 */
if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
    opts->min_apply_delay > 0)
{
    if (opts->streaming == LOGICALREP_STREAM_PARALLEL)
        ereport(ERROR,
                errcode(ERRCODE_SYNTAX_ERROR),
                errmsg("%s and %s are mutually exclusive options",
                       "min_apply_delay > 0", "streaming = parallel"));
}

if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
{
    /*
     * The combination of parallel streaming mode and
     * min_apply_delay is not allowed.
     */
    if (opts.min_apply_delay > 0)
        if ((IsSet(opts.specified_opts, SUBOPT_STREAMING) && opts.streaming == LOGICALREP_STREAM_PARALLEL) ||
            (!IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL))
            ereport(ERROR,
                    errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                    errmsg("cannot enable %s for subscription in %s mode",
                           "min_apply_delay", "streaming = parallel"));

    values[Anum_pg_subscription_subminapplydelay - 1] =
        Int64GetDatum(opts.min_apply_delay);
    replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
}
We can't. For CREATE SUBSCRIPTION, the check has to be made in
parse_subscription_options, while for ALTER SUBSCRIPTION
we need to refer to the current MySubscription value for those tests
in AlterSubscription.
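
The reason the ALTER path needs the current catalog state can be shown with a small standalone sketch in C. This is only an illustration under assumptions: the types StreamMode, AlterOpts and CurrentSub and the function alter_is_allowed are hypothetical stand-ins, not the structures used in subscriptioncmds.c.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the streaming modes of a subscription. */
typedef enum
{
    STREAM_OFF,
    STREAM_ON,
    STREAM_PARALLEL
} StreamMode;

/* Options given in one ALTER SUBSCRIPTION ... SET (...) command. */
typedef struct
{
    bool        streaming_specified;
    StreamMode  streaming;
    bool        delay_specified;
    int         min_apply_delay;    /* milliseconds */
} AlterOpts;

/* Values currently stored for the subscription. */
typedef struct
{
    StreamMode  stream;
    int         min_apply_delay;    /* milliseconds */
} CurrentSub;

/*
 * An option that is not specified in the command keeps its stored value,
 * so the check has to combine the command's options with the current
 * state. CREATE SUBSCRIPTION has no stored state, which is why it can be
 * checked from the parsed options alone.
 */
static bool
alter_is_allowed(const AlterOpts *opts, const CurrentSub *sub)
{
    StreamMode  eff_stream = opts->streaming_specified ?
        opts->streaming : sub->stream;
    int         eff_delay = opts->delay_specified ?
        opts->min_apply_delay : sub->min_apply_delay;

    return !(eff_stream == STREAM_PARALLEL && eff_delay > 0);
}
```

This mirrors the regression cases in the patch: setting streaming = parallel alone fails while a delay is stored, setting min_apply_delay = 0 together with streaming = parallel succeeds, and a later attempt to set a non-zero delay fails.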
4) typo "execeeds" should be "exceeds"
+ time on the subscriber. Any overhead of time spent in logical decoding
+ and in transferring the transaction may reduce the actual wait time.
+ It is also possible that the overhead already execeeds the requested
+ <literal>min_apply_delay</literal> value, in which case no additional
+ wait is necessary. If the system clocks on publisher and subscriber
+ are not synchronized, this may lead to apply changes earlier
+ than
Fixed.
Kindly have a look at the v18 patch in [1].
[1]: /messages/by-id/TYCPR01MB8373BED9E390C4839AF56685EDC59@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
On Saturday, January 21, 2023 3:36 AM I wrote:
Kindly have a look at the patch v18.
I've conducted some refactoring for v18.
Now the latest patch should be tidier and
the comments would be clearer and more aligned as a whole.
Attached the updated patch v19.
Best Regards,
Takamichi Osumi
Attachments:
v19-0001-Time-delayed-logical-replication-subscriber.patch (application/octet-stream)
From 04d5055a353e31f37d8cafca2a672a8f31041193 Mon Sep 17 00:00:00 2001
From: Takamichi Osumi <osumi.takamichi@fujitsu.com>
Date: Sun, 22 Jan 2023 12:01:26 +0000
Subject: [PATCH v19] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_apply_delay'.
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction commit for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs only on WAL records for transaction begins. The main
reason is to avoid keeping a transaction open for a long time. Regular
and prepared transactions are covered. Streamed transactions are also
covered.
The combination of parallel streaming mode and min_apply_delay is not
allowed. The subscriber in the parallel streaming mode applies each
stream on arrival without the time of commit/prepare. So, the
subscriber needs to depend on the arrival time of the stream in this
case, if we apply the time-delayed feature for such transactions. Then
there is a possibility where some unnecessary delay will be added on
the subscriber by network communication break between nodes or other
heavy work load on the publisher. On the other hand, applying the delay
at the end of transaction with parallel apply also can cause issues of
used resource bloat and locks kept in open for a long time. Thus, those
features can't work together.
Author: Euler Taveira
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/config.sgml | 13 ++
doc/src/sgml/logical-replication.sgml | 7 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 57 +++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 118 +++++++++++-
.../replication/logical/applyparallelworker.c | 3 +-
src/backend/replication/logical/worker.c | 169 ++++++++++++++--
src/bin/pg_dump/pg_dump.c | 15 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/replication/worker_internal.h | 2 +-
src/test/regress/expected/subscription.out | 181 +++++++++++-------
src/test/regress/sql/subscription.sql | 24 +++
src/test/subscription/meson.build | 1 +
src/test/subscription/t/032_apply_delay.pl | 175 +++++++++++++++++
20 files changed, 699 insertions(+), 105 deletions(-)
create mode 100644 src/test/subscription/t/032_apply_delay.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..bf3c05241c 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7873,6 +7873,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int8</type>
+ </para>
+ <para>
+ The length of time (ms) to delay the application of changes.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index dc9b78b0b7..b8f6120b0d 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4787,6 +4787,19 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
the <filename>postgresql.conf</filename> file or on the server
command line.
</para>
+ <para>
+ For time-delayed logical replication (i.e. when the subscription is
+ created with parameter min_apply_delay > 0), the apply worker sends a
+ Standby Status Update message to the publisher with a period of
+ <literal>wal_receiver_status_interval</literal>. Make sure to set
+ <literal>wal_receiver_status_interval</literal> less than the
+ <literal>wal_sender_timeout</literal> on the publisher, otherwise, the
+ walsender will repeatedly terminate due to the timeout errors. If
+ <literal>wal_receiver_status_interval</literal> is set to zero, the apply
+ worker doesn't send any feedback messages during the subscriber's
+ <literal>min_apply_delay</literal> period. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 1bd5660c87..863af11a47 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,13 @@
target table.
</para>
+ <para>
+ The subscriber replication can be instructed to lag behind the publisher
+ side changes by specifying the <literal>min_apply_delay</literal>
+ subscription parameter. See <xref linkend="sql-createsubscription"/> for
+ details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index ad93553a1d..1c6e9dd2d1 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index eba72c6af6..76ee9c0b3d 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,45 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. This
+ parameter allows the user to delay the application of changes by a
+ given time interval. If the value is specified without units, it is
+ taken as milliseconds. The default is zero (no delay).
+ </para>
+ <para>
+ Any delay occurs only on WAL records for transaction begins after all
+ initial table synchronization has finished. The delay is calculated
+ as the difference between the WAL timestamp as written on the
+ publisher and the current time on the subscriber. Any overhead of
+ time spent in logical decoding and in transferring the transaction
+ may reduce the actual wait time. It is also possible that the overhead
+ already exceeds the requested <literal>min_apply_delay</literal> value,
+ in which case no additional wait is necessary. If the system clocks
+ on publisher and subscriber are not synchronized, this may lead to
+ apply changes earlier than expected, but this is not a major issue
+ because this parameter is typically much larger than the time
+ deviations between servers. Note that if this parameter is set to a
+ long delay, the replication will stop if the replication slot falls
+ behind the current LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication can mean there is a much longer time between making
+ a change on the publisher, and that change being committed on the subscriber.
+ This can impact the performance of synchronous replication.
+ See <xref linkend="guc-synchronous-commit"/>.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -413,6 +451,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_apply_delay</literal> parameter is not allowed when streaming
+ in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
@@ -472,6 +515,18 @@ CREATE SUBSCRIPTION mysub
PUBLICATION insert_only
WITH (enabled = false);
</programlisting></para>
+
+ <para>
+ Create a subscription to a remote server that replicates tables in
+ the <literal>mypub</literal> publication and starts replicating immediately
+ on commit. Pre-existing data is not copied. The application of changes is
+ delayed by 4 hours.
+<programlisting>
+CREATE SUBSCRIPTION mysub
+ CONNECTION 'host=192.0.2.4 port=5432 user=foo dbname=foodb'
+ PUBLICATION mypub
+ WITH (copy_data = false, min_apply_delay = '4h');
+</programlisting></para>
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..c767cc1c3a 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -64,6 +64,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->oid = subid;
sub->dbid = subform->subdbid;
sub->skiplsn = subform->subskiplsn;
+ sub->minapplydelay = subform->subminapplydelay;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
sub->enabled = subform->subenabled;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 8608e3fa5b..317c2010cb 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1299,9 +1299,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index baff00dd74..a5d30ec585 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -100,7 +102,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int defGetMinApplyDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,12 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ opts->min_apply_delay = defGetMinApplyDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +414,26 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed. In parallel streaming mode the subscriber applies each stream
+ * on arrival, before the commit/prepare time is known, so a delay could
+ * only be based on the arrival time of the stream. That could add
+ * unnecessary delay on the subscriber after a network interruption
+ * between the nodes or a heavy workload on the publisher. Applying the
+ * delay at the end of the transaction instead would keep the parallel
+ * apply transaction open for the whole delay, bloating resources and
+ * holding locks for a long time. Thus, the two features can't work
+ * together.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0 && opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
}
/*
@@ -560,7 +590,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -625,6 +656,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int64GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1054,7 +1086,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1130,18 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
+ if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay > 0) ||
+ (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s mode for subscription with %s",
+ "streaming = parallel", "min_apply_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1155,25 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed.
+ */
+ if (opts.min_apply_delay > 0)
+ if ((IsSet(opts.specified_opts, SUBOPT_STREAMING) && opts.streaming == LOGICALREP_STREAM_PARALLEL) ||
+ (!IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s for subscription in %s mode",
+ "min_apply_delay", "streaming = parallel"));
+
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int64GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2185,3 +2248,52 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+
+/*
+ * Extract the min_apply_delay value from a DefElem. This is very similar to
+ * the PGC_INT case of parse_and_validate_value(), because min_apply_delay
+ * accepts the same strings as recovery_min_apply_delay.
+ */
+static int
+defGetMinApplyDelay(DefElem *def)
+{
+ char *value;
+ int result;
+ const char *hintmsg;
+
+ /*
+ * Raise an ERROR if no parameter value was given
+ */
+ if (def->arg == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("%s requires an integer value",
+ def->defname)));
+
+ value = defGetString(def);
+
+ /*
+ * Parse the given string as a parameter whose unit is milliseconds
+ */
+ if (!parse_int(value, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_apply_delay", value),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check the lower bound. parse_int() has already confirmed that the
+ * result is less than or equal to INT_MAX.
+ */
+ if (result < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_apply_delay",
+ 0, INT_MAX)));
+
+ return result;
+}
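
Reviewer note: the two ALTER SUBSCRIPTION checks in this file appear to reduce to one rule, namely that the subscription resulting from the command must not combine a non-zero delay with parallel streaming. The sketch below is only an illustration of that rule, not code from the patch. The helper name delay_conflicts_with_parallel and the STREAM_PARALLEL constant are hypothetical; the value 'p' is assumed to mirror LOGICALREP_STREAM_PARALLEL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: stands in for LOGICALREP_STREAM_PARALLEL; hypothetical here. */
#define STREAM_PARALLEL 'p'

/*
 * Hypothetical helper: returns true when the subscription that would result
 * from ALTER SUBSCRIPTION ... SET (...) combines a non-zero delay with
 * parallel streaming. An option that is not named in the command keeps its
 * current catalog value.
 */
bool
delay_conflicts_with_parallel(bool delay_given, int64_t new_delay,
                              int64_t cur_delay, bool stream_given,
                              char new_stream, char cur_stream)
{
    /* Negative delays are rejected earlier, at option parsing time. */
    assert(new_delay >= 0 && cur_delay >= 0);

    int64_t delay = delay_given ? new_delay : cur_delay;
    char    stream = stream_given ? new_stream : cur_stream;

    return delay > 0 && stream == STREAM_PARALLEL;
}
```

Unlike the patch, this sketch also evaluates the rule when neither option is given; that case should never report a conflict if the catalog is already consistent.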
diff --git a/src/backend/replication/logical/applyparallelworker.c b/src/backend/replication/logical/applyparallelworker.c
index 3579e704fe..7302bce7a0 100644
--- a/src/backend/replication/logical/applyparallelworker.c
+++ b/src/backend/replication/logical/applyparallelworker.c
@@ -704,7 +704,8 @@ pa_process_spooled_messages_if_required(void)
{
apply_spooled_messages(&MyParallelShared->fileset,
MyParallelShared->xid,
- InvalidXLogRecPtr);
+ InvalidXLogRecPtr,
+ 0);
pa_set_fileset_state(MyParallelShared, FS_EMPTY);
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index a0084c7ef6..31719db030 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -318,6 +318,17 @@ static List *on_commit_wakeup_workers_subids = NIL;
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
+/*
+ * To avoid a walsender timeout during time-delayed replication, the worker
+ * process keeps sending feedback messages during the delay period. Because
+ * the delay is applied before the transaction starts, no WAL is written for
+ * the suspended changes during the wait. So when the worker process sends a
+ * feedback message during the delay, it must not overwrite the flush and
+ * apply positions with the last received LSN. See send_feedback() for details.
+ */
+static XLogRecPtr last_received = InvalidXLogRecPtr;
+
/* fields valid only when processing streamed transaction */
static bool in_streamed_transaction = false;
@@ -388,10 +399,13 @@ static void stream_write_change(char action, StringInfo s);
static void stream_open_and_write_change(TransactionId xid, char action, StringInfo s);
static void stream_close_file(void);
-static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply);
+static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply,
+ bool in_delayed_apply);
static void DisableSubscriptionAndExit(void);
+static void maybe_delay_apply(TransactionId xid, TimestampTz finish_ts);
+
static void apply_handle_commit_internal(LogicalRepCommitData *commit_data);
static void apply_handle_insert_internal(ApplyExecutionData *edata,
ResultRelInfo *relinfo,
@@ -998,6 +1012,108 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When the min_apply_delay parameter is set on the subscriber, we wait long
+ * enough to make sure a transaction is applied at least that interval behind
+ * the publisher.
+ *
+ * While physical replication applies the delay at commit time, this feature
+ * applies the delay before starting to apply the transaction. This is mainly
+ * because keeping a transaction that has performed write operations open for
+ * a long time causes problems such as bloat and long-held locks.
+ *
+ * The min_apply_delay parameter will take effect only after all tables are in
+ * READY state.
+ *
+ * xid is the ID of the transaction to which the delay is applied.
+ *
+ * finish_ts is the commit/prepare time of both regular (non-streamed) and
+ * streamed transactions. Unlike the regular (non-streamed) cases, the delay
+ * is applied in a STREAM COMMIT/STREAM PREPARE message for streamed
+ * transactions. The STREAM START message does not contain a commit/prepare
+ * time (it will be available when the in-progress transaction finishes).
+ * Hence, it's not appropriate to apply a delay at the STREAM START time.
+ */
+static void
+maybe_delay_apply(TransactionId xid, TimestampTz finish_ts)
+{
+ Assert(finish_ts > 0);
+
+ /* Nothing to do if no delay set */
+ if (!MySubscription->minapplydelay)
+ return;
+
+ /*
+ * The min_apply_delay parameter is ignored until all tablesync workers
+ * have reached READY state. This is because if we allowed the delay
+ * during the catchup phase, then once we reached the limit of tablesync
+ * workers it would impose a delay for each subsequent worker. That would
+ * cause initial table synchronization completion to take a long time.
+ */
+ if (!AllTablesyncsReady())
+ return;
+
+ /* Apply the delay by the latch mechanism */
+ while (true)
+ {
+ long diffms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* This might change wal_receiver_status_interval */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /*
+ * Before calculating the time duration, reload the catalog if needed.
+ */
+ if (!in_remote_transaction && !in_streamed_transaction)
+ {
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+ }
+
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(),
+ TimestampTzPlusMilliseconds(finish_ts, MySubscription->minapplydelay));
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "time-delayed replication for txid %u, min_apply_delay = %lld ms, remaining wait time: %ld ms",
+ xid, (long long) MySubscription->minapplydelay, diffms);
+
+ /*
+ * Call send_feedback() to prevent the publisher from exiting by
+ * timeout during the delay, when wal_receiver_status_interval is
+ * enabled (non-zero).
+ */
+ if (wal_receiver_status_interval > 0
+ && diffms > wal_receiver_status_interval * 1000)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ (long) wal_receiver_status_interval * 1000,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(last_received, true, false, true);
+ }
+ else
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ }
+}
+
/*
* Handle BEGIN message.
*/
@@ -1012,6 +1128,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ maybe_delay_apply(begin_data.xid, begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -1069,6 +1188,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ maybe_delay_apply(begin_data.xid, begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1316,7 +1438,8 @@ apply_handle_stream_prepare(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset,
- prepare_data.xid, prepare_data.prepare_lsn);
+ prepare_data.xid, prepare_data.prepare_lsn,
+ prepare_data.prepare_time);
/* Mark the transaction as prepared. */
apply_handle_prepare_internal(&prepare_data);
@@ -2010,10 +2133,13 @@ ensure_last_message(FileSet *stream_fileset, TransactionId xid, int fileno,
/*
* Common spoolfile processing.
+ *
+ * The commit/prepare time of a streamed transaction is required for
+ * time-delayed replication.
*/
void
apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn)
+ XLogRecPtr lsn, TimestampTz finish_ts)
{
StringInfoData s2;
int nchanges;
@@ -2024,6 +2150,10 @@ apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
int fileno;
off_t offset;
+ /* Should we delay the current transaction? */
+ if (finish_ts)
+ maybe_delay_apply(xid, finish_ts);
+
if (!am_parallel_apply_worker())
maybe_start_skipping_changes(lsn);
@@ -2173,7 +2303,7 @@ apply_handle_stream_commit(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset, xid,
- commit_data.commit_lsn);
+ commit_data.commit_lsn, commit_data.committime);
apply_handle_commit_internal(&commit_data);
@@ -3446,7 +3576,7 @@ UpdateWorkerStats(XLogRecPtr last_lsn, TimestampTz send_time, bool reply)
* Apply main loop.
*/
static void
-LogicalRepApplyLoop(XLogRecPtr last_received)
+LogicalRepApplyLoop(void)
{
TimestampTz last_recv_timestamp = GetCurrentTimestamp();
bool ping_sent = false;
@@ -3567,7 +3697,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
if (last_received < end_lsn)
last_received = end_lsn;
- send_feedback(last_received, reply_requested, false);
+ send_feedback(last_received, reply_requested, false, false);
UpdateWorkerStats(last_received, timestamp, true);
}
/* other message types are purposefully ignored */
@@ -3580,7 +3710,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
/* confirm all writes so far */
- send_feedback(last_received, false, false);
+ send_feedback(last_received, false, false, false);
if (!in_remote_transaction && !in_streamed_transaction)
{
@@ -3677,7 +3807,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
}
- send_feedback(last_received, requestReply, requestReply);
+ send_feedback(last_received, requestReply, requestReply, false);
/*
* Force reporting to ensure long idle periods don't lead to
@@ -3707,7 +3837,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
* to send a response to avoid timeouts.
*/
static void
-send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
+send_feedback(XLogRecPtr recvpos, bool force, bool requestReply, bool in_delayed_apply)
{
static StringInfo reply_message = NULL;
static TimestampTz send_time = 0;
@@ -3737,8 +3867,15 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * If the subscriber-side apply is delayed (because of time-delayed
+ * replication), do not tell the publisher that the latest received LSN
+ * has already been applied and flushed; otherwise the publisher would
+ * make a wrong assumption about logical replication progress. Instead,
+ * we just send a feedback message to avoid a publisher timeout during
+ * the delay.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && !in_delayed_apply)
flushpos = writepos = recvpos;
if (writepos < last_writepos)
@@ -3775,11 +3912,12 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
pq_sendint64(reply_message, now); /* sendTime */
pq_sendbyte(reply_message, requestReply); /* replyRequested */
- elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X",
+ elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X in-delayed: %d",
force,
LSN_FORMAT_ARGS(recvpos),
LSN_FORMAT_ARGS(writepos),
- LSN_FORMAT_ARGS(flushpos));
+ LSN_FORMAT_ARGS(flushpos),
+ in_delayed_apply);
walrcv_send(LogRepWorkerWalRcvConn,
reply_message->data, reply_message->len);
@@ -4354,11 +4492,11 @@ start_table_sync(XLogRecPtr *origin_startpos, char **myslotname)
* of system resource error and are not repeatable.
*/
static void
-start_apply(XLogRecPtr origin_startpos)
+start_apply(void)
{
PG_TRY();
{
- LogicalRepApplyLoop(origin_startpos);
+ LogicalRepApplyLoop();
}
PG_CATCH();
{
@@ -4645,7 +4783,8 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- start_apply(origin_startpos);
+ last_received = origin_startpos;
+ start_apply();
proc_exit(0);
}
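
Reviewer note: the wait loop in maybe_delay_apply() boils down to two small computations, the time still remaining before the transaction may be applied and the length of the next sleep. The sketch below restates them outside the backend, purely as an illustration. The function names remaining_delay_ms and next_sleep_ms are hypothetical, and timestamps are taken as plain milliseconds rather than TimestampTz.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: milliseconds still to wait before a transaction whose
 * commit/prepare time is finish_ms may be applied. Returns 0 once the apply
 * time has been reached.
 */
int64_t
remaining_delay_ms(int64_t now_ms, int64_t finish_ms, int64_t min_apply_delay_ms)
{
    assert(finish_ms > 0);

    int64_t due_ms = finish_ms + min_apply_delay_ms;

    return due_ms > now_ms ? due_ms - now_ms : 0;
}

/*
 * Hypothetical helper: length of the next sleep. When the status interval is
 * enabled (non-zero) and shorter than the remaining wait, sleep only for one
 * interval so that a feedback message can be sent before waiting again.
 */
int64_t
next_sleep_ms(int64_t remaining_ms, int status_interval_s)
{
    int64_t interval_ms = (int64_t) status_interval_s * 1000;

    if (status_interval_s > 0 && remaining_ms > interval_ms)
        return interval_ms;

    return remaining_ms;
}
```

With a 10 second status interval and 30 seconds left to wait, the loop would sleep 10 seconds at a time and send feedback after each sleep; with the interval set to 0 it would sleep for the whole remainder in one go.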
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..c0f69cb43b 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4546,9 +4547,13 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n"
+ " 0 AS subminapplydelay\n",
+ LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4581,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4612,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminapplydelay =
+ strtoi64(PQgetvalue(res, i, i_subminapplydelay), NULL, 10);
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4687,6 +4695,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '" INT64_FORMAT " ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e7cbd8d7ed..e2525f70ab 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int64 subminapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..8a27063bed 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min apply delay (ms)"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e1882eaea..e8b9a43a47 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..e06f35c037 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -70,6 +70,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
XLogRecPtr subskiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 subminapplydelay; /* Replication apply delay (ms) */
+
NameData subname; /* Name of the subscription */
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
@@ -120,6 +122,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 minapplydelay; /* Replication apply delay (ms) */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
bool enabled; /* Indicates if the subscription is enabled */
diff --git a/src/include/replication/worker_internal.h b/src/include/replication/worker_internal.h
index dc87a4edd1..3dc09d1a4c 100644
--- a/src/include/replication/worker_internal.h
+++ b/src/include/replication/worker_internal.h
@@ -255,7 +255,7 @@ extern void stream_stop_internal(TransactionId xid);
/* Common streaming function to apply all the spooled messages */
extern void apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn);
+ XLogRecPtr lsn, TimestampTz finish_ts);
extern void apply_dispatch(StringInfo s);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 4e5cb0d3a9..1230bcb096 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -135,10 +135,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -155,10 +155,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -167,10 +167,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -202,10 +202,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -239,19 +239,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -263,27 +263,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -298,10 +298,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -316,10 +316,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -355,10 +355,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -367,10 +367,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -380,10 +380,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -396,20 +396,57 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid value for parameter "min_apply_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_apply_delay" (0 .. 2147483647)
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+ERROR: min_apply_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot enable streaming = parallel mode for subscription with min_apply_delay
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ERROR: cannot enable min_apply_delay for subscription in streaming = parallel mode
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 5f27b7d776..53fe2a4c6b 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -279,6 +279,30 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+\dRs+
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/meson.build b/src/test/subscription/meson.build
index 3db0fdfd96..a186876eb4 100644
--- a/src/test/subscription/meson.build
+++ b/src/test/subscription/meson.build
@@ -38,6 +38,7 @@ tests += {
't/029_on_error.pl',
't/030_origin.pl',
't/031_column_list.pl',
+ 't/032_apply_delay.pl',
't/100_bugs.pl',
],
},
diff --git a/src/test/subscription/t/032_apply_delay.pl b/src/test/subscription/t/032_apply_delay.pl
new file mode 100644
index 0000000000..37388e474f
--- /dev/null
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,175 @@
+
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test replication apply delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Confirm from the server log that time-delayed replication took effect, using
+# the message the apply worker emits when it delays. Also verify that the
+# worker's remaining wait time is larger than the expected value, which checks
+# that an update of min_apply_delay was picked up.
+sub check_apply_delay_log
+{
+ my ($node_subscriber, $offset, $expected) = @_;
+
+ my $log_location = $node_subscriber->wait_for_log(qr/time-delayed replication for txid (\d+), min_apply_delay = (\d+) ms, Remaining wait time: (\d+) ms/, $offset);
+
+ cmp_ok($log_location, '>', $offset,
+ "logfile contains triggered logical replication apply delay"
+ );
+
+ # Get the delay time from the server log
+ my $contents = slurp_file($node_subscriber->logfile, $offset);
+ $contents =~
+ qr/time-delayed replication for txid (\d+), min_apply_delay = (\d+) ms, Remaining wait time: (\d+) ms/
+ or die "could not get the apply worker wait time";
+ my $logged_delay = $3;
+
+ # Is it larger than expected?
+ cmp_ok($logged_delay, '>', $expected,
+ "The apply worker wait time has expected duration"
+ );
+}
+
+# Compare inserted time on the publisher with applied time on the subscriber to
+# confirm the latter is applied after expected time. The time is automatically
+# generated and stored in the table column 'c'.
+sub check_apply_delay_time
+{
+ my ($node_publisher, $node_subscriber, $primary_key, $expected_diffs) = @_;
+
+ my $inserted_time_on_pub = $node_publisher->safe_psql('postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ my $inserted_time_on_sub = $node_subscriber->safe_psql('postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ cmp_ok($inserted_time_on_sub - $inserted_time_on_pub, '>', $expected_diffs,
+ "The tuple on the subscriber was modified later than the publisher");
+}
+
+# Create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf('postgresql.conf',
+ "log_min_messages = debug2");
+$node_subscriber->start;
+
+# Setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b varchar, c timestamptz (6) DEFAULT now())");
+
+# Setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b text, c timestamptz (6) DEFAULT now())"
+);
+
+# Setup logical replication
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+
+# The column 'c' must not be published because we want to compare the time
+# difference.
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE test_tab (a, b)");
+
+my $appname = 'tap_sub';
+
+# Create a subscription that applies the transaction after a 50 millisecond delay
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (copy_data = off, min_apply_delay = '50ms', streaming = 'on')"
+);
+
+# New row to trigger apply delay
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (1, 'foo')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (2, 'bar')");
+
+$node_publisher->wait_for_catchup($appname);
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(2|1|2), 'check if the new rows were applied to subscriber');
+
+# Note that we cannot call check_apply_delay_log() here because the delay may
+# be skipped. That happens when replication between publisher and subscriber
+# is itself slow enough that min_apply_delay has already elapsed. The log
+# output is checked later, in the case with a substantial delay.
+
+# Verify that the subscriber lags the publisher by at least 50 milliseconds
+check_apply_delay_time($node_publisher, $node_subscriber, '2', '0.05');
+
+# Setup for streaming case
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_mode = immediate');
+$node_publisher->reload;
+
+# Run a query to make sure that the reload has taken effect.
+$node_publisher->safe_psql('postgres', q{SELECT 1});
+
+# Test streamed transaction by insert
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(3, 5) s(i);");
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(5|1|5), 'check if the new rows were applied to subscriber');
+
+# Verify that the subscriber lags the publisher by at least 50 milliseconds
+check_apply_delay_time($node_publisher, $node_subscriber, '5', '0.05');
+
+# Test whether ALTER SUBSCRIPTION changes the delayed time of the apply worker
+# (1 day 5 minutes). Note that the extra 5 minutes are to account for any
+# decoding/network overhead.
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86700000)"
+);
+
+# Check log starting now for logical replication apply delay
+my $offset = -s $node_subscriber->logfile;
+
+# New row to trigger apply delay
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (0, 'foobar')");
+
+# Make sure the apply worker knows to wait for more than 1 day
+check_apply_delay_log($node_subscriber, $offset, "86400000");
+
+# Disable subscription and the worker should die immediately
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub DISABLE;"
+);
+
+# Wait until worker dies
+my $sub_query =
+ "SELECT count(1) = 0 FROM pg_stat_subscription WHERE subname = 'tap_sub' AND pid IS NOT NULL;";
+$node_subscriber->poll_query_until('postgres', $sub_query)
+ or die "Timed out while waiting for subscriber to die";
+
+# Confirm disabling the subscription by ALTER SUBSCRIPTION DISABLE did not cause
+# the delayed transaction to be applied.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab WHERE a = 0;");
+is($result, qq(0), "check the delayed transaction was not applied");
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
--
2.30.0
Here are my review comments for v19-0001.
======
Commit message
1.
The combination of parallel streaming mode and min_apply_delay is not
allowed. The subscriber in the parallel streaming mode applies each
stream on arrival without the time of commit/prepare. So, the
subscriber needs to depend on the arrival time of the stream in this
case, if we apply the time-delayed feature for such transactions. Then
there is a possibility where some unnecessary delay will be added on
the subscriber by network communication break between nodes or other
heavy work load on the publisher. On the other hand, applying the delay
at the end of transaction with parallel apply also can cause issues of
used resource bloat and locks kept in open for a long time. Thus, those
features can't work together.
~
I think the above is just cut/paste from a code comment within
subscriptioncmds.c. See review comment #5 below -- so if the code is
changed then this commit message should also change to match it.
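For reference, the rule that the commit message describes can be reduced to a single predicate. The following is only a minimal sketch: the enum and function names are hypothetical and are not taken from the patch. The three cases it accepts or rejects match what the regression and TAP tests in this patch exercise.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the subscription's streaming setting. */
typedef enum SketchStreamMode
{
    SKETCH_STREAM_OFF,
    SKETCH_STREAM_ON,
    SKETCH_STREAM_PARALLEL
} SketchStreamMode;

/*
 * Return true if the two options may be combined. Only the pairing of
 * streaming = parallel with a non-zero delay is rejected; a delay of 0
 * means "no delay", so it is compatible with every streaming mode.
 */
static bool
sketch_delay_options_compatible(SketchStreamMode streaming,
                                int min_apply_delay_ms)
{
    return !(streaming == SKETCH_STREAM_PARALLEL && min_apply_delay_ms > 0);
}
```

With this shape, (parallel, 123) is rejected, while (parallel, 0) and (on, 50) are accepted.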
======
doc/src/sgml/ref/create_subscription.sgml
2.
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. This
+ parameter allows the user to delay the application of changes by a
+ given time interval. If the value is specified without units, it is
+ taken as milliseconds. The default is zero (no delay).
+ </para>
2a.
The docs say this is an integer that defaults to milliseconds when no unit is
given. Also, the example on this same page sets it to '4h'. But I did not see
any mention of which other units are available to the user. Maybe the other
time units should be mentioned here, or maybe a link should be given
to the section "20.1.1. Parameter Names and Values".
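To make the question concrete, here is a rough sketch of the conversion a user would probably expect, assuming the parameter accepts the same time units as configuration parameters (ms, s, min, h, d). The helper name is hypothetical and this is not the patch's implementation; the unit "us" is left out because it would need sub-millisecond rounding.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not PostgreSQL code): convert a duration string such
 * as "123", "50ms", "4h" or "1 d" into milliseconds. A bare number is taken
 * as milliseconds. Returns 0 on success, -1 on a parse or range error.
 */
static int
sketch_delay_to_ms(const char *value, int *result_ms)
{
    static const struct
    {
        const char *unit;
        long long   ms;
    } units[] = {
        {"ms", 1LL}, {"s", 1000LL}, {"min", 60000LL},
        {"h", 3600000LL}, {"d", 86400000LL},
    };
    const size_t nunits = sizeof(units) / sizeof(units[0]);
    char       *end;
    long long   number;
    long long   multiplier = 1;     /* no unit given: milliseconds */
    size_t      i;

    errno = 0;
    number = strtoll(value, &end, 10);
    if (end == value || errno == ERANGE)
        return -1;                  /* not a number, e.g. "foo" */

    while (isspace((unsigned char) *end))
        end++;                      /* accept "1 d" as well as "1d" */

    if (*end != '\0')
    {
        for (i = 0; i < nunits; i++)
            if (strcmp(end, units[i].unit) == 0)
                break;
        if (i == nunits)
            return -1;              /* unknown unit */
        multiplier = units[i].ms;
    }

    /* Enforce the 0 .. 2147483647 ms range shown in the regression output. */
    if (number < 0 || number > INT_MAX / multiplier)
        return -1;

    *result_ms = (int) (number * multiplier);
    return 0;
}
```

Under these assumptions '1 d' becomes 86400000 and '4h' becomes 14400000, while 'foo', '-1' and '25d' are rejected, the last because it exceeds the integer range.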
~
2b.
Previously the word "interval" was deliberately used because this
parameter had interval support. But maybe now it should be changed so
it is not misleading.
"a given time interval" --> "a given time period" ??
======
src/backend/commands/subscriptioncmds.c
3. Forward declare
+static int defGetMinApplyDelay(DefElem *def);
If the new function is implemented as static near the top of this
source file then this forward declare would not even be necessary,
right?
~~~
4. parse_subscription_options
@@ -324,6 +328,12 @@ parse_subscription_options(ParseState *pstate,
List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ opts->min_apply_delay = defGetMinApplyDelay(defel);
+ }
Should this code fragment be calling errorConflictingDefElem so it
will report an error if the same min_apply_delay parameter is
redundantly repeated? (IIUC, this appears to be the code pattern for
other parameters nearby).
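The duplicate check asked for in comment 4 can be illustrated outside the server with a bitmask of already-seen options. This is only a sketch: sketch_collect_options and the SKETCH_OPT_* bits are hypothetical stand-ins for parse_subscription_options and the SUBOPT_* flags, and returning false stands in for the error that errorConflictingDefElem would raise.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical option bits, modelled on the SUBOPT_* flags quoted above. */
#define SKETCH_OPT_STREAMING        0x01
#define SKETCH_OPT_MIN_APPLY_DELAY  0x02

/*
 * Walk a list of option names and record each one in *specified.
 * Returns false as soon as an option is repeated or unknown, which is
 * where the real code would raise an error instead.
 */
static bool
sketch_collect_options(const char *const *names, int nnames,
                       unsigned int *specified)
{
    assert(names != NULL && specified != NULL);

    *specified = 0;
    for (int i = 0; i < nnames; i++)
    {
        unsigned int bit;

        if (strcmp(names[i], "streaming") == 0)
            bit = SKETCH_OPT_STREAMING;
        else if (strcmp(names[i], "min_apply_delay") == 0)
            bit = SKETCH_OPT_MIN_APPLY_DELAY;
        else
            return false;       /* unrecognized option */

        if ((*specified & bit) != 0)
            return false;       /* same option given twice */

        *specified |= bit;
    }
    return true;
}
```

The key point is that the "already specified" test has to run before the bit is set; otherwise the second occurrence silently overwrites the first.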
~~~
5. parse_subscription_options
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed. The subscriber in the parallel streaming mode applies each
+ * stream on arrival without the time of commit/prepare. So, the
+ * subscriber needs to depend on the arrival time of the stream in this
+ * case, if we apply the time-delayed feature for such transactions. Then
+ * there is a possibility where some unnecessary delay will be added on
+ * the subscriber by network communication break between nodes or other
+ * heavy work load on the publisher. On the other hand, applying the delay
+ * at the end of transaction with parallel apply also can cause issues of
+ * used resource bloat and locks kept in open for a long time. Thus, those
+ * features can't work together.
+ */
IMO some re-wording might be warranted here. I am not sure quite how
to do it. Perhaps like below?
SUGGESTION
The combination of parallel streaming mode and min_apply_delay is not allowed.
Here are some reasons why these features are incompatible:
a. In the parallel streaming mode the subscriber applies each stream
on arrival without knowledge of the commit/prepare time. This means we
cannot calculate the underlying network/decoding lag between publisher
and subscriber, and so always waiting for the full 'min_apply_delay'
period might include unnecessary delay.
b. If we apply the delay at the end of the transaction of the parallel
apply then that would cause issues related to resource bloat and locks
being held for a long time.
~~~
6. defGetMinApplyDelay
+
+
+/*
+ * Extract the min_apply_delay mode value from a DefElem. This is very similar
+ * to PGC_INT case of parse_and_validate_value(), because min_apply_delay
+ * accepts the same string as recovery_min_apply_delay.
+ */
+int
+defGetMinApplyDelay(DefElem *def)
6a.
"same string" -> "same parameter format" ??
~
6b.
I thought this function should be implemented as static and located at
the top of the subscriptioncmds.c source file.
======
src/backend/replication/logical/worker.c
7. maybe_delay_apply
+static void maybe_delay_apply(TransactionId xid, TimestampTz finish_ts);
Is there a reason why this is here? AFAIK the static implementation
precedes any usage so I doubt this forward declaration is required.
~~~
8. send_feedback
@@ -3775,11 +3912,12 @@ send_feedback(XLogRecPtr recvpos, bool force,
bool requestReply)
pq_sendint64(reply_message, now); /* sendTime */
pq_sendbyte(reply_message, requestReply); /* replyRequested */
- elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write
%X/%X, flush %X/%X",
+ elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write
%X/%X, flush %X/%X in-delayed: %d",
force,
LSN_FORMAT_ARGS(recvpos),
LSN_FORMAT_ARGS(writepos),
- LSN_FORMAT_ARGS(flushpos));
+ LSN_FORMAT_ARGS(flushpos),
+ in_delayed_apply);
Wondering if it is better to write this as:
"sending feedback (force %d, in_delayed_apply %d) to recv %X/%X, write
%X/%X, flush %X/%X"
======
src/test/regress/sql/subscription.sql
9. Add new test?
Should there be an additional test to check redundant parameter
setting -- eg. "... WITH (min_apply_delay=123, min_apply_delay=456)"
(this is related to the review comment #4)
~
10. Add new tests?
Should there be other tests just to verify different units (like 'd',
'h', 'min') are working OK?
======
src/test/subscription/t/032_apply_delay.pl
11.
+# Confirm the time-delayed replication has been effective from the server log
+# message where the apply worker emits for applying delay. Moreover, verifies
+# that the current worker's delayed time is sufficiently bigger than the
+# expected value, in order to check any update of the min_apply_delay.
+sub check_apply_delay_log
"the current worker's delayed time..." --> "the current worker's
remaining wait time..." ??
~~~
12.
+ # Get the delay time from the server log
+ my $contents = slurp_file($node_subscriber->logfile, $offset);
"Get the delay time...." --> "Get the remaining wait time..."
~~~
13.
+# Create a subscription that applies the trasaction after 50 milliseconds delay
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr
application_name=$appname' PUBLICATION tap_pub WITH (copy_data = off,
min_apply_delay = '50ms', streaming = 'on')"
+);
13a.
typo: "trasaction"
~
13b.
50ms seems an extremely short time – How do you even know if this is
testing anything related to the time delay? You may just be detecting
the normal lag between publisher and subscriber without time delay
having much to do with anything.
~
14.
+# Note that we cannot call check_apply_delay_log() here because there is a
+# possibility that the delay is skipped. The event happens when the WAL
+# replication between publisher and subscriber is delayed due to a mechanical
+# problem. The log output will be checked later - substantial delay-time case.
+
+# Verify that the subscriber lags the publisher by at least 50 milliseconds
+check_apply_delay_time($node_publisher, $node_subscriber, '2', '0.05');
14a.
"The event happens..." ??
Did you mean "This might happen if the WAL..."
~
14b.
The log output will be checked later - substantial delay-time case.
I think that needs re-wording to clarify.
e.g1. you have nothing called a "substantial delay-time" case.
e.g2. the word "later" confused me. Originally, I thought you meant it
is not tested yet but that you will check it "later", but now IIUC you
are just referring to the "1 day 5 minutes" test that comes below in
this same TAP file (??)
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Mon, Jan 23, 2023 at 1:36 PM Peter Smith <smithpb2250@gmail.com> wrote:
Here are my review comments for v19-0001.
...
5. parse_subscription_options
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed. The subscriber in the parallel streaming mode applies each
+ * stream on arrival without the time of commit/prepare. So, the
+ * subscriber needs to depend on the arrival time of the stream in this
+ * case, if we apply the time-delayed feature for such transactions. Then
+ * there is a possibility where some unnecessary delay will be added on
+ * the subscriber by network communication break between nodes or other
+ * heavy work load on the publisher. On the other hand, applying the delay
+ * at the end of transaction with parallel apply also can cause issues of
+ * used resource bloat and locks kept in open for a long time. Thus, those
+ * features can't work together.
+ */
IMO some re-wording might be warranted here. I am not sure quite how
to do it. Perhaps like below?
SUGGESTION
The combination of parallel streaming mode and min_apply_delay is not allowed.
Here are some reasons why these features are incompatible:
a. In the parallel streaming mode the subscriber applies each stream
on arrival without knowledge of the commit/prepare time. This means we
cannot calculate the underlying network/decoding lag between publisher
and subscriber, and so always waiting for the full 'min_apply_delay'
period might include unnecessary delay.
b. If we apply the delay at the end of the transaction of the parallel
apply then that would cause issues related to resource bloat and locks
being held for a long time.
~~~
How about something like:
The combination of parallel streaming mode and min_apply_delay is not
allowed. This is because we start applying the transaction stream as
soon as the first change arrives without knowing the transaction's
prepare/commit time. This means we cannot calculate the underlying
network/decoding lag between publisher and subscriber, and so always
waiting for the full 'min_apply_delay' period might include
unnecessary delay.
The other possibility is to apply the delay at the end of the parallel
apply transaction but that would cause issues related to resource
bloat and locks being held for a long time.
6. defGetMinApplyDelay
+
+
+/*
+ * Extract the min_apply_delay mode value from a DefElem. This is very similar
+ * to PGC_INT case of parse_and_validate_value(), because min_apply_delay
+ * accepts the same string as recovery_min_apply_delay.
+ */
+int
+defGetMinApplyDelay(DefElem *def)
6a.
"same string" -> "same parameter format" ??
~
6b.
I thought this function should be implemented as static and located at
the top of the subscriptioncmds.c source file.
I agree that this should be a static function, but I think its current
location is better because the other similar function is just above it.
======
src/test/regress/sql/subscription.sql
9. Add new test?
Should there be an additional test to check redundant parameter
setting -- eg. "... WITH (min_apply_delay=123, min_apply_delay=456)"
I don't think that will be of much help. We don't seem to have other
tests for subscription parameters.
--
With Regards,
Amit Kapila.
On Sun, Jan 22, 2023 at 6:12 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
Attached the updated patch v19.
Few comments:
=============
1.
}
+
+
+/*
Only one empty line is sufficient between different functions.
2.
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0 && opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
}
I think here we should add a comment for the translator as we are
doing in some other nearby cases.
3.
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
+ if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) &&
opts.min_apply_delay > 0) ||
+ (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) &&
sub->minapplydelay > 0))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s mode for subscription with %s",
+ "streaming = parallel", "min_apply_delay"));
+
A. When can second condition ((!IsSet(opts.specified_opts,
SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0)) in above check be
true?
B. In comments, you can say "See parse_subscription_options."
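For question A, note that the two-branch condition in the quoted hunk is logically the same as testing the delay that would be in effect once the command finishes. The sketch below states that equivalence directly; sketch_parallel_conflicts is a hypothetical name, and treating sub->minapplydelay as the value already stored for the subscription is an assumption about the patch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the check.  new_delay_given says whether the
 * command itself sets min_apply_delay; stored_delay is assumed to be the
 * value already saved for the subscription.
 */
static bool
sketch_parallel_conflicts(bool new_delay_given, int new_delay,
                          int stored_delay)
{
    int         effective_delay;

    assert(stored_delay >= 0);

    /* The delay that would be in effect after the command completes. */
    effective_delay = new_delay_given ? new_delay : stored_delay;

    return effective_delay > 0;
}
```

Read this way, the second branch fires only when the command switches streaming to parallel without mentioning min_apply_delay while a non-zero delay is already stored.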
4.
+/*
+ * When min_apply_delay parameter is set on the subscriber, we wait long enough
+ * to make sure a transaction is applied at least that interval behind the
+ * publisher.
Shouldn't this part of the comment be updated now that the patch
has stopped using interval?
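The rule the comment describes, applying a transaction no earlier than min_apply_delay after its commit/prepare time, comes down to one subtraction. The sketch below works in plain milliseconds; sketch_remaining_wait_ms is a hypothetical helper and does not reproduce how maybe_delay_apply() handles TimestampTz values.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: all arguments are in milliseconds.  finish_ms is the
 * commit (or prepare) time reported by the publisher, now_ms is the
 * subscriber's current time.  Returns how much longer to wait, or 0 when
 * the transaction is already old enough to apply.
 */
static int64_t
sketch_remaining_wait_ms(int64_t finish_ms, int64_t min_apply_delay_ms,
                         int64_t now_ms)
{
    int64_t     apply_at_ms;

    assert(min_apply_delay_ms >= 0);

    apply_at_ms = finish_ms + min_apply_delay_ms;
    return (apply_at_ms > now_ms) ? apply_at_ms - now_ms : 0;
}
```

A transaction whose changes reach the subscriber late therefore waits less, or not at all, which is why the delay is only a lower bound on the lag.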
5. How does this feature interact with the SKIP feature? Currently,
it doesn't care whether the changes of a particular xact are skipped
or not. I think that might be okay because the purpose of this
feature is anyway to make the subscriber lag behind the publisher. What do
you think? I feel we can add some comments to indicate the same.
--
With Regards,
Amit Kapila.
On Mon, Jan 23, 2023 at 9:44 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Jan 23, 2023 at 1:36 PM Peter Smith <smithpb2250@gmail.com> wrote:
Here are my review comments for v19-0001.
...
5. parse_subscription_options
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed. The subscriber in the parallel streaming mode applies each
+ * stream on arrival without the time of commit/prepare. So, the
+ * subscriber needs to depend on the arrival time of the stream in this
+ * case, if we apply the time-delayed feature for such transactions. Then
+ * there is a possibility where some unnecessary delay will be added on
+ * the subscriber by network communication break between nodes or other
+ * heavy work load on the publisher. On the other hand, applying the delay
+ * at the end of transaction with parallel apply also can cause issues of
+ * used resource bloat and locks kept in open for a long time. Thus, those
+ * features can't work together.
+ */
IMO some re-wording might be warranted here. I am not sure quite how
to do it. Perhaps like below?
SUGGESTION
The combination of parallel streaming mode and min_apply_delay is not allowed.
Here are some reasons why these features are incompatible:
a. In the parallel streaming mode the subscriber applies each stream
on arrival without knowledge of the commit/prepare time. This means we
cannot calculate the underlying network/decoding lag between publisher
and subscriber, and so always waiting for the full 'min_apply_delay'
period might include unnecessary delay.
b. If we apply the delay at the end of the transaction of the parallel
apply then that would cause issues related to resource bloat and locks
being held for a long time.
~~~
How about something like:
The combination of parallel streaming mode and min_apply_delay is not
allowed. This is because we start applying the transaction stream as
soon as the first change arrives without knowing the transaction's
prepare/commit time. This means we cannot calculate the underlying
network/decoding lag between publisher and subscriber, and so always
waiting for the full 'min_apply_delay' period might include
unnecessary delay.
The other possibility is to apply the delay at the end of the parallel
apply transaction but that would cause issues related to resource
bloat and locks being held for a long time.
+1. That's better.
6. defGetMinApplyDelay
...
6b.
I thought this function should be implemented as static and located at
the top of the subscriptioncmds.c source file.
I agree that this should be a static function but I think its current
location is a better place as other similar function is just above it.
But, why not do everything, instead of settling on a half-fix?
e.g.
1. Change the new function (defGetMinApplyDelay) to be static as it should be
2. And move defGetMinApplyDelay to the top of the file where IMO it
really belongs
3. And then remove the (now) redundant forward declaration of
defGetMinApplyDelay
4. And also move the existing function (defGetStreamingMode) to the
top of the file so that those similar functions (defGetMinApplyDelay
and defGetStreamingMode) can remain together
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Sun, Jan 22, 2023, at 9:42 AM, Takamichi Osumi (Fujitsu) wrote:
On Saturday, January 21, 2023 3:36 AM I wrote:
Kindly have a look at the patch v18.
I've conducted some refactoring for v18.
Now the latest patch should be tidier and
the comments would be clearer and more aligned as a whole.
Attached the updated patch v19.
[I haven't been following this thread for a long time...]
Good to know that you keep improving this patch. I have a few suggestions that
were easier to provide as a patch on top of your latest patch than as inline
suggestions.
There is some documentation polishing. Let me comment on some of it below.
- The length of time (ms) to delay the application of changes.
+ Total time spent delaying the application of changes, in milliseconds
I don't remember if I suggested this description for the catalog, but IMO the
suggestion reads better.
- For time-delayed logical replication (i.e. when the subscription is
- created with parameter min_apply_delay > 0), the apply worker sends a
- Standby Status Update message to the publisher with a period of
- <literal>wal_receiver_status_interval</literal>. Make sure to set
- <literal>wal_receiver_status_interval</literal> less than the
- <literal>wal_sender_timeout</literal> on the publisher, otherwise, the
- walsender will repeatedly terminate due to the timeout errors. If
- <literal>wal_receiver_status_interval</literal> is set to zero, the apply
- worker doesn't send any feedback messages during the subscriber's
- <literal>min_apply_delay</literal> period. See
- <xref linkend="sql-createsubscription"/> for details.
+ For time-delayed logical replication, the apply worker sends a feedback
+ message to the publisher every
+ <varname>wal_receiver_status_interval</varname> milliseconds. Make sure
+ to set <varname>wal_receiver_status_interval</varname> less than the
+ <varname>wal_sender_timeout</varname> on the publisher, otherwise, the
+ <literal>walsender</literal> will repeatedly terminate due to timeout
+ error. If <varname>wal_receiver_status_interval</varname> is set to
+ zero, the apply worker doesn't send any feedback messages during the
+ <literal>min_apply_delay</literal> interval.
I removed the parenthetical explanation of time-delayed logical replication.
If you are reading the documentation and do not know what it means, you should
(a) read the logical replication chapter or (b) check the glossary (maybe a new
entry should be added). I also removed the mention of the Standby Status Update
message because it is a low-level detail; let's refer to it as a feedback
message, as the other sentences do. I changed "literal" to "varname", which is
the correct tag for parameters. I replaced "period" with "interval", which was
the previous terminology. IMO we should be uniform and use one or the other.
- The subscriber replication can be instructed to lag behind the publisher
- side changes by specifying the <literal>min_apply_delay</literal>
- subscription parameter. See <xref linkend="sql-createsubscription"/> for
- details.
+ A logical replication subscription can delay the application of changes by
+ specifying the <literal>min_apply_delay</literal> subscription parameter.
+ See <xref linkend="sql-createsubscription"/> for details.
This feature refers to a specific subscription, hence, "logical replication
subscription" instead of "subscriber replication".
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
Peter S referred to this missing piece of code too.
-int
+static int
defGetMinApplyDelay(DefElem *def)
{
It seems you forgot the static keyword.
- elog(DEBUG2, "time-delayed replication for txid %u, min_apply_delay = %lld ms, Remaining wait time: %ld ms",
- xid, (long long) MySubscription->minapplydelay, diffms);
+ elog(DEBUG2, "time-delayed replication for txid %u, min_apply_delay = " INT64_FORMAT " ms, remaining wait time: %ld ms",
+ xid, MySubscription->minapplydelay, diffms);
int64 should use format modifier INT64_FORMAT.
- (long) wal_receiver_status_interval * 1000,
+ wal_receiver_status_interval * 1000L,
Cast is not required. I added a suffix to the constant.
- elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X in-delayed: %d",
+ elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X, apply delay: %s",
force,
LSN_FORMAT_ARGS(recvpos),
LSN_FORMAT_ARGS(writepos),
LSN_FORMAT_ARGS(flushpos),
- in_delayed_apply);
+ in_delayed_apply? "yes" : "no");
It is better to use a string to represent the yes/no option.
- gettext_noop("Min apply delay (ms)"));
+ gettext_noop("Min apply delay"));
I don't know if it was discussed, but we don't add units to headers. When I
thought about this parameter's representation (internal and external), I decided
to use the previous code because it provides a unit for the external
representation. I understand that using the same representation as
recovery_min_apply_delay is good, but the current code does not handle the
external representation accordingly (recovery_min_apply_delay uses the GUC
machinery to add the unit, but min_apply_delay doesn't).
# Setup for streaming case
-$node_publisher->append_conf('postgres.conf',
+$node_publisher->append_conf('postgresql.conf',
'logical_decoding_mode = immediate');
$node_publisher->reload;
Fix configuration file name.
Maybe tests should do a better job. I think check_apply_delay_time is fragile
because it does not guarantee that time is not shifted. Time-delayed
replication is a subscriber feature and to check its correctness it should
check the logs.
# Note that we cannot call check_apply_delay_log() here because there is a
# possibility that the delay is skipped. The event happens when the WAL
# replication between publisher and subscriber is delayed due to a mechanical
# problem. The log output will be checked later - substantial delay-time case.
If you cannot use the logs for it, the test should adjust min_apply_delay, no?
It does not exercise the min_apply_delay vs parallel streaming mode.
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
+ if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay > 0) ||
+ (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s mode for subscription with %s",
+ "streaming = parallel", "min_apply_delay"));
+
Is this code correct? I also didn't like this message. "cannot enable streaming
= parallel mode for subscription with min_apply_delay" is far from a good error
message. How about referring to the parallelism as "parallel streaming mode"?
--
Euler Taveira
EDB https://www.enterprisedb.com/
Attachments:
review-1.patch (text/x-patch)
From 5024325284ee3b4a4dc0a6a1cc6457ed5608cb46 Mon Sep 17 00:00:00 2001
From: Euler Taveira <euler.taveira@enterprisedb.com>
Date: Mon, 23 Jan 2023 15:52:55 -0300
Subject: [PATCH] Euler's review
---
doc/src/sgml/catalogs.sgml | 2 +-
doc/src/sgml/config.sgml | 20 ++-
doc/src/sgml/logical-replication.sgml | 7 +-
doc/src/sgml/ref/create_subscription.sgml | 13 +-
src/backend/commands/subscriptioncmds.c | 13 +-
src/backend/replication/logical/worker.c | 40 +++---
src/bin/psql/describe.c | 2 +-
src/test/regress/expected/subscription.out | 160 ++++++++++-----------
src/test/subscription/t/032_apply_delay.pl | 8 +-
9 files changed, 133 insertions(+), 132 deletions(-)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index bf3c05241c..0bdb683296 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7878,7 +7878,7 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<structfield>subminapplydelay</structfield> <type>int8</type>
</para>
<para>
- The length of time (ms) to delay the application of changes.
+ Total time spent delaying the application of changes, in milliseconds
</para></entry>
</row>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 39244bf64a..a15723d74f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4788,17 +4788,15 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
command line.
</para>
<para>
- For time-delayed logical replication (i.e. when the subscription is
- created with parameter min_apply_delay > 0), the apply worker sends a
- Standby Status Update message to the publisher with a period of
- <literal>wal_receiver_status_interval</literal>. Make sure to set
- <literal>wal_receiver_status_interval</literal> less than the
- <literal>wal_sender_timeout</literal> on the publisher, otherwise, the
- walsender will repeatedly terminate due to the timeout errors. If
- <literal>wal_receiver_status_interval</literal> is set to zero, the apply
- worker doesn't send any feedback messages during the subscriber's
- <literal>min_apply_delay</literal> period. See
- <xref linkend="sql-createsubscription"/> for details.
+ For time-delayed logical replication, the apply worker sends a feedback
+ message to the publisher every
+ <varname>wal_receiver_status_interval</varname> milliseconds. Make sure
+ to set <varname>wal_receiver_status_interval</varname> less than the
+ <varname>wal_sender_timeout</varname> on the publisher, otherwise, the
+ <literal>walsender</literal> will repeatedly terminate due to timeout
+ error. If <varname>wal_receiver_status_interval</varname> is set to
+ zero, the apply worker doesn't send any feedback messages during the
+ <literal>min_apply_delay</literal> interval.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 863af11a47..d8ae93f88d 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -248,10 +248,9 @@
</para>
<para>
- The subscriber replication can be instructed to lag behind the publisher
- side changes by specifying the <literal>min_apply_delay</literal>
- subscription parameter. See <xref linkend="sql-createsubscription"/> for
- details.
+ A logical replication subscription can delay the application of changes by
+ specifying the <literal>min_apply_delay</literal> subscription parameter.
+ See <xref linkend="sql-createsubscription"/> for details.
</para>
<sect2 id="logical-replication-subscription-slot">
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 76ee9c0b3d..97ca9f8d9e 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -378,10 +378,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
<warning>
<para>
- Delaying the replication can mean there is a much longer time between making
- a change on the publisher, and that change being committed on the subscriber.
- This can impact the performance of synchronous replication.
- See <xref linkend="guc-synchronous-commit"/>.
+ Delaying the replication can mean there is a much longer time
+ between making a change on the publisher, and that change being
+ committed on the subscriber. This can impact the performance of
+ synchronous replication. See <xref linkend="guc-synchronous-commit"/>
+ parameter.
</para>
</warning>
</listitem>
@@ -452,8 +453,8 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
<para>
- A non-zero <literal>min_apply_delay</literal> parameter is not allowed when streaming
- in parallel mode.
+ A non-zero <literal>min_apply_delay</literal> parameter is not allowed when
+ streaming in parallel mode.
</para>
<para>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 11e9e9160a..d5fa7a95a9 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -331,6 +331,9 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
strcmp(defel->defname, "min_apply_delay") == 0)
{
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
opts->min_apply_delay = defGetMinApplyDelay(defel);
}
@@ -2261,11 +2264,11 @@ defGetStreamingMode(DefElem *def)
/*
- * Extract the min_apply_delay mode value from a DefElem. This is very similar
- * to PGC_INT case of parse_and_validate_value(), because min_apply_delay
+ * Extract the min_apply_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_apply_delay
* accepts the same string as recovery_min_apply_delay.
*/
-int
+static int
defGetMinApplyDelay(DefElem *def)
{
char *value;
@@ -2294,8 +2297,8 @@ defGetMinApplyDelay(DefElem *def)
hintmsg ? errhint("%s", _(hintmsg)) : 0));
/*
- * Check lower bound. parse_int() has been already confirmed that result
- * is equal to or smaller than INT_MAX.
+ * Check lower bound. parse_int() has already been confirmed that result
+ * is less than or equal to INT_MAX.
*/
if (result < 0)
ereport(ERROR,
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index eeac69ea13..00fe29fc20 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -320,11 +320,11 @@ bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
/*
- * In order to avoid walsender timeout for time-delayed replication the worker
- * process keeps sending feedback messages during the delay period.
- * Meanwhile, the feature delays the apply before starting the
- * transaction and thus we don't write WALs for the suspended changes during
- * the wait. When the worker process sends a feedback message
+ * In order to avoid walsender timeout for time-delayed logical replication the
+ * apply worker keeps sending feedback messages during the delay interval.
+ * Meanwhile, the feature delays the apply before the start of the
+ * transaction and thus we don't write WAL records for the suspended changes during
+ * the wait. When the apply worker sends a feedback message
* during the delay, we should not make positions of the flushed and apply LSN
* overwritten by the last received latest LSN. See send_feedback() for details.
*/
@@ -1090,20 +1090,20 @@ maybe_delay_apply(TransactionId xid, TimestampTz finish_ts)
if (diffms <= 0)
break;
- elog(DEBUG2, "time-delayed replication for txid %u, min_apply_delay = %lld ms, Remaining wait time: %ld ms",
- xid, (long long) MySubscription->minapplydelay, diffms);
+ elog(DEBUG2, "time-delayed replication for txid %u, min_apply_delay = " INT64_FORMAT " ms, remaining wait time: %ld ms",
+ xid, MySubscription->minapplydelay, diffms);
/*
* Call send_feedback() to prevent the publisher from exiting by
* timeout during the delay, when wal_receiver_status_interval is
* available.
*/
- if (wal_receiver_status_interval > 0
- && diffms > wal_receiver_status_interval * 1000)
+ if (wal_receiver_status_interval > 0 &&
+ diffms > wal_receiver_status_interval * 1000L)
{
WaitLatch(MyLatch,
WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
- (long) wal_receiver_status_interval * 1000,
+ wal_receiver_status_interval * 1000L,
WAIT_EVENT_RECOVERY_APPLY_DELAY);
send_feedback(last_received, true, false, true);
}
@@ -2135,8 +2135,8 @@ ensure_last_message(FileSet *stream_fileset, TransactionId xid, int fileno,
/*
* Common spoolfile processing.
*
- * The commit/prepare time for streaming transaction is required to achieve
- * time-delayed replication.
+ * The commit/prepare time (finish_ts) for streamed transactions is required
+ * for time-delayed logical replication.
*/
void
apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
@@ -3869,12 +3869,12 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply, bool in_delayed
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
*
- * If the subscriber side apply is delayed (because of time-delayed
- * replication) then do not tell the publisher that the received latest
- * LSN is already applied and flushed, otherwise, it leads to the
- * publisher side making a wrong assumption of logical replication
- * progress. Instead, we just send a feedback message to avoid a publisher
- * timeout during the delay.
+ * If the logical replication subscription is delayed (min_apply_delay
+ * parameter) then do not inform the publisher that the received latest LSN
+ * is already applied and flushed, otherwise, the publisher will make a
+ * wrong assumption about the logical replication progress. Instead, it
+ * just sends a feedback message to avoid a replication timeout during the
+ * delay.
*/
if (!have_pending_txes && !in_delayed_apply)
flushpos = writepos = recvpos;
@@ -3913,12 +3913,12 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply, bool in_delayed
pq_sendint64(reply_message, now); /* sendTime */
pq_sendbyte(reply_message, requestReply); /* replyRequested */
- elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X in-delayed: %d",
+ elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X, apply delay: %s",
force,
LSN_FORMAT_ARGS(recvpos),
LSN_FORMAT_ARGS(writepos),
LSN_FORMAT_ARGS(flushpos),
- in_delayed_apply);
+ in_delayed_apply? "yes" : "no");
walrcv_send(LogRepWorkerWalRcvConn,
reply_message->data, reply_message->len);
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 8a27063bed..81d4607a1c 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6533,7 +6533,7 @@ describeSubscriptions(const char *pattern, bool verbose)
", suborigin AS \"%s\"\n"
", subminapplydelay AS \"%s\"\n",
gettext_noop("Origin"),
- gettext_noop("Min apply delay (ms)"));
+ gettext_noop("Min apply delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 1230bcb096..977f73fe9b 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -135,10 +135,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -155,10 +155,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -167,10 +167,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -202,10 +202,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -239,19 +239,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -263,27 +263,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -298,10 +298,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -316,10 +316,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -355,10 +355,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -367,10 +367,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -380,10 +380,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -396,18 +396,18 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -425,19 +425,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay (ms) | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
diff --git a/src/test/subscription/t/032_apply_delay.pl b/src/test/subscription/t/032_apply_delay.pl
index 37388e474f..1b6bc1ef80 100644
--- a/src/test/subscription/t/032_apply_delay.pl
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -16,7 +16,7 @@ sub check_apply_delay_log
{
my ($node_subscriber, $offset, $expected) = @_;
- my $log_location = $node_subscriber->wait_for_log(qr/time-delayed replication for txid (\d+), min_apply_delay = (\d+) ms, Remaining wait time: (\d+) ms/, $offset);
+ my $log_location = $node_subscriber->wait_for_log(qr/time-delayed replication for txid (\d+), min_apply_delay = (\d+) ms, remaining wait time: (\d+) ms/, $offset);
cmp_ok($log_location, '>', $offset,
"logfile contains triggered logical replication apply delay"
@@ -25,7 +25,7 @@ sub check_apply_delay_log
# Get the delay time from the server log
my $contents = slurp_file($node_subscriber->logfile, $offset);
$contents =~
- qr/time-delayed replication for txid (\d+), min_apply_delay = (\d+) ms, Remaining wait time: (\d+) ms/
+ qr/time-delayed replication for txid (\d+), min_apply_delay = (\d+) ms, remaining wait time: (\d+) ms/
or die "could not get the apply worker wait time";
my $logged_delay = $3;
@@ -87,7 +87,7 @@ $node_publisher->safe_psql('postgres',
my $appname = 'tap_sub';
-# Create a subscription that applies the trasaction after 50 milliseconds delay
+# Create a subscription that applies the transaction after 50 milliseconds delay
$node_subscriber->safe_psql('postgres',
"CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (copy_data = off, min_apply_delay = '50ms', streaming = 'on')"
);
@@ -114,7 +114,7 @@ is($result, qq(2|1|2), 'check if the new rows were applied to subscriber');
check_apply_delay_time($node_publisher, $node_subscriber, '2', '0.05');
# Setup for streaming case
-$node_publisher->append_conf('postgres.conf',
+$node_publisher->append_conf('postgresql.conf',
'logical_decoding_mode = immediate');
$node_publisher->reload;
--
2.30.2
At Mon, 23 Jan 2023 17:36:13 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
On Sun, Jan 22, 2023 at 6:12 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
Attached the updated patch v19.
Few comments:
2.
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0 && opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
}
I think here we should add a comment for the translator as we are
doing in some other nearby cases.
IMHO "foo > bar" is not an "option". We say "foo and bar are
mutually exclusive options", but I don't think we say "foo = x and bar = y
are.. options". I commented earlier that "this should be more like
human-speaking", and Euler seems to have the same feeling about another
error message.
Concretely I would spell this as "min_apply_delay cannot be enabled
when parallel streaming mode is enabled" or something. And the
opposite-direction message nearby would be "parallel streaming mode
cannot be enabled when min_apply_delay is enabled."
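
As a rough illustration of the two wordings proposed above, here is a
standalone C sketch. It does not use ereport() or the real option parsing;
StreamMode, ChangedOption and delay_streaming_conflict() are hypothetical
names made up for this example, and only the two message strings come from
the mail. It assumes the caller knows which of the two options the command
is trying to turn on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the subscription's streaming setting. */
typedef enum { STREAM_OFF, STREAM_ON, STREAM_PARALLEL } StreamMode;

/* Hypothetical: which option the command is trying to change. */
typedef enum { SETTING_MIN_APPLY_DELAY, SETTING_STREAMING } ChangedOption;

/*
 * Return NULL when the combination is allowed, otherwise a message
 * phrased from the point of view of the option being changed.
 */
const char *
delay_streaming_conflict(ChangedOption changed,
                         int32_t min_apply_delay_ms,
                         StreamMode streaming)
{
    /* A negative delay is assumed to have been rejected earlier. */
    assert(min_apply_delay_ms >= 0);

    /* Only a non-zero delay combined with parallel streaming conflicts. */
    if (min_apply_delay_ms == 0 || streaming != STREAM_PARALLEL)
        return NULL;

    if (changed == SETTING_MIN_APPLY_DELAY)
        return "min_apply_delay cannot be enabled when parallel streaming mode is enabled";

    return "parallel streaming mode cannot be enabled when min_apply_delay is enabled";
}
```

In the patch itself the string would presumably still go through errmsg()
so that it can be translated; the sketch only shows how each direction gets
its own sentence instead of a shared "%s and %s" template.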
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attached the updated patch v19.
+ maybe_delay_apply(TransactionId xid, TimestampTz finish_ts)
This spelling looks strange to me. How about maybe_apply_delay()?
send_feedback():
+ * If the subscriber side apply is delayed (because of time-delayed
+ * replication) then do not tell the publisher that the received latest
+ * LSN is already applied and flushed, otherwise, it leads to the
+ * publisher side making a wrong assumption of logical replication
+ * progress. Instead, we just send a feedback message to avoid a publisher
+ * timeout during the delay.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && !in_delayed_apply)
flushpos = writepos = recvpos;
Honestly I don't like this wart. The reason for this is the function
assumes recvpos = applypos but we actually call it while holding
unapplied changes, that is, applypos < recvpos.
Couldn't we maintain an additional static variable "last_applied"
along with last_received? In this case the condition cited above
would be as follows and in_delayed_apply will become unnecessary.
+ if (!have_pending_txes && last_received == last_applied)
The function is static and is always called with last_received, a
variable that has the same scope as the function, as its first
parameter. Thus we can remove the first parameter and let the
function look at both variables directly instead.
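To make the comparison concrete, here is a minimal standalone sketch (not code from the patch) of the two conditions. XLogRecPtr is reduced to a plain 64-bit integer, and the helper names can_report_flushed_with_flag and can_report_flushed_with_positions are hypothetical; only last_received, last_applied, have_pending_txes and in_delayed_apply come from the discussion.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's XLogRecPtr (a 64-bit WAL position). */
typedef uint64_t XLogRecPtr;

/* Worker-local state; last_applied is the proposed additional variable. */
static XLogRecPtr last_received = 0;
static XLogRecPtr last_applied = 0;

/* Variant from the patch: an explicit flag says "we are inside a delay". */
static bool
can_report_flushed_with_flag(bool have_pending_txes, bool in_delayed_apply)
{
    return !have_pending_txes && !in_delayed_apply;
}

/* Proposed variant: compare the two positions, no extra flag needed. */
static bool
can_report_flushed_with_positions(bool have_pending_txes)
{
    return !have_pending_txes && last_received == last_applied;
}
```

In this simplified model both variants refuse to report progress while received changes are still unapplied. Whether last_applied can be kept accurate in every code path is a separate question that the sketch does not address.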
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Sorry, I forgot to write one comment.
At Tue, 24 Jan 2023 11:45:35 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
+ /* Should we delay the current transaction? */
+ if (finish_ts)
+ maybe_delay_apply(xid, finish_ts);
+
if (!am_parallel_apply_worker())
maybe_start_skipping_changes(lsn);
It may not give actual advantages, but isn't it better that delay
happens after skipping?
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Tue, Jan 24, 2023 at 3:46 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Mon, Jan 23, 2023 at 9:44 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
6. defGetMinApplyDelay
...
6b.
I thought this function should be implemented as static and located at
the top of the subscriptioncmds.c source file.
I agree that this should be a static function but I think its current
location is a better place as other similar function is just above it.
But, why not do everything, instead of settling on a half-fix?
e.g.
1. Change the new function (defGetMinApplyDelay) to be static as it should be
2. And move defGetMinApplyDelay to the top of the file where IMO it
really belongs
3. And then remove the (now) redundant forward declaration of
defGetMinApplyDelay
4. And also move the existing function (defGetStreamingMode) to the
top of the file so that those similar functions (defGetMinApplyDelay
and defGetStreamingMode) can remain together
There are various other static functions (merge_publications,
check_duplicates_in_publist, etc.) which would then also need a similar
change. BTW, I don't think we have a policy to always define static
functions before their usage. So, I don't see the need to do anything
in this matter.
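For illustration, a minimal sketch of why the forward declaration matters: in C, a static function may be defined after its caller as long as a declaration precedes the call. The function names below are hypothetical stand-ins, not the ones in subscriptioncmds.c.

```c
#include <stdlib.h>

/* Forward declaration: lets callers above the definition compile. */
static int parse_delay_ms(const char *text);

/* The caller appears before the definition of parse_delay_ms(). */
static int
delay_or_default(const char *text)
{
    return text ? parse_delay_ms(text) : 0;
}

/* Definition placed later in the file, next to similar helpers. */
static int
parse_delay_ms(const char *text)
{
    return (int) strtol(text, NULL, 10);
}
```

Moving the definition above delay_or_default() would make the forward declaration redundant, which is the trade-off being debated here.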
--
With Regards,
Amit Kapila.
On Tue, Jan 24, 2023 at 5:02 AM Euler Taveira <euler@eulerto.com> wrote:
On Sun, Jan 22, 2023, at 9:42 AM, Takamichi Osumi (Fujitsu) wrote:
Attached the updated patch v19.
[I haven't been following this thread for a long time...]
Good to know that you keep improving this patch. I have a few suggestions that
were easier to provide as a patch on top of your latest patch than as
inline suggestions.
Euler, thanks for your comments. We have an existing problem related
to shutdown which impacts this patch. The problem is that during
shutdown on the publisher, we wait for all the WAL to be sent and
flushed on the subscriber. Now, if the user has configured a long value
for min_apply_delay on the subscriber then the shutdown won't be
successful. This can happen even today if the subscriber waits for
some lock during the apply. This is not so much a problem with
physical replication because there we have a separate process to first
flush the WAL. This problem has been discussed in a separate thread as
well. See [1]. It is important to reach a conclusion even if we just
want to document it. So, your thoughts on that other thread can help
us to make it move forward.
[1]: /messages/by-id/TYAPR01MB586668E50FC2447AD7F92491F5E89@TYAPR01MB5866.jpnprd01.prod.outlook.com
--
With Regards,
Amit Kapila.
On Tue, Jan 24, 2023 at 6:17 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Mon, 23 Jan 2023 17:36:13 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
On Sun, Jan 22, 2023 at 6:12 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
Attached the updated patch v19.
Few comments:
2.
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0 && opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
}
I think here we should add a comment for the translator as we are
doing in some other nearby cases.
IMHO "foo > bar" is not an "option". I think we say "foo and bar are
mutually exclusive options", but I don't think we say "foo = x and bar = y
are.. options". I wrote a comment saying "this should be more like
human-speaking", and Euler seems to have the same feeling about another
error message.
Concretely I would spell this as "min_apply_delay cannot be enabled
when parallel streaming mode is enabled" or something.
We can change it but the current message seems to be in line with some
nearby messages like "slot_name = NONE and enabled = true are mutually
exclusive options". So, isn't it better to keep this one in sync
with existing messages?
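As a rough sketch of the pattern in question, assuming nothing beyond the message text quoted above: the template stays fixed and only the two option strings vary, so the new check and the existing slot_name/enabled check can share one translatable format. The helper format_conflict is hypothetical and uses plain snprintf rather than ereport/errmsg.

```c
#include <stdio.h>

/*
 * Build a conflict message from a shared template. The two arguments are
 * literal "option = value" style strings and are not translated.
 */
static int
format_conflict(char *buf, size_t len, const char *opt1, const char *opt2)
{
    return snprintf(buf, len,
                    "%s and %s are mutually exclusive options",
                    opt1, opt2);
}
```

Calling it with "min_apply_delay > 0" and "streaming = parallel" produces the wording proposed in the patch, while "slot_name = NONE" and "enabled = true" reproduces the existing message.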
--
With Regards,
Amit Kapila.
On Tue, Jan 24, 2023 at 8:35 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
Sorry, I forgot to write one comment.
At Tue, 24 Jan 2023 11:45:35 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
+ /* Should we delay the current transaction? */
+ if (finish_ts)
+ maybe_delay_apply(xid, finish_ts);
+
if (!am_parallel_apply_worker())
maybe_start_skipping_changes(lsn);
It may not give actual advantages, but isn't it better that delay
happens after skipping?
If we go with the order you are suggesting then the LOGs will appear
as follows when we are skipping the transaction:
"logical replication starts skipping transaction at LSN ..."
"time-delayed replication for txid %u, min_apply_delay = %lld ms,
Remaining wait time: ..."
Personally, I would prefer the above LOGs to be in reverse order as it
doesn't make much sense to me to first say that we are skipping
changes and then say the transaction is delayed. What do you think?
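For reference, the "Remaining wait time" in the second LOG line could be derived roughly as in the sketch below. This is a simplified model under stated assumptions, not the patch's code: TimestampTz is treated as a plain count of microseconds, and remaining_delay_ms is a hypothetical helper name.

```c
#include <stdint.h>

/* Stand-in for PostgreSQL's TimestampTz: microseconds since an epoch. */
typedef int64_t TimestampTz;

/*
 * Return how many milliseconds remain before a transaction whose
 * commit/prepare timestamp is finish_ts may be applied, or 0 if the
 * requested delay has already elapsed (for example because decoding
 * and network transfer already took longer than min_apply_delay).
 */
static int64_t
remaining_delay_ms(TimestampTz finish_ts, TimestampTz now,
                   int64_t min_apply_delay_ms)
{
    TimestampTz delay_until = finish_ts + min_apply_delay_ms * 1000;
    int64_t     diff_us = delay_until - now;

    return diff_us > 0 ? diff_us / 1000 : 0;
}
```

With the delay performed first, this value would be logged before any "starts skipping transaction" message for the same transaction.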
--
With Regards,
Amit Kapila.
On Tue, Jan 24, 2023 at 8:15 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
Attached the updated patch v19.
+ maybe_delay_apply(TransactionId xid, TimestampTz finish_ts)
This spelling looks strange to me. How about maybe_apply_delay()?
+1.
send_feedback():
+ * If the subscriber side apply is delayed (because of time-delayed
+ * replication) then do not tell the publisher that the received latest
+ * LSN is already applied and flushed, otherwise, it leads to the
+ * publisher side making a wrong assumption of logical replication
+ * progress. Instead, we just send a feedback message to avoid a publisher
+ * timeout during the delay.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && !in_delayed_apply)
flushpos = writepos = recvpos;
Honestly I don't like this wart. The reason for this is the function
assumes recvpos = applypos but we actually call it while holding
unapplied changes, that is, applypos < recvpos.
Couldn't we maintain an additional static variable "last_applied"
along with last_received?
It won't be easy to maintain the meaning of last_applied because there
are cases where we don't apply the change directly. For example, in
case of streaming xacts, we will just keep writing it to the file.
Now, say, for some reason we have to send the feedback; then it
will not allow you to update the latest write locations. This would
then become different than what we are doing without the patch.
Another point to think about is that we also need to keep the variable
updated for keep-alive ('k') messages even though we don't apply
anything in that case. Still, other cases to consider are where we
have a mix of streaming and non-streaming transactions.
In this case the condition cited above
would be as follows and in_delayed_apply will become unnecessary.
+ if (!have_pending_txes && last_received == last_applied)
The function is static and is always called with last_received, a
variable that has the same scope as the function, as its first
parameter. Thus we can remove the first parameter and let the
function look at both variables directly instead.
I think this is true without this patch, so why has that not been
followed in the first place? One comment I see in this regard is as
below:
/* It's legal to not pass a recvpos */
if (recvpos < last_recvpos)
recvpos = last_recvpos;
--
With Regards,
Amit Kapila.
On Tue, Jan 24, 2023 at 5:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jan 24, 2023 at 8:15 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
Attached the updated patch v19.
+ maybe_delay_apply(TransactionId xid, TimestampTz finish_ts)
This spelling looks strange to me. How about maybe_apply_delay()?
+1.
It depends on how you read it. I read it like this:
maybe_delay_apply === means "maybe delay [the] apply"
(which is exactly what the function does)
versus
maybe_apply_delay === means "maybe [the] apply [needs a] delay"
(which is also correct, but it seemed a more awkward way to say it IMO)
~
Perhaps it's better to rename it more fully like
*maybe_delay_the_apply* to remove any ambiguous interpretations.
------
Kind Regards,
Peter Smith.
Fujitsu Australia
At Tue, 24 Jan 2023 11:28:58 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
On Tue, Jan 24, 2023 at 6:17 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
IMHO "foo > bar" is not an "option". I think we say "foo and bar are
mutually exclusive options", but I don't think we say "foo = x and bar = y
are.. options". I wrote a comment saying "this should be more like
human-speaking", and Euler seems to have the same feeling about another
error message.
Concretely I would spell this as "min_apply_delay cannot be enabled
when parallel streaming mode is enabled" or something.
We can change it but the current message seems to be in line with some
nearby messages like "slot_name = NONE and enabled = true are mutually
exclusive options". So, isn't it better to keep this one in sync
with existing messages?
Ooo. subscriptioncmds.c is full of such messages. Okay I agree that it
is better to leave it as is..
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Tue, Jan 24, 2023 at 12:44 PM Peter Smith <smithpb2250@gmail.com> wrote:
On Tue, Jan 24, 2023 at 5:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jan 24, 2023 at 8:15 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
Attached the updated patch v19.
+ maybe_delay_apply(TransactionId xid, TimestampTz finish_ts)
This spelling looks strange to me. How about maybe_apply_delay()?
+1.
It depends on how you read it. I read it like this:
maybe_delay_apply === means "maybe delay [the] apply"
(which is exactly what the function does)
versus
maybe_apply_delay === means "maybe [the] apply [needs a] delay"
(which is also correct, but it seemed a more awkward way to say it IMO)
This matches more with GUC and all other usages of variables in the
patch. So, I still prefer the second one.
--
With Regards,
Amit Kapila.
Dear Amit, Horiguchi-san,
send_feedback():
+ * If the subscriber side apply is delayed (because of time-delayed
+ * replication) then do not tell the publisher that the received latest
+ * LSN is already applied and flushed, otherwise, it leads to the
+ * publisher side making a wrong assumption of logical replication
+ * progress. Instead, we just send a feedback message to avoid a publisher
+ * timeout during the delay.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && !in_delayed_apply)
flushpos = writepos = recvpos;
Honestly I don't like this wart. The reason for this is the function
assumes recvpos = applypos but we actually call it while holding
unapplied changes, that is, applypos < recvpos.
Couldn't we maintain an additional static variable "last_applied"
along with last_received?
It won't be easy to maintain the meaning of last_applied because there
are cases where we don't apply the change directly. For example, in
case of streaming xacts, we will just keep writing it to the file.
Now, say, for some reason we have to send the feedback; then it
will not allow you to update the latest write locations. This would
then become different than what we are doing without the patch.
Another point to think about is that we also need to keep the variable
updated for keep-alive ('k') messages even though we don't apply
anything in that case. Still, other cases to consider are where we
have a mix of streaming and non-streaming transactions.
I have tried to implement that, but it might be difficult because of a corner
case related to the initial data sync.
First of all, I made last_applied get updated when
* transactions are committed, prepared, or aborted
* the apply worker receives a keepalive message.
I thought that during the initial data sync we must not update last_applied
in response to keepalive messages, so the following lines were added just after
updating last_received.
```
+ if (last_applied < end_lsn && AllTablesyncsReady())
+ last_applied = end_lsn;
```
However, if data is being synchronized and workers receive non-committable WAL,
this condition cannot be satisfied. 009_matviews.pl tests such a case, and I
got a failure there. In this test a MATERIALIZED VIEW is created on the publisher and then
the WAL is replicated to the subscriber, but the transaction is not committed because
logical replication does not support the statement.
If we change the condition, the system may become inconsistent because the
worker replies that all remote WAL is applied even if tablesync workers are
still synchronizing data.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Hi,
On Tuesday, January 24, 2023 5:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jan 24, 2023 at 12:44 PM Peter Smith <smithpb2250@gmail.com>
wrote:
On Tue, Jan 24, 2023 at 5:58 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:
On Tue, Jan 24, 2023 at 8:15 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
Attached the updated patch v19.
+ maybe_delay_apply(TransactionId xid, TimestampTz finish_ts)
This spelling looks strange to me. How about maybe_apply_delay()?
+1.
It depends on how you read it. I read it like this:
maybe_delay_apply === means "maybe delay [the] apply"
(which is exactly what the function does)
versus
maybe_apply_delay === means "maybe [the] apply [needs a] delay"
(which is also correct, but it seemed a more awkward way to say it
IMO)
This matches more with GUC and all other usages of variables in the patch. So,
I still prefer the second one.
Okay. Fixed.
Attached the patch v20 that has incorporated all comments so far.
Kindly have a look at the attached patch.
Best Regards,
Takamichi Osumi
Attachments:
v20-0001-Time-delayed-logical-replication-subscriber.patch (application/octet-stream)
From 4386bd87ec3f78bc7efca8e8ffdf1a2b665e1a7b Mon Sep 17 00:00:00 2001
From: Takamichi Osumi <osumi.takamichi@fujitsu.com>
Date: Tue, 24 Jan 2023 11:53:16 +0000
Subject: [PATCH v20] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_apply_delay'.
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction commit for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs only on WAL records for transaction begins. The main
reason is to avoid keeping a transaction open for a long time. Regular
and prepared transactions are covered. Streamed transactions are also
covered.
The combination of parallel streaming mode and min_apply_delay is not
allowed. This is because we start applying the transaction stream as
soon as the first change arrives without knowing the transaction's
prepare/commit time. This means we cannot calculate the underlying
network/decoding lag between publisher and subscriber, and so always
waiting for the full 'min_apply_delay' period might include
unnecessary delay.
The other possibility is to apply the delay at the end of the parallel
apply transaction but that would cause issues related to resource
bloat and locks being held for a long time.
Note that this feature doesn't interact with skip transaction feature.
The skip transaction feature applies to one transaction with a specific LSN.
So, even if the skipped transaction and non-skipped transaction come
consecutively in a very short time, regardless of the order of which comes
first, the time-delayed feature gets balanced by delayed application
for other transactions before and after the skipped transaction.
Author: Euler Taveira, Takamichi Osumi, Kuroda Hayato
Reviewed-by: Amit Kapila, Peter Smith, Vignesh C, Shveta Malik,
Kyotaro Horiguchi, Dilip Kumar, Melih Mutlu
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/config.sgml | 11 ++
doc/src/sgml/glossary.sgml | 14 ++
doc/src/sgml/logical-replication.sgml | 6 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 60 +++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 117 ++++++++++-
.../replication/logical/applyparallelworker.c | 3 +-
src/backend/replication/logical/worker.c | 165 ++++++++++++++--
src/bin/pg_dump/pg_dump.c | 15 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/replication/worker_internal.h | 2 +-
src/test/regress/expected/subscription.out | 181 ++++++++++-------
src/test/regress/sql/subscription.sql | 24 +++
src/test/subscription/meson.build | 1 +
src/test/subscription/t/032_apply_delay.pl | 187 ++++++++++++++++++
21 files changed, 721 insertions(+), 104 deletions(-)
create mode 100644 src/test/subscription/t/032_apply_delay.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..2b62beed59 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7873,6 +7873,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int8</type>
+ </para>
+ <para>
+ Total time spent delaying the application of changes, in milliseconds.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index f985afc009..ee91a1fc02 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4787,6 +4787,17 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
the <filename>postgresql.conf</filename> file or on the server
command line.
</para>
+ <para>
+ For time-delayed logical replication, the apply worker sends a feedback
+ message to the publisher every
+ <varname>wal_receiver_status_interval</varname> milliseconds. Make sure
+ to set <varname>wal_receiver_status_interval</varname> less than the
+ <varname>wal_sender_timeout</varname> on the publisher, otherwise, the
+ <literal>walsender</literal> will repeatedly terminate due to timeout
+ error. If <varname>wal_receiver_status_interval</varname> is set to
+ zero, the apply worker doesn't send any feedback messages during the
+ <literal>min_apply_delay</literal> period.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 7c01a541fe..6ed6fa5853 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1729,6 +1729,20 @@
</glossdef>
</glossentry>
+ <glossentry id="glossary-time-delayed-replication">
+ <glossterm>Time-delayed replication</glossterm>
+ <glossdef>
+ <para>
+ Replication setup that applies a time-delayed copy of the data.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="guc-recovery-min-apply-delay"/> for physical replication
+ and <xref linkend="sql-createsubscription"/> for logical replication.
+ </para>
+ </glossdef>
+ </glossentry>
+
<glossentry id="glossary-toast">
<glossterm>TOAST</glossterm>
<glossdef>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 1bd5660c87..d8ae93f88d 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,12 @@
target table.
</para>
+ <para>
+ A logical replication subscription can delay the application of changes by
+ specifying the <literal>min_apply_delay</literal> subscription parameter.
+ See <xref linkend="sql-createsubscription"/> for details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index ad93553a1d..1c6e9dd2d1 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index eba72c6af6..c4a615ee5c 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,48 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. This
+ parameter allows the user to delay the application of changes by a
+ given time period. If the value is specified without units, it is
+ taken as milliseconds. The default is zero (no delay). See
+ <xref linkend="config-setting-names-values"/> for details on the
+ available valid time units.
+ </para>
+ <para>
+ Any delay occurs only on WAL records for transaction begins after all
+ initial table synchronization has finished. The delay is calculated
+ as the difference between the WAL timestamp as written on the
+ publisher and the current time on the subscriber. Any overhead of
+ time spent in logical decoding and in transferring the transaction
+ may reduce the actual wait time. It is also possible that the overhead
+ already exceeds the requested <literal>min_apply_delay</literal> value,
+ in which case no additional wait is necessary. If the system clocks
+ on publisher and subscriber are not synchronized, this may lead to
+ apply changes earlier than expected, but this is not a major issue
+ because this parameter is typically much larger than the time
+ deviations between servers. Note that if this parameter is set to a
+ long delay, the replication will stop if the replication slot falls
+ behind the current LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication can mean there is a much longer time
+ between making a change on the publisher, and that change being
+ committed on the subscriber. This can impact the performance of
+ synchronous replication. See <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -413,6 +454,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_apply_delay</literal> parameter is not allowed when
+ streaming in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
@@ -472,6 +518,18 @@ CREATE SUBSCRIPTION mysub
PUBLICATION insert_only
WITH (enabled = false);
</programlisting></para>
+
+ <para>
+ Create a subscription to a remote server that replicates tables in
+ the <literal>mypub</literal> publication and starts replicating immediately
+ on commit. Pre-existing data is not copied. The application of changes is
+ delayed by 4 hours.
+<programlisting>
+CREATE SUBSCRIPTION mysub
+ CONNECTION 'host=192.0.2.4 port=5432 user=foo dbname=foodb'
+ PUBLICATION mypub
+ WITH (copy_data = false, min_apply_delay = '4h');
+</programlisting></para>
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..c767cc1c3a 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -64,6 +64,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->oid = subid;
sub->dbid = subform->subdbid;
sub->skiplsn = subform->subskiplsn;
+ sub->minapplydelay = subform->subminapplydelay;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
sub->enabled = subform->subenabled;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 8608e3fa5b..317c2010cb 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1299,9 +1299,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 464db6d247..b4d075c931 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -100,7 +102,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int defGetMinApplyDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ opts->min_apply_delay = defGetMinApplyDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +417,31 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed. This is because we start applying the transaction stream as
+ * soon as the first change arrives without knowing the transaction's
+ * prepare/commit time. This means we cannot calculate the underlying
+ * network/decoding lag between publisher and subscriber, and so always
+ * waiting for the full 'min_apply_delay' period might include unnecessary
+ * delay.
+ *
+ * The other possibility is to apply the delay at the end of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0 && opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+
+ /*
+ * - translator: the first %s is a string of the form "option > value"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
}
/*
@@ -560,7 +598,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -625,6 +664,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int64GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1054,7 +1094,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1138,19 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options for details of the reason.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
+ if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay > 0) ||
+ (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable parallel streaming mode for subscription with %s",
+ "min_apply_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1164,25 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed.
+ */
+ if (opts.min_apply_delay > 0)
+ if ((IsSet(opts.specified_opts, SUBOPT_STREAMING) && opts.streaming == LOGICALREP_STREAM_PARALLEL) ||
+ (!IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s for subscription in %s mode",
+ "min_apply_delay", "streaming = parallel"));
+
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int64GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2195,3 +2267,42 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+/*
+ * Extract the min_apply_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_apply_delay
+ * accepts the same parameter format as recovery_min_apply_delay.
+ */
+static int
+defGetMinApplyDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse the given string as a parameter whose unit is milliseconds.
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_apply_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check the lower bound. parse_int() has already confirmed that the
+ * result is less than or equal to INT_MAX.
+ */
+ if (result < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_apply_delay",
+ 0, INT_MAX)));
+
+ return result;
+}
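
As a reading aid for the two AlterSubscription() checks above, here is a minimal standalone sketch of the rule they seem to enforce: an ALTER is rejected when the resulting combination of streaming mode and delay would be "parallel" together with a nonzero min_apply_delay, where an option that is not specified falls back to the value already stored for the subscription. The types and names below (StreamMode, AlterOpts, StoredSub, alter_conflicts) are hypothetical stand-ins for illustration, not code from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the LOGICALREP_STREAM_* values. */
typedef enum { STREAM_OFF, STREAM_ON, STREAM_PARALLEL } StreamMode;

/* Hypothetical model of what one ALTER SUBSCRIPTION ... SET (...) specified. */
typedef struct
{
    bool        streaming_specified;
    StreamMode  streaming;
    bool        delay_specified;
    int64_t     min_apply_delay_ms;
} AlterOpts;

/* Hypothetical model of the values currently stored for the subscription. */
typedef struct
{
    StreamMode  stream;
    int64_t     min_apply_delay_ms;
} StoredSub;

/*
 * Return true if the ALTER would leave the subscription with parallel
 * streaming and a nonzero delay at the same time. As in the patch, nothing
 * is checked unless at least one of the two options was specified.
 */
bool
alter_conflicts(const AlterOpts *opts, const StoredSub *sub)
{
    StreamMode  eff_stream;
    int64_t     eff_delay;

    if (!opts->streaming_specified && !opts->delay_specified)
        return false;

    /* An unspecified option keeps its stored value. */
    eff_stream = opts->streaming_specified ? opts->streaming : sub->stream;
    eff_delay = opts->delay_specified ? opts->min_apply_delay_ms
                                      : sub->min_apply_delay_ms;

    return eff_stream == STREAM_PARALLEL && eff_delay > 0;
}
```

Under this model, setting min_apply_delay = 0 on a subscription that uses parallel streaming is accepted, which matches the "opts.min_apply_delay > 0" guard in the hunk above.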
diff --git a/src/backend/replication/logical/applyparallelworker.c b/src/backend/replication/logical/applyparallelworker.c
index 3579e704fe..7302bce7a0 100644
--- a/src/backend/replication/logical/applyparallelworker.c
+++ b/src/backend/replication/logical/applyparallelworker.c
@@ -704,7 +704,8 @@ pa_process_spooled_messages_if_required(void)
{
apply_spooled_messages(&MyParallelShared->fileset,
MyParallelShared->xid,
- InvalidXLogRecPtr);
+ InvalidXLogRecPtr,
+ 0);
pa_set_fileset_state(MyParallelShared, FS_EMPTY);
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index cfb2ab6248..eb785c621d 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -319,6 +319,17 @@ static List *on_commit_wakeup_workers_subids = NIL;
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
+/*
+ * To avoid a walsender timeout during time-delayed logical replication, the
+ * apply worker keeps sending feedback messages during the delay period.
+ * Because the delay is applied before the transaction starts, no WAL records
+ * are written for the suspended changes during the wait. A feedback message
+ * sent during the delay must therefore not overwrite the flushed and applied
+ * LSN positions with the last received LSN. See send_feedback() for
+ * details.
+ */
+static XLogRecPtr last_received = InvalidXLogRecPtr;
+
/* fields valid only when processing streamed transaction */
static bool in_streamed_transaction = false;
@@ -389,7 +400,8 @@ static void stream_write_change(char action, StringInfo s);
static void stream_open_and_write_change(TransactionId xid, char action, StringInfo s);
static void stream_close_file(void);
-static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply);
+static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply,
+ bool in_delayed_apply);
static void DisableSubscriptionAndExit(void);
@@ -999,6 +1011,108 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When min_apply_delay parameter is set on the subscriber, we wait long enough
+ * to make sure a transaction is applied at least that period behind the
+ * publisher.
+ *
+ * While physical replication applies the delay at commit time, this
+ * feature applies the delay before the transaction starts. This is mainly
+ * because keeping a transaction that has performed write operations open
+ * for a long time causes problems such as bloat and long-held
+ * locks.
+ *
+ * The min_apply_delay parameter will take effect only after all tables are in
+ * READY state.
+ *
+ * xid is the transaction id where we apply the delay.
+ *
+ * finish_ts is the commit/prepare time of both regular (non-streamed) and
+ * streamed transactions. Unlike the regular (non-streamed) cases, the delay
+ * is applied in a STREAM COMMIT/STREAM PREPARE message for streamed
+ * transactions. The STREAM START message does not contain a commit/prepare
+ * time (it will be available when the in-progress transaction finishes).
+ * Hence, it's not appropriate to apply a delay at the STREAM START time.
+ */
+static void
+maybe_apply_delay(TransactionId xid, TimestampTz finish_ts)
+{
+ Assert(finish_ts > 0);
+
+ /* Nothing to do if no delay set */
+ if (!MySubscription->minapplydelay)
+ return;
+
+ /*
+ * The min_apply_delay parameter is ignored until all tablesync workers
+ * have reached READY state. This is because if we allowed the delay
+ * during the catchup phase, then once we reached the limit of tablesync
+ * workers it would impose a delay for each subsequent worker. That would
+ * cause initial table synchronization completion to take a long time.
+ */
+ if (!AllTablesyncsReady())
+ return;
+
+ /* Apply the delay by the latch mechanism */
+ while (true)
+ {
+ long diffms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* This might change wal_receiver_status_interval */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /*
+ * Before calculating the time duration, reload the catalog if needed.
+ */
+ if (!in_remote_transaction && !in_streamed_transaction)
+ {
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+ }
+
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(),
+ TimestampTzPlusMilliseconds(finish_ts, MySubscription->minapplydelay));
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "time-delayed replication for txid %u, min_apply_delay = " INT64_FORMAT " ms, remaining wait time: %ld ms",
+ xid, MySubscription->minapplydelay, diffms);
+
+ /*
+ * Call send_feedback() periodically so that the publisher does not
+ * time out during the delay. This is done only when
+ * wal_receiver_status_interval is enabled (greater than zero).
+ */
+ if (wal_receiver_status_interval > 0 &&
+ diffms > wal_receiver_status_interval * 1000L)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ wal_receiver_status_interval * 1000L,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(last_received, true, false, true);
+ }
+ else
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ }
+}
+
/*
* Handle BEGIN message.
*/
@@ -1013,6 +1127,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ maybe_apply_delay(begin_data.xid, begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -1070,6 +1187,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ maybe_apply_delay(begin_data.xid, begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1317,7 +1437,8 @@ apply_handle_stream_prepare(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset,
- prepare_data.xid, prepare_data.prepare_lsn);
+ prepare_data.xid, prepare_data.prepare_lsn,
+ prepare_data.prepare_time);
/* Mark the transaction as prepared. */
apply_handle_prepare_internal(&prepare_data);
@@ -2011,10 +2132,13 @@ ensure_last_message(FileSet *stream_fileset, TransactionId xid, int fileno,
/*
* Common spoolfile processing.
+ *
+ * The commit/prepare time (finish_ts) for streamed transactions is required
+ * for time-delayed logical replication.
*/
void
apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn)
+ XLogRecPtr lsn, TimestampTz finish_ts)
{
StringInfoData s2;
int nchanges;
@@ -2025,6 +2149,10 @@ apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
int fileno;
off_t offset;
+ /* Should we delay the current transaction? */
+ if (finish_ts)
+ maybe_apply_delay(xid, finish_ts);
+
if (!am_parallel_apply_worker())
maybe_start_skipping_changes(lsn);
@@ -2174,7 +2302,7 @@ apply_handle_stream_commit(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset, xid,
- commit_data.commit_lsn);
+ commit_data.commit_lsn, commit_data.committime);
apply_handle_commit_internal(&commit_data);
@@ -3447,7 +3575,7 @@ UpdateWorkerStats(XLogRecPtr last_lsn, TimestampTz send_time, bool reply)
* Apply main loop.
*/
static void
-LogicalRepApplyLoop(XLogRecPtr last_received)
+LogicalRepApplyLoop(void)
{
TimestampTz last_recv_timestamp = GetCurrentTimestamp();
bool ping_sent = false;
@@ -3568,7 +3696,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
if (last_received < end_lsn)
last_received = end_lsn;
- send_feedback(last_received, reply_requested, false);
+ send_feedback(last_received, reply_requested, false, false);
UpdateWorkerStats(last_received, timestamp, true);
}
/* other message types are purposefully ignored */
@@ -3581,7 +3709,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
/* confirm all writes so far */
- send_feedback(last_received, false, false);
+ send_feedback(last_received, false, false, false);
if (!in_remote_transaction && !in_streamed_transaction)
{
@@ -3678,7 +3806,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
}
- send_feedback(last_received, requestReply, requestReply);
+ send_feedback(last_received, requestReply, requestReply, false);
/*
* Force reporting to ensure long idle periods don't lead to
@@ -3708,7 +3836,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
* to send a response to avoid timeouts.
*/
static void
-send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
+send_feedback(XLogRecPtr recvpos, bool force, bool requestReply, bool in_delayed_apply)
{
static StringInfo reply_message = NULL;
static TimestampTz send_time = 0;
@@ -3738,8 +3866,15 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * If apply is currently being delayed (min_apply_delay parameter), do
+ * not tell the publisher that the latest received LSN has already been
+ * applied and flushed; otherwise the publisher would make a wrong
+ * assumption about logical replication progress. In that case the
+ * feedback message is sent only to avoid a replication timeout during
+ * the delay.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && !in_delayed_apply)
flushpos = writepos = recvpos;
if (writepos < last_writepos)
@@ -3776,8 +3911,9 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
pq_sendint64(reply_message, now); /* sendTime */
pq_sendbyte(reply_message, requestReply); /* replyRequested */
- elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X",
+ elog(DEBUG2, "sending feedback (force %d, in_delayed_apply %s) to recv %X/%X, write %X/%X, flush %X/%X",
force,
+ in_delayed_apply ? "yes" : "no",
LSN_FORMAT_ARGS(recvpos),
LSN_FORMAT_ARGS(writepos),
LSN_FORMAT_ARGS(flushpos));
@@ -4367,11 +4503,11 @@ start_table_sync(XLogRecPtr *origin_startpos, char **myslotname)
* of system resource error and are not repeatable.
*/
static void
-start_apply(XLogRecPtr origin_startpos)
+start_apply(void)
{
PG_TRY();
{
- LogicalRepApplyLoop(origin_startpos);
+ LogicalRepApplyLoop();
}
PG_CATCH();
{
@@ -4661,7 +4797,8 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- start_apply(origin_startpos);
+ last_received = origin_startpos;
+ start_apply();
proc_exit(0);
}
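
The timing logic in maybe_apply_delay() and the extra condition in send_feedback() can be summarized in a small standalone sketch. This is only a model under stated assumptions: timestamps are plain milliseconds, LSNs are plain 64-bit integers, and the names remaining_delay_ms, next_sleep_ms, ReportedLsn and feedback_positions are hypothetical, not functions from worker.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Milliseconds still to wait before a transaction that finished at
 * finish_ms on the publisher may be applied, clamped at zero.
 */
int64_t
remaining_delay_ms(int64_t now_ms, int64_t finish_ms, int64_t min_apply_delay_ms)
{
    int64_t target_ms = finish_ms + min_apply_delay_ms;

    return target_ms > now_ms ? target_ms - now_ms : 0;
}

/*
 * Length of the next sleep. When the status interval is enabled and shorter
 * than the remaining wait, sleep for one interval and ask the caller to send
 * a feedback message afterwards; otherwise sleep for the whole remainder.
 */
int64_t
next_sleep_ms(int64_t remaining_ms, int status_interval_s, bool *send_feedback)
{
    int64_t interval_ms = (int64_t) status_interval_s * 1000;

    if (status_interval_s > 0 && remaining_ms > interval_ms)
    {
        *send_feedback = true;
        return interval_ms;
    }

    *send_feedback = false;
    return remaining_ms;
}

/* Hypothetical pair of positions reported back to the publisher. */
typedef struct
{
    uint64_t    write;
    uint64_t    flush;
} ReportedLsn;

/*
 * Report the received position as written and flushed only when nothing is
 * pending locally and the worker is not inside a delay; otherwise keep the
 * locally known positions.
 */
ReportedLsn
feedback_positions(uint64_t recvpos, uint64_t local_write, uint64_t local_flush,
                   bool have_pending_txes, bool in_delayed_apply)
{
    ReportedLsn r = {local_write, local_flush};

    if (!have_pending_txes && !in_delayed_apply)
        r.write = r.flush = recvpos;

    return r;
}
```

With an 8 hour delay and a 10 second status interval, this model sleeps in 10 second steps and sends a keepalive-style feedback after each one, without ever advancing the flush position past what has really been applied.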
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..c0f69cb43b 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4546,9 +4547,13 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n"
+ " 0 AS subminapplydelay\n",
+ LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4581,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4612,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminapplydelay =
+ strtoi64(PQgetvalue(res, i, i_subminapplydelay), NULL, 10);
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4687,6 +4695,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '" INT64_FORMAT " ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
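
For the dumpSubscription() change above, the following standalone sketch shows the shape of the option text that would end up in the dumped CREATE SUBSCRIPTION command. It uses snprintf and PRId64 instead of PQExpBuffer and INT64_FORMAT, and the helper name append_min_apply_delay is hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Write the min_apply_delay option fragment into buf. Nothing is emitted
 * for a delay of zero, so subscriptions without a delay dump unchanged.
 * Returns the number of characters written (excluding the terminator).
 */
int
append_min_apply_delay(char *buf, size_t buflen, int64_t delay_ms)
{
    if (delay_ms <= 0)
    {
        if (buflen > 0)
            buf[0] = '\0';
        return 0;
    }

    /* The value is always dumped in milliseconds, with an explicit unit. */
    return snprintf(buf, buflen, ", min_apply_delay = '%" PRId64 " ms'", delay_ms);
}
```

Dumping the value with an explicit "ms" unit means a subscription created with '8h' is restored with an equivalent millisecond value rather than the original spelling.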
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e7cbd8d7ed..e2525f70ab 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int64 subminapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..81d4607a1c 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min apply delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e1882eaea..e8b9a43a47 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..e06f35c037 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -70,6 +70,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
XLogRecPtr subskiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 subminapplydelay; /* Replication apply delay (ms) */
+
NameData subname; /* Name of the subscription */
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
@@ -120,6 +122,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 minapplydelay; /* Replication apply delay (ms) */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
bool enabled; /* Indicates if the subscription is enabled */
diff --git a/src/include/replication/worker_internal.h b/src/include/replication/worker_internal.h
index dc87a4edd1..3dc09d1a4c 100644
--- a/src/include/replication/worker_internal.h
+++ b/src/include/replication/worker_internal.h
@@ -255,7 +255,7 @@ extern void stream_stop_internal(TransactionId xid);
/* Common streaming function to apply all the spooled messages */
extern void apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn);
+ XLogRecPtr lsn, TimestampTz finish_ts);
extern void apply_dispatch(StringInfo s);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 4e5cb0d3a9..360e661f60 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -135,10 +135,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -155,10 +155,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -167,10 +167,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -202,10 +202,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -239,19 +239,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -263,27 +263,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -298,10 +298,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -316,10 +316,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -355,10 +355,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -367,10 +367,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -380,10 +380,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -396,20 +396,57 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid value for parameter "min_apply_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_apply_delay" (0 .. 2147483647)
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+ERROR: min_apply_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot enable parallel streaming mode for subscription with min_apply_delay
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ERROR: cannot enable min_apply_delay for subscription in streaming = parallel mode
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 5f27b7d776..53fe2a4c6b 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -279,6 +279,30 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+\dRs+
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/meson.build b/src/test/subscription/meson.build
index 3db0fdfd96..a186876eb4 100644
--- a/src/test/subscription/meson.build
+++ b/src/test/subscription/meson.build
@@ -38,6 +38,7 @@ tests += {
't/029_on_error.pl',
't/030_origin.pl',
't/031_column_list.pl',
+ 't/032_apply_delay.pl',
't/100_bugs.pl',
],
},
diff --git a/src/test/subscription/t/032_apply_delay.pl b/src/test/subscription/t/032_apply_delay.pl
new file mode 100644
index 0000000000..7d499ab664
--- /dev/null
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,187 @@
+
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test replication apply delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Confirm the time-delayed replication has been effective from the server log
+# message where the apply worker emits for applying delay. Moreover, verifies
+# that the current worker's remaining wait time is sufficiently bigger than the
+# expected value, in order to check any update of the min_apply_delay.
+sub check_apply_delay_log
+{
+ my ($node_subscriber, $offset, $expected) = @_;
+
+ my $log_location = $node_subscriber->wait_for_log(
+ qr/time-delayed replication for txid (\d+), min_apply_delay = (\d+) ms, remaining wait time: (\d+) ms/,
+ $offset);
+
+ cmp_ok($log_location, '>', $offset,
+ "logfile contains triggered logical replication apply delay");
+
+ # Get the remaining wait time from the server log
+ my $contents = slurp_file($node_subscriber->logfile, $offset);
+ $contents =~
+ qr/time-delayed replication for txid (\d+), min_apply_delay = (\d+) ms, remaining wait time: (\d+) ms/,
+ or die "could not get the apply worker wait time";
+ my $logged_delay = $3;
+
+ # Is it larger than expected?
+ cmp_ok($logged_delay, '>', $expected,
+ "The apply worker wait time has expected duration");
+}
+
+# Compare inserted time on the publisher with applied time on the subscriber to
+# confirm the latter is applied after expected time. The time is automatically
+# generated and stored in the table column 'c'.
+sub check_apply_delay_time
+{
+ my ($node_publisher, $node_subscriber, $primary_key, $expected_diffs) =
+ @_;
+
+ my $inserted_time_on_pub = $node_publisher->safe_psql(
+ 'postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ my $inserted_time_on_sub = $node_subscriber->safe_psql(
+ 'postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ cmp_ok(
+ $inserted_time_on_sub - $inserted_time_on_pub,
+ '>',
+ $expected_diffs,
+ "The tuple on the subscriber was modified later than the publisher");
+}
+
+# Create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf('postgresql.conf', "log_min_messages = debug2");
+$node_subscriber->start;
+
+# Setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b varchar, c timestamptz (6) DEFAULT now())"
+);
+
+# Setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b text, c timestamptz (6) DEFAULT now())"
+);
+
+# Setup logical replication
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+
+# The column 'c' must not be published because we want to compare the time
+# difference.
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE test_tab (a, b)");
+
+my $appname = 'tap_sub';
+
+# Create a subscription that applies the transaction after 1 second delay
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (copy_data = off, min_apply_delay = '1s', streaming = 'on')"
+);
+
+# Check log starting now for logical replication apply delay
+my $offset = -s $node_subscriber->logfile;
+
+# New row to trigger apply delay
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (1, 'foo')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (2, 'bar')");
+
+$node_publisher->wait_for_catchup($appname);
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(2|1|2), 'check if the new rows were applied to subscriber');
+
+# Make sure the apply worker knows to wait for more than 500ms
+check_apply_delay_log($node_subscriber, $offset, "0.5");
+
+# Verify that the subscriber lags the publisher by at least 1 second
+check_apply_delay_time($node_publisher, $node_subscriber, '2', '1');
+
+# Setup for streaming case
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_mode = immediate');
+$node_publisher->reload;
+
+# Run a query to make sure that the reload has taken effect.
+$node_publisher->safe_psql('postgres', q{SELECT 1});
+
+# Check log starting now for logical replication apply delay
+$offset = -s $node_subscriber->logfile;
+
+# Test streamed transaction by insert
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(3, 5) s(i);"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(5|1|5), 'check if the new rows were applied to subscriber');
+
+# Make sure the apply worker knows to wait for more than 500ms
+check_apply_delay_log($node_subscriber, $offset, "0.5");
+
+# Verify that the subscriber lags the publisher by at least 1 second
+check_apply_delay_time($node_publisher, $node_subscriber, '5', '1');
+
+# Test whether ALTER SUBSCRIPTION changes the delayed time of the apply worker
+# (1 day 5 minutes). Note that the extra 5 minute is to account for any
+# decoding/network overhead.
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86700000)");
+
+# Check log starting now for logical replication apply delay
+$offset = -s $node_subscriber->logfile;
+
+# New row to trigger apply delay
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (0, 'foobar')");
+
+# Make sure the apply worker knows to wait for more than 1 day
+check_apply_delay_log($node_subscriber, $offset, "86400000");
+
+# Disable subscription and the worker should die immediately
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub DISABLE;");
+
+# Wait until worker dies
+my $sub_query =
+ "SELECT count(1) = 0 FROM pg_stat_subscription WHERE subname = 'tap_sub' AND pid IS NOT NULL;";
+$node_subscriber->poll_query_until('postgres', $sub_query)
+ or die "Timed out while waiting for subscriber to die";
+
+# Confirm disabling the subscription by ALTER SUBSCRIPTION DISABLE did not cause
+# the delayed transaction to be applied.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab WHERE a = 0;");
+is($result, qq(0), "check the delayed transaction was not applied");
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
--
2.30.0
On Tuesday, January 24, 2023 3:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
send_feedback():
+ * If the subscriber side apply is delayed (because of time-delayed
+ * replication) then do not tell the publisher that the received latest
+ * LSN is already applied and flushed, otherwise, it leads to the
+ * publisher side making a wrong assumption of logical replication
+ * progress. Instead, we just send a feedback message to avoid a publisher
+ * timeout during the delay.
+ */
- if (!have_pending_txes)
+ if (!have_pending_txes && !in_delayed_apply)
    flushpos = writepos = recvpos;

Honestly I don't like this wart. The reason for this is the function
assumes recvpos = applypos but we actually call it while holding
unapplied changes, that is, applypos < recvpos.

Couldn't we maintain an additional static variable "last_applied"
along with last_received?

It won't be easy to maintain the meaning of last_applied because there are
cases where we don't apply the change directly. For example, in case of
streaming xacts, we will just keep writing it to the file, now, say, due to some
reason, we have to send the feedback, then it will not allow you to update the
latest write locations. This would then become different than what we are
doing without the patch.
Another point to think about is that we also need to keep the variable updated
for keep-alive ('k') messages even though we don't apply anything in that case.
Still, other cases to consider are where we have a mix of streaming and
non-streaming transactions.
Agreed. This would change some existing behaviors, so I didn't make this change in the latest patch [1].
[1]: /messages/by-id/TYCPR01MB8373DC1881F382B4703F26E0EDC99@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
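
For readers following the send_feedback() point above, here is a minimal Python model of how the reported positions change with the patch. It is only a sketch, not the worker.c code: the function name `feedback_positions` and the `applied_*` parameters are hypothetical, LSNs are modeled as plain integers, and only the branch quoted in the diff is reproduced.

```python
def feedback_positions(recvpos, applied_writepos, applied_flushpos,
                       have_pending_txes, in_delayed_apply):
    """Return (recvpos, writepos, flushpos) to report to the publisher.

    Hypothetical model: applied_writepos/applied_flushpos stand in for the
    positions the worker has really written and flushed so far.
    """
    writepos, flushpos = applied_writepos, applied_flushpos
    # Only claim that everything received is written and flushed when no
    # transaction is pending AND the worker is not sleeping for
    # min_apply_delay.
    if not have_pending_txes and not in_delayed_apply:
        flushpos = writepos = recvpos
    return recvpos, writepos, flushpos


if __name__ == "__main__":
    # Not delayed: everything received is reported as flushed.
    print(feedback_positions(100, 80, 80, False, False))
    # Delayed apply: write/flush stay at the last applied position.
    print(feedback_positions(100, 80, 80, False, True))
```

In the delayed case a feedback message is still produced, which is what keeps the publisher from timing out, but it no longer overstates the apply progress.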
On Monday, January 23, 2023 9:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Sun, Jan 22, 2023 at 6:12 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
Attached the updated patch v19.
Few comments:
=============
1.
}
+
+
+/*

Only one empty line is sufficient between different functions.
Fixed.
2.
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+     opts->min_apply_delay > 0 &&
+     opts->streaming == LOGICALREP_STREAM_PARALLEL)
+     ereport(ERROR,
+             errcode(ERRCODE_SYNTAX_ERROR),
+             errmsg("%s and %s are mutually exclusive options",
+                    "min_apply_delay > 0", "streaming = parallel"));
 }

I think here we should add a comment for the translator as we are doing in
some other nearby cases.
Fixed.
3.
+ /*
+  * The combination of parallel streaming mode and
+  * min_apply_delay is not allowed.
+  */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
+     if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay > 0) ||
+         (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0))
+         ereport(ERROR,
+                 errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+                 errmsg("cannot enable %s mode for subscription with %s",
+                        "streaming = parallel", "min_apply_delay"));
+

A. When can second condition ((!IsSet(opts.specified_opts,
SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0)) in above check
be true?
B. In comments, you can say "See parse_subscription_options."
A. The second condition is true when all of the following hold:
(1) the ALTER statement sets streaming = parallel,
(2) the ALTER statement doesn't set min_apply_delay, and
(3) the existing subscription has a non-zero min_apply_delay.
B. Added the comment.
4.
+/*
+ * When min_apply_delay parameter is set on the subscriber, we wait long enough
+ * to make sure a transaction is applied at least that interval behind the
+ * publisher.

Shouldn't this part of the comment be updated now that the patch has
stopped using interval?
Yes. I removed "interval" in descriptions so that we don't get
confused with types.
5. How does this feature interact with the SKIP feature? Currently, it doesn't
care whether the changes of a particular xact are skipped or not. I think that
might be okay because the purpose of this feature is anyway to make the
subscriber lag behind the publisher. What do you think?
I feel we can add some comments to indicate the same.
Added the comment in the commit message.
I didn't add this kind of comment as code comments,
since both features are independent. If there is a need to write it anywhere,
then please let me know. The latest patch is posted in [1].
[1]: /messages/by-id/TYCPR01MB8373DC1881F382B4703F26E0EDC99@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
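
The answer to comment 3A can be sketched as a small decision function. This is a Python model under stated assumptions, not the subscriptioncmds.c code: `alter_hits_parallel_delay_conflict` and its parameters are hypothetical names, and it covers only the `streaming = parallel` branch quoted above.

```python
def alter_hits_parallel_delay_conflict(new_streaming, delay_specified,
                                       new_delay_ms, current_delay_ms):
    """Return True when ALTER SUBSCRIPTION ... SET should be rejected.

    new_streaming     -- streaming value given in the ALTER statement
    delay_specified   -- whether min_apply_delay appears in the statement
    new_delay_ms      -- min_apply_delay from the statement (if specified)
    current_delay_ms  -- min_apply_delay already stored for the subscription
    """
    if new_streaming != "parallel":
        return False
    if delay_specified:
        # First condition: the statement itself asks for a positive delay.
        return new_delay_ms > 0
    # Second condition: the statement is silent about the delay, so the
    # value already stored for the subscription decides.
    return current_delay_ms > 0


if __name__ == "__main__":
    # Existing delay of one day, statement sets only streaming = parallel.
    print(alter_hits_parallel_delay_conflict("parallel", False, 0, 86400000))
    # Same subscription, but the statement also resets min_apply_delay to 0.
    print(alter_hits_parallel_delay_conflict("parallel", True, 0, 86400000))
```

The two calls mirror the regression test in the patch: setting only streaming = parallel on a subscription with a one-day delay fails, while setting min_apply_delay = 0 in the same statement is accepted.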
On Monday, January 23, 2023 7:45 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Jan 23, 2023 at 1:36 PM Peter Smith <smithpb2250@gmail.com>
wrote:
Here are my review comments for v19-0001.
...
5. parse_subscription_options
+ /*
+  * The combination of parallel streaming mode and min_apply_delay is not
+  * allowed. The subscriber in the parallel streaming mode applies each
+  * stream on arrival without the time of commit/prepare. So, the
+  * subscriber needs to depend on the arrival time of the stream in this
+  * case, if we apply the time-delayed feature for such transactions. Then
+  * there is a possibility where some unnecessary delay will be added on
+  * the subscriber by network communication break between nodes or other
+  * heavy work load on the publisher. On the other hand, applying the delay
+  * at the end of transaction with parallel apply also can cause issues of
+  * used resource bloat and locks kept in open for a long time. Thus, those
+  * features can't work together.
+  */

IMO some re-wording might be warranted here. I am not sure quite how
to do it. Perhaps like below?

SUGGESTION
The combination of parallel streaming mode and min_apply_delay is not
allowed.
Here are some reasons why these features are incompatible:
a. In the parallel streaming mode the subscriber applies each stream
on arrival without knowledge of the commit/prepare time. This means we
cannot calculate the underlying network/decoding lag between publisher
and subscriber, and so always waiting for the full 'min_apply_delay'
period might include unnecessary delay.
b. If we apply the delay at the end of the transaction of the parallel
apply then that would cause issues related to resource bloat and locks
being held for a long time.

~~~
How about something like:
The combination of parallel streaming mode and min_apply_delay is not
allowed. This is because we start applying the transaction stream as soon as
the first change arrives without knowing the transaction's prepare/commit time.
This means we cannot calculate the underlying network/decoding lag between
publisher and subscriber, and so always waiting for the full 'min_apply_delay'
period might include unnecessary delay.

The other possibility is to apply the delay at the end of the parallel apply
transaction but that would cause issues related to resource bloat and locks
being held for a long time.
Thank you for providing a good description! Adopted.
The latest patch can be seen in [1].
[1]: /messages/by-id/TYCPR01MB8373DC1881F382B4703F26E0EDC99@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
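
To illustrate why the commit/prepare time matters in the wording above, here is a minimal Python sketch of the delay arithmetic. It assumes the apply worker waits until the publisher-side commit time plus min_apply_delay has passed; `remaining_wait_ms` and its parameters are hypothetical names, and all times are milliseconds on a shared clock.

```python
def remaining_wait_ms(commit_time_ms, now_ms, min_apply_delay_ms):
    """Time still to wait before applying a transaction (never negative).

    Assumption: the delay is measured from the commit time on the
    publisher, so network/decoding lag already counts toward it.
    """
    return max(0, commit_time_ms + min_apply_delay_ms - now_ms)


if __name__ == "__main__":
    delay = 1000
    # Commit time known: 400 ms of lag has already been "served".
    print(remaining_wait_ms(0, 400, delay))
    # Commit time known and the lag already exceeds the delay: no wait.
    print(remaining_wait_ms(0, 1500, delay))
    # Commit time unknown (stream applied on arrival): the only available
    # reference point is the arrival time, so the full delay is added.
    arrival = 1500
    print(remaining_wait_ms(arrival, arrival, delay))
```

The last call is the "unnecessary delay" case: with only the arrival time to go on, the subscriber waits the full period even though the change is already older than min_apply_delay.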
On Monday, January 23, 2023 5:07 PM Peter Smith <smithpb2250@gmail.com> wrote:
Here are my review comments for v19-0001.
Thanks for your review !
======
Commit message

1.
The combination of parallel streaming mode and min_apply_delay is not
allowed. The subscriber in the parallel streaming mode applies each stream on
arrival without the time of commit/prepare. So, the subscriber needs to depend
on the arrival time of the stream in this case, if we apply the time-delayed
feature for such transactions. Then there is a possibility where some
unnecessary delay will be added on the subscriber by network communication
break between nodes or other heavy work load on the publisher. On the other
hand, applying the delay at the end of transaction with parallel apply also can
cause issues of used resource bloat and locks kept in open for a long time.
Thus, those features can't work together.
~

I think the above is just cut/paste from a code comment within
subscriptioncmds.c. See review comments #5 below -- so if the code is
changed then this commit message should also change to match it.
Now, updated this. Kindly have a look at the latest patch in [1]/messages/by-id/TYCPR01MB8373DC1881F382B4703F26E0EDC99@TYCPR01MB8373.jpnprd01.prod.outlook.com.
======
doc/src/sgml/ref/create_subscription.sgml
2.
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. This
+ parameter allows the user to delay the application of changes by a
+ given time interval. If the value is specified without units, it is
+ taken as milliseconds. The default is zero (no delay).
+ </para>
2a.
The pgdocs says this is an integer default to “ms” unit. Also, the example on
this same page shows it is set to '4h'. But I did not see any mention of what
other units are available to the user. Maybe other time units should be
mentioned here, or maybe a link should be given to the section "20.1.1.
Parameter Names and Values".
Added.
~
2b.
Previously the word "interval" was deliberately used because this parameter
had interval support. But maybe now it should be changed so it is not
misleading.
"a given time interval" --> "a given time period" ??
Fixed.
======
src/backend/commands/subscriptioncmds.c
3. Forward declare
+static int defGetMinApplyDelay(DefElem *def);
If the new function is implemented as static near the top of this source file then
this forward declare would not even be necessary, right?
This declaration has been kept as discussed.
~~~
4. parse_subscription_options
@@ -324,6 +328,12 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
 opts->specified_opts |= SUBOPT_LSN;
 opts->lsn = lsn;
 }
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ opts->min_apply_delay = defGetMinApplyDelay(defel);
+ }
Should this code fragment be calling errorConflictingDefElem so it will report
an error if the same min_apply_delay parameter is redundantly repeated?
(IIUC, this appears to be the code pattern for other parameters nearby).
Added.
~~~
5. parse_subscription_options
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed. The subscriber in the parallel streaming mode applies each
+ * stream on arrival without the time of commit/prepare. So, the
+ * subscriber needs to depend on the arrival time of the stream in this
+ * case, if we apply the time-delayed feature for such transactions. Then
+ * there is a possibility where some unnecessary delay will be added on
+ * the subscriber by network communication break between nodes or other
+ * heavy work load on the publisher. On the other hand, applying the delay
+ * at the end of transaction with parallel apply also can cause issues of
+ * used resource bloat and locks kept in open for a long time. Thus, those
+ * features can't work together.
+ */
IMO some re-wording might be warranted here. I am not sure quite how to do it.
Perhaps like below?
SUGGESTION
The combination of parallel streaming mode and min_apply_delay is not
allowed.
Here are some reasons why these features are incompatible:
a. In the parallel streaming mode the subscriber applies each stream on arrival
without knowledge of the commit/prepare time. This means we cannot
calculate the underlying network/decoding lag between publisher and
subscriber, and so always waiting for the full 'min_apply_delay'
period might include unnecessary delay.
b. If we apply the delay at the end of the transaction of the parallel apply then
that would cause issues related to resource bloat and locks being held for a
long time.
Now, this has been changed to the one suggested by Amit-san.
Thanks for your help.
~~~
6. defGetMinApplyDelay
+/*
+ * Extract the min_apply_delay mode value from a DefElem. This is very similar
+ * to PGC_INT case of parse_and_validate_value(), because min_apply_delay
+ * accepts the same string as recovery_min_apply_delay.
+ */
+int
+defGetMinApplyDelay(DefElem *def)
6a.
"same string" -> "same parameter format" ??
Fixed.
~
6b.
I thought this function should be implemented as static and located at the top
of the subscriptioncmds.c source file.
Made it static but didn't change the place, as Amit-san mentioned.
======
src/backend/replication/logical/worker.c
7. maybe_delay_apply
+static void maybe_delay_apply(TransactionId xid, TimestampTz finish_ts);
Is there a reason why this is here? AFAIK the static implementation precedes
any usage so I doubt this forward declaration is required.
Removed.
~~~
8. send_feedback
@@ -3775,11 +3912,12 @@ send_feedback(XLogRecPtr recvpos, bool force,
bool requestReply)
pq_sendint64(reply_message, now); /* sendTime */
 pq_sendbyte(reply_message, requestReply); /* replyRequested */
- elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X",
+ elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X in-delayed: %d",
 force,
 LSN_FORMAT_ARGS(recvpos),
 LSN_FORMAT_ARGS(writepos),
- LSN_FORMAT_ARGS(flushpos));
+ LSN_FORMAT_ARGS(flushpos),
+ in_delayed_apply);
Wondering if it is better to write this as:
"sending feedback (force %d, in_delayed_apply %d) to recv %X/%X,
write %X/%X, flush %X/%X"
Adopted and merged with the modification Euler-san provided.
~
10. Add new tests?
Should there be other tests just to verify different units (like 'd', 'h', 'min') are
working OK?
No need. The current subscription.sql does the check
of "invalid value for parameter..." error message, which ensures we call
the defGetMinApplyDelay(). Additionally, we have the test of one unit 'd'
for the unit iteration loop in convert_to_base_unit().
So, the current test sets should suffice.
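For readers unfamiliar with how a value such as '4h' or '50ms' ends up as milliseconds, here is a small standalone sketch. It is only an approximation under stated assumptions: `parse_delay_ms` is a hypothetical helper, it handles just the units ms, s, min, h and d, and it does not reproduce the GUC machinery or convert_to_base_unit() that the patch relies on. As in the documentation text, a value without units is taken as milliseconds.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: convert a string such as "4h" or "50ms" into
 * milliseconds.  Returns false on malformed input, negative values,
 * unknown units, or overflow.
 */
bool
parse_delay_ms(const char *str, int64_t *result_ms)
{
    char      *end;
    long long  val;
    int64_t    mult;

    assert(str != NULL && result_ms != NULL);

    errno = 0;
    val = strtoll(str, &end, 10);
    if (end == str || errno == ERANGE || val < 0)
        return false;

    /* Allow whitespace between the number and the unit, e.g. "5 min". */
    while (isspace((unsigned char) *end))
        end++;

    if (*end == '\0' || strcmp(end, "ms") == 0)
        mult = 1;                       /* no unit means milliseconds */
    else if (strcmp(end, "s") == 0)
        mult = 1000;
    else if (strcmp(end, "min") == 0)
        mult = 60 * 1000;
    else if (strcmp(end, "h") == 0)
        mult = 60 * 60 * 1000;
    else if (strcmp(end, "d") == 0)
        mult = 24 * 60 * 60 * 1000;
    else
        return false;                   /* unknown unit */

    if (val > INT64_MAX / mult)
        return false;                   /* would overflow */

    *result_ms = (int64_t) val * mult;
    return true;
}
```

With this sketch, "4h" yields 14400000 and "1d" yields 86400000, while "4x" is rejected, which mirrors the kind of "invalid value for parameter" check that the regression test exercises.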
======
src/test/subscription/t/032_apply_delay.pl
11.
+# Confirm the time-delayed replication has been effective from the server log
+# message where the apply worker emits for applying delay. Moreover, verifies
+# that the current worker's delayed time is sufficiently bigger than the
+# expected value, in order to check any update of the min_apply_delay.
+sub check_apply_delay_log
"the current worker's delayed time..." --> "the current worker's remaining wait
time..." ??
Fixed.
~~~
12.
+ # Get the delay time from the server log
+ my $contents = slurp_file($node_subscriber->logfile, $offset);
"Get the delay time...." --> "Get the remaining wait time..."
Fixed.
~~~
13.
+# Create a subscription that applies the trasaction after 50 milliseconds delay
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (copy_data = off, min_apply_delay = '50ms', streaming = 'on')"
+);
13a.
typo: "trasaction"
Fixed.
~
13b
50ms seems an extremely short time – How do you even know if this is testing
anything related to the time delay? You may just be detecting the normal lag
between publisher and subscriber without time delay having much to do with
anything.
The wait time has been updated to 1 second now.
Also, the TAP tests now search for the logs emitted by the apply worker.
The code path that emits the log is in maybe_apply_delay(), and
it writes the log only if "diffms" is bigger than zero,
which is what invokes the wait. So, this ensures that the tests
exercise the feature through this flow.
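The remaining wait time mentioned here ("diffms") can be sketched as follows. This is a simplified illustration, not the patch's code: `remaining_wait_ms` is a hypothetical name, and all timestamps are plain milliseconds, whereas the patch works with TimestampTz values. The idea taken from the thread is that the worker waits only for whatever part of min_apply_delay has not already elapsed since the publisher-side timestamp.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: how long the apply worker still has to wait.
 *
 * finish_ts_ms       - commit/prepare time recorded on the publisher
 * min_apply_delay_ms - configured delay for the subscription
 * now_ms             - current time on the subscriber
 *
 * Time already spent in decoding and network transfer counts toward the
 * delay, so the result can be zero, in which case no wait happens.
 */
int64_t
remaining_wait_ms(int64_t finish_ts_ms, int64_t min_apply_delay_ms,
                  int64_t now_ms)
{
    int64_t delay_until;

    assert(min_apply_delay_ms >= 0);

    delay_until = finish_ts_ms + min_apply_delay_ms;
    return (delay_until > now_ms) ? delay_until - now_ms : 0;
}
```

A result greater than zero corresponds to the case where the worker logs the delay and waits, which is the path the TAP test looks for in the server log.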
~
14.
+# Note that we cannot call check_apply_delay_log() here because there is a
+# possibility that the delay is skipped. The event happens when the WAL
+# replication between publisher and subscriber is delayed due to a mechanical
+# problem. The log output will be checked later - substantial delay-time case.
+
+# Verify that the subscriber lags the publisher by at least 50 milliseconds
+check_apply_delay_time($node_publisher, $node_subscriber, '2', '0.05');
14a.
"The event happens..." ??
Did you mean "This might happen if the WAL..."
This part has been removed.
~
14b.
The log output will be checked later - substantial delay-time case.
I think that needs re-wording to clarify.
e.g1. you have nothing called a "substantial delay-time" case.
e.g2. the word "later" confused me. Originally, I thought you meant it is not
tested yet but that you will check it "later", but now IIUC you are just referring
to the "1 day 5 minutes" test that comes below in this location TAP file (??)
Also, removed.
[1]: /messages/by-id/TYCPR01MB8373DC1881F382B4703F26E0EDC99@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
On Tuesday, January 24, 2023 8:32 AM Euler Taveira <euler@eulerto.com> wrote:
Good to know that you keep improving this patch. I have a few suggestions that
were easier to provide as a patch on top of your latest patch than as
inline suggestions.
Thanks for your review ! We basically adopted your suggestions.
There are a few documentation polishing. Let me comment some of them above.
- The length of time (ms) to delay the application of changes.
+ Total time spent delaying the application of changes, in milliseconds
I don't remember if I suggested this description for catalog but IMO the
suggestion reads better for me.
Adopted the above change.
- For time-delayed logical replication (i.e. when the subscription is
- created with parameter min_apply_delay > 0), the apply worker sends a
- Standby Status Update message to the publisher with a period of
- <literal>wal_receiver_status_interval</literal>. Make sure to set
- <literal>wal_receiver_status_interval</literal> less than the
- <literal>wal_sender_timeout</literal> on the publisher, otherwise, the
- walsender will repeatedly terminate due to the timeout errors. If
- <literal>wal_receiver_status_interval</literal> is set to zero, the apply
- worker doesn't send any feedback messages during the subscriber's
- <literal>min_apply_delay</literal> period. See
- <xref linkend="sql-createsubscription"/> for details.
+ For time-delayed logical replication, the apply worker sends a feedback
+ message to the publisher every
+ <varname>wal_receiver_status_interval</varname> milliseconds. Make sure
+ to set <varname>wal_receiver_status_interval</varname> less than the
+ <varname>wal_sender_timeout</varname> on the publisher, otherwise, the
+ <literal>walsender</literal> will repeatedly terminate due to timeout
+ error. If <varname>wal_receiver_status_interval</varname> is set to
+ zero, the apply worker doesn't send any feedback messages during the
+ <literal>min_apply_delay</literal> interval.
I removed the parenthesis explanation about time-delayed logical replication.
If you are reading the documentation and does not know what it means you should
(a) read the logical replication chapter or (b) check the glossary (maybe a new
entry should be added). I also removed the Standby status Update message but it
is a low level detail; let's refer to it as feedback message as the other
sentences do. I changed "literal" to "varname" that's the correct tag for
parameters. I replace "period" with "interval" that was the previous
terminology. IMO we should be uniform, use one or the other.
Adopted.
Also, I added a glossary entry for time-delayed replication (one
applicable to both physical replication and logical replication).
Plus, I unified the term "interval" to "period", because it would clarify the type for this feature.
I think this is better.
- The subscriber replication can be instructed to lag behind the publisher
- side changes by specifying the <literal>min_apply_delay</literal>
- subscription parameter. See <xref linkend="sql-createsubscription"/> for
- details.
+ A logical replication subscription can delay the application of changes by
+ specifying the <literal>min_apply_delay</literal> subscription parameter.
+ See <xref linkend="sql-createsubscription"/> for details.
This feature refers to a specific subscription, hence, "logical replication
subscription" instead of "subscriber replication".
Adopted.
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
Peter S referred to this missing piece of code too.
Added.
-int
+static int
 defGetMinApplyDelay(DefElem *def)
 {
It seems you forgot static keyword.
Fixed.
- elog(DEBUG2, "time-delayed replication for txid %u, min_apply_delay = %lld ms, Remaining wait time: %ld ms",
- xid, (long long) MySubscription->minapplydelay, diffms);
+ elog(DEBUG2, "time-delayed replication for txid %u, min_apply_delay = " INT64_FORMAT " ms, remaining wait time: %ld ms",
+ xid, MySubscription->minapplydelay, diffms);
int64 should use format modifier INT64_FORMAT.
Fixed.
- (long) wal_receiver_status_interval * 1000,
+ wal_receiver_status_interval * 1000L,
Cast is not required. I added a suffix to the constant.
Fixed.
- elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X in-delayed: %d",
+ elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X, apply delay: %s",
 force,
 LSN_FORMAT_ARGS(recvpos),
 LSN_FORMAT_ARGS(writepos),
 LSN_FORMAT_ARGS(flushpos),
- in_delayed_apply);
+ in_delayed_apply? "yes" : "no");
It is better to use a string to represent the yes/no option.
Fixed.
- gettext_noop("Min apply delay (ms)"));
+ gettext_noop("Min apply delay"));
I don't know if it was discussed but we don't add units to headers. When I
think about this parameter representation (internal and external), I decided to
use the previous code because it provides a unit for external representation. I
understand that using the same representation as recovery_min_apply_delay is
good but the current code does not handle the external representation
accordingly. (recovery_min_apply_delay uses the GUC machinery to adds the unit
but for min_apply_delay, it doesn't).
Adopted.
 # Setup for streaming case
-$node_publisher->append_conf('postgres.conf',
+$node_publisher->append_conf('postgresql.conf',
 'logical_decoding_mode = immediate');
 $node_publisher->reload;
Fix configuration file name.
Fixed.
Maybe tests should do a better job. I think check_apply_delay_time is fragile
because it does not guarantee that time is not shifted. Time-delayed
replication is a subscriber feature and to check its correctness it should
check the logs.
# Note that we cannot call check_apply_delay_log() here because there is a
# possibility that the delay is skipped. The event happens when the WAL
# replication between publisher and subscriber is delayed due to a mechanical
# problem. The log output will be checked later - substantial delay-time case.
If you might not use the logs for it, it should adjust the min_apply_delay, no?
Yes. Adjusted.
It does not exercise the min_apply_delay vs parallel streaming mode.
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
+ if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay > 0) ||
+ (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s mode for subscription with %s",
+ "streaming = parallel", "min_apply_delay"));
+
Is this code correct? I also didn't like this message. "cannot enable streaming
= parallel mode for subscription with min_apply_delay" is far from a good error
message. How about refer parallelism to "parallel streaming mode".
Yes. opts is the input for alter command and sub object is the existing definition.
We need to check those combinations like when streaming is set to parallel
and min_apply_delay also gets set, then, min_apply_delay should not be bigger than 0, for example.
Besides, adopted your suggestion to improve the comments.
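The combination check explained above (new options versus the existing subscription definition) can be written out as a small standalone sketch. The function name `parallel_conflicts_with_delay` and its parameters are hypothetical; in the patch the same decision is made inline with IsSet(), opts and sub, and the outcome is an ereport(ERROR) rather than a boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, to be consulted only when ALTER SUBSCRIPTION is
 * switching streaming to parallel mode.
 *
 * delay_specified   - true if the same command also sets min_apply_delay
 * new_delay_ms      - the value given in the command (if specified)
 * existing_delay_ms - the value already stored for the subscription
 *
 * Returns true when the resulting subscription would combine parallel
 * streaming with a non-zero delay, which is not allowed.
 */
bool
parallel_conflicts_with_delay(bool delay_specified, int64_t new_delay_ms,
                              int64_t existing_delay_ms)
{
    /* The value in the command, when present, overrides the stored one. */
    int64_t effective_delay = delay_specified ? new_delay_ms
                                              : existing_delay_ms;

    assert(effective_delay >= 0);
    return effective_delay > 0;
}
```

Note that a command which enables parallel streaming and resets min_apply_delay to zero at the same time is accepted under this logic, even if the stored delay was non-zero.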
Attached the patch in [1]. Kindly have a look at it.
[1]: /messages/by-id/TYCPR01MB8373DC1881F382B4703F26E0EDC99@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
At Tue, 24 Jan 2023 11:45:36 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
Personally, I would prefer the above LOGs to be in reverse order as it
doesn't make much sense to me to first say that we are skipping
changes and then say the transaction is delayed. What do you think?
In the first place, I misunderstood maybe_start_skipping_changes(),
which doesn't actually skip changes. So... sorry for the noise.
For the record, I agree that the current order is right.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
In short, I'd like to propose renaming the parameter in_delayed_apply
of send_feedback to "has_unprocessed_change".
At Tue, 24 Jan 2023 12:27:58 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
send_feedback():
+ * If the subscriber side apply is delayed (because of time-delayed
+ * replication) then do not tell the publisher that the received latest
+ * LSN is already applied and flushed, otherwise, it leads to the
+ * publisher side making a wrong assumption of logical replication
+ * progress. Instead, we just send a feedback message to avoid a publisher
+ * timeout during the delay.
 */
- if (!have_pending_txes)
+ if (!have_pending_txes && !in_delayed_apply)
 flushpos = writepos = recvpos;
Honestly I don't like this wart. The reason for this is the function
assumes recvpos = applypos but we actually call it while holding
unapplied changes, that is, applypos < recvpos.
Couldn't we maintain an additional static variable "last_applied"
along with last_received?
It won't be easy to maintain the meaning of last_applied because there
are cases where we don't apply the change directly. For example, in
case of streaming xacts, we will just keep writing it to the file,
now, say, due to some reason, we have to send the feedback, then it
will not allow you to update the latest write locations. This would
then become different then what we are doing without the patch.
Another point to think about is that we also need to keep the variable
updated for keep-alive ('k') messages even though we don't apply
anything in that case. Still, other cases to consider are where we
have mix of streaming and non-streaming transactions.
Yeah. Even though I named it as "last_applied", its objective is to
have get_flush_position returning the correct have_pending_txes
without a hint from callers, that is, "let g_f_position know if
store_flush_position has been called with the last received data".
Anyway I tried that but didn't find a clean and simple way. However,
while on it, I realized what in the code had confused me.
+static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply,
+ bool in_delayed_apply);
The name "in_delayed_apply" doesn't give me an idea of what
the function should do for it. If it is named "has_unprocessed_change",
I think it makes sense that send_feedback should think there may be an
outstanding transaction that is not known to the function.
So, my conclusion here is I'd like to propose changing the parameter
name to "has_unapplied_change".
In this case the condition cited above
would be as follows and in_delayed_apply will become unnecessary.
+ if (!have_pending_txes && last_received == last_applied)
The function is a static function and always called with a variable
last_received that has the same scope with the function, as the first
Sorry for the noise, I misread it. Maybe I took the "function-scoped"
variable as file-scoped.. Thus the discussion is false.
parameter. Thus we can remove the first parameter then let the
function directly look at the both two variables instead.
I think this is true without this patch, so why that has not been
followed in the first place? One comment, I see in this regard is as
below:
/* It's legal to not pass a recvpos */
if (recvpos < last_recvpos)
recvpos = last_recvpos;
Sorry. I don't understand this. It is just a part of the ratchet
mechanism for the last received lsn to report.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
At Tue, 24 Jan 2023 14:22:19 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
On Tue, Jan 24, 2023 at 12:44 PM Peter Smith <smithpb2250@gmail.com> wrote:
On Tue, Jan 24, 2023 at 5:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jan 24, 2023 at 8:15 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
Attached the updated patch v19.
+ maybe_delay_apply(TransactionId xid, TimestampTz finish_ts)
This spelling looks strange to me. How about maybe_apply_delay()?
+1.
It depends on how you read it. I read it like this:
maybe_delay_apply === means "maybe delay [the] apply"
(which is exactly what the function does)
versus
maybe_apply_delay === means "maybe [the] apply [needs a] delay"
(which is also correct, but it seemed a more awkward way to say it IMO)
This matches more with GUC and all other usages of variables in the
patch. So, I still prefer the second one.
I read it as "maybe apply [the] delay [to something suggested by the
context]". If we go the first way, I will name it as
"maybe_delay_apply_change" or something that has an extra word.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Sorry for making you bothered by this.
At Tue, 24 Jan 2023 10:12:40 +0000, "Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com> wrote in
Couldn't we maintain an additional static variable "last_applied"
along with last_received?
It won't be easy to maintain the meaning of last_applied because there
are cases where we don't apply the change directly. For example, in
case of streaming xacts, we will just keep writing it to the file,
now, say, due to some reason, we have to send the feedback, then it
will not allow you to update the latest write locations. This would
then become different then what we are doing without the patch.
Another point to think about is that we also need to keep the variable
updated for keep-alive ('k') messages even though we don't apply
anything in that case. Still, other cases to consider are where we
have mix of streaming and non-streaming transactions.
I have tried to implement that, but it might be difficult because of a corner
case related with the initial data sync.
First of all, I have made last_applied to update when
* transactions are committed, prepared, or aborted
* apply worker receives keepalive message.
Yeah, I vaguely thought that it is enough that the update happens just
before existing send_feedback() calls. But it turned out to introduce
another unprincipledness..
I thought during the initial data sync, we must not update the last applied
triggered by keepalive messages, so following lines were added just after
updating last_received.
```
+ if (last_applied < end_lsn && AllTablesyncsReady())
+ last_applied = end_lsn;
```
Maybe, the name "last_applied" made you confused. As I mentioned in
another message, the variable points to the remote LSN of last
"processed" 'w/k' messages.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Tue, Jan 24, 2023 at 5:49 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
Attached the patch v20 that has incorporated all comments so far.
Kindly have a look at the attached patch.
Best Regards,
Takamichi Osumi
Thank You for patch. My previous comments are addressed. Tested it and
it looks good. Logging is also fine now.
Just one comment, in summary, we see :
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction commit for
min_apply_delay milliseconds.
Is it better to write "delay the transaction apply" instead of "delay
the transaction commit" just to be consistent as we do not actually
delay the commit for regular transactions.
thanks
Shveta
Hi,
On Wednesday, January 25, 2023 2:02 PM shveta malik <shveta.malik@gmail.com> wrote:
On Tue, Jan 24, 2023 at 5:49 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
Attached the patch v20 that has incorporated all comments so far.
Kindly have a look at the attached patch.
Thank You for patch. My previous comments are addressed. Tested it and it
looks good. Logging is also fine now.
Just one comment, in summary, we see :
If the subscription sets min_apply_delay parameter, the logical replication
worker will delay the transaction commit for min_apply_delay milliseconds.
Is it better to write "delay the transaction apply" instead of "delay the
transaction commit" just to be consistent as we do not actually delay the
commit for regular transactions.
Thank you for your review !
Agreed. Your description looks better.
Attached the updated patch v21.
Best Regards,
Takamichi Osumi
Attachments:
v21-0001-Time-delayed-logical-replication-subscriber.patch
From aeb37688ecb78212eec890b761b3d4fa1c56f6df Mon Sep 17 00:00:00 2001
From: Takamichi Osumi <osumi.takamichi@fujitsu.com>
Date: Wed, 25 Jan 2023 05:23:18 +0000
Subject: [PATCH v21] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_apply_delay'.
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction apply for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs only on WAL records for transaction begins. The main
reason is to avoid keeping a transaction open for a long time. Regular
and prepared transactions are covered. Streamed transactions are also
covered.
The combination of parallel streaming mode and min_apply_delay is not
allowed. This is because we start applying the transaction stream as
soon as the first change arrives without knowing the transaction's
prepare/commit time. This means we cannot calculate the underlying
network/decoding lag between publisher and subscriber, and so always
waiting for the full 'min_apply_delay' period might include
unnecessary delay.
The other possibility is to apply the delay at the end of the parallel
apply transaction but that would cause issues related to resource
bloat and locks being held for a long time.
Note that this feature doesn't interact with skip transaction feature.
The skip transaction feature applies to one transaction with a specific LSN.
So, even if the skipped transaction and non-skipped transaction come
consecutively in a very short time, regardless of the order of which comes
first, the time-delayed feature gets balanced by delayed application
for other transactions before and after the skipped transaction.
Author: Euler Taveira, Takamichi Osumi, Kuroda Hayato
Reviewed-by: Amit Kapila, Peter Smith, Vignesh C, Shveta Malik,
Kyotaro Horiguchi, Dilip Kumar, Melih Mutlu
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/config.sgml | 11 ++
doc/src/sgml/glossary.sgml | 14 ++
doc/src/sgml/logical-replication.sgml | 6 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 60 +++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 117 ++++++++++-
.../replication/logical/applyparallelworker.c | 3 +-
src/backend/replication/logical/worker.c | 181 ++++++++++++++---
src/bin/pg_dump/pg_dump.c | 15 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/replication/worker_internal.h | 2 +-
src/test/regress/expected/subscription.out | 181 ++++++++++-------
src/test/regress/sql/subscription.sql | 24 +++
src/test/subscription/meson.build | 1 +
src/test/subscription/t/032_apply_delay.pl | 187 ++++++++++++++++++
21 files changed, 724 insertions(+), 117 deletions(-)
create mode 100644 src/test/subscription/t/032_apply_delay.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..2b62beed59 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7873,6 +7873,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int8</type>
+ </para>
+ <para>
+ Total time spent delaying the application of changes, in milliseconds.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index f985afc009..ee91a1fc02 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4787,6 +4787,17 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
the <filename>postgresql.conf</filename> file or on the server
command line.
</para>
+ <para>
+ For time-delayed logical replication, the apply worker sends a feedback
+ message to the publisher every
+ <varname>wal_receiver_status_interval</varname> milliseconds. Make sure
+ to set <varname>wal_receiver_status_interval</varname> less than the
+ <varname>wal_sender_timeout</varname> on the publisher, otherwise, the
+ <literal>walsender</literal> will repeatedly terminate due to timeout
+ error. If <varname>wal_receiver_status_interval</varname> is set to
+ zero, the apply worker doesn't send any feedback messages during the
+ <literal>min_apply_delay</literal> period.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 7c01a541fe..6ed6fa5853 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1729,6 +1729,20 @@
</glossdef>
</glossentry>
+ <glossentry id="glossary-time-delayed-replication">
+ <glossterm>Time-delayed replication</glossterm>
+ <glossdef>
+ <para>
+ Replication setup that applies time-delayed copy of the data.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="guc-recovery-min-apply-delay"/> for physical replication
+ and <xref linkend="sql-createsubscription"/> for logical replication.
+ </para>
+ </glossdef>
+ </glossentry>
+
<glossentry id="glossary-toast">
<glossterm>TOAST</glossterm>
<glossdef>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 1bd5660c87..d8ae93f88d 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,12 @@
target table.
</para>
+ <para>
+ A logical replication subscription can delay the application of changes by
+ specifying the <literal>min_apply_delay</literal> subscription parameter.
+ See <xref linkend="sql-createsubscription"/> for details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index ad93553a1d..1c6e9dd2d1 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index eba72c6af6..c4a615ee5c 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,48 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. This
+ parameter allows the user to delay the application of changes by a
+ given time period. If the value is specified without units, it is
+ taken as milliseconds. The default is zero (no delay). See
+ <xref linkend="config-setting-names-values"/> for details on the
+ available valid time units.
+ </para>
+ <para>
+ The delay applies only to transactions that begin after all
+ initial table synchronization has finished. The delay is calculated
+ as the difference between the WAL timestamp as written on the
+ publisher and the current time on the subscriber. Any overhead of
+ time spent in logical decoding and in transferring the transaction
+ may reduce the actual wait time. It is also possible that the overhead
+ already exceeds the requested <literal>min_apply_delay</literal> value,
+ in which case no additional wait is necessary. If the system clocks
+ on publisher and subscriber are not synchronized, changes may be
+ applied earlier than expected, but this is not a major issue
+ because this parameter is typically much larger than the time
+ deviations between servers. Note that if this parameter is set to a
+ long delay, the replication will stop if the replication slot falls
+ behind the current LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication can mean there is a much longer time
+ between making a change on the publisher, and that change being
+ committed on the subscriber. This can impact the performance of
+ synchronous replication. See the <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -413,6 +454,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_apply_delay</literal> parameter is not allowed when
+ streaming in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
@@ -472,6 +518,18 @@ CREATE SUBSCRIPTION mysub
PUBLICATION insert_only
WITH (enabled = false);
</programlisting></para>
+
+ <para>
+ Create a subscription to a remote server that replicates tables in
+ the <literal>mypub</literal> publication and starts replicating immediately
+ on commit. Pre-existing data is not copied. The application of changes is
+ delayed by 4 hours.
+<programlisting>
+CREATE SUBSCRIPTION mysub
+ CONNECTION 'host=192.0.2.4 port=5432 user=foo dbname=foodb'
+ PUBLICATION mypub
+ WITH (copy_data = false, min_apply_delay = '4h');
+</programlisting></para>
</refsect1>
<refsect1>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..c767cc1c3a 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -64,6 +64,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->oid = subid;
sub->dbid = subform->subdbid;
sub->skiplsn = subform->subskiplsn;
+ sub->minapplydelay = subform->subminapplydelay;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
sub->enabled = subform->subenabled;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 8608e3fa5b..317c2010cb 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1299,9 +1299,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 464db6d247..b4d075c931 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -100,7 +102,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int defGetMinApplyDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ opts->min_apply_delay = defGetMinApplyDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +417,31 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed. This is because we start applying the transaction stream as
+ * soon as the first change arrives without knowing the transaction's
+ * prepare/commit time. This means we cannot calculate the underlying
+ * network/decoding lag between publisher and subscriber, and so always
+ * waiting for the full 'min_apply_delay' period might include unnecessary
+ * delay.
+ *
+ * The other possibility is to apply the delay at the end of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0 && opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+
+ /*
+ * - translator: the first %s is a string of the form "option > value"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
}
/*
@@ -560,7 +598,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -625,6 +664,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int64GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1054,7 +1094,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1138,19 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options for details of the reason.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
+ if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay > 0) ||
+ (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable parallel streaming mode for subscription with %s",
+ "min_apply_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1164,25 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed.
+ */
+ if (opts.min_apply_delay > 0)
+ if ((IsSet(opts.specified_opts, SUBOPT_STREAMING) && opts.streaming == LOGICALREP_STREAM_PARALLEL) ||
+ (!IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s for subscription in %s mode",
+ "min_apply_delay", "streaming = parallel"));
+
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int64GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2195,3 +2267,42 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+/*
+ * Extract the min_apply_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_apply_delay
+ * accepts the same parameter format as recovery_min_apply_delay.
+ */
+static int
+defGetMinApplyDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse the given string as a parameter whose base unit is milliseconds.
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_apply_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check lower bound. parse_int() has already confirmed that the result
+ * is less than or equal to INT_MAX.
+ */
+ if (result < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_apply_delay",
+ 0, INT_MAX)));
+
+ return result;
+}
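For readers following the ALTER SUBSCRIPTION hunks above, the two symmetric checks seem to reduce to one rule: take the value given in the command if the option was specified, otherwise the value already stored for the subscription, and reject the combination when the resulting delay is positive and the resulting streaming mode is parallel. Below is a minimal standalone sketch of that rule. The function name and its parameters are hypothetical and are not part of the patch; plain booleans stand in for the SubOpts bits and the LOGICALREP_STREAM_PARALLEL comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): returns true when the settings
 * that would be in effect after the command combine a non-zero
 * min_apply_delay with parallel streaming.
 *
 * delay_specified / streaming_specified say whether the command itself set
 * the option; cur_* are the values already stored for the subscription.
 */
static bool
delay_conflicts_with_parallel(bool delay_specified, int64_t new_delay_ms,
                              int64_t cur_delay_ms,
                              bool streaming_specified, bool new_parallel,
                              bool cur_parallel)
{
    int64_t effective_delay;
    bool    effective_parallel;

    /* Both delays are assumed to have passed the lower-bound check. */
    assert(new_delay_ms >= 0 && cur_delay_ms >= 0);

    effective_delay = delay_specified ? new_delay_ms : cur_delay_ms;
    effective_parallel = streaming_specified ? new_parallel : cur_parallel;

    return effective_delay > 0 && effective_parallel;
}
```

One difference from the patch is worth noting: when neither option is specified the patch performs no check at all, whereas this sketch would evaluate the stored values. That only matters if the catalog already holds a conflicting pair, which the checks are meant to prevent.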
diff --git a/src/backend/replication/logical/applyparallelworker.c b/src/backend/replication/logical/applyparallelworker.c
index 3579e704fe..7302bce7a0 100644
--- a/src/backend/replication/logical/applyparallelworker.c
+++ b/src/backend/replication/logical/applyparallelworker.c
@@ -704,7 +704,8 @@ pa_process_spooled_messages_if_required(void)
{
apply_spooled_messages(&MyParallelShared->fileset,
MyParallelShared->xid,
- InvalidXLogRecPtr);
+ InvalidXLogRecPtr,
+ 0);
pa_set_fileset_state(MyParallelShared, FS_EMPTY);
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index cfb2ab6248..d040efbaaf 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -319,6 +319,17 @@ static List *on_commit_wakeup_workers_subids = NIL;
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
+/*
+ * In order to avoid walsender timeout for time-delayed logical replication the
+ * apply worker keeps sending feedback messages during the delay period.
+ * Meanwhile, the feature delays the apply before the start of the
+ * transaction and thus we don't write WAL records for the suspended changes
+ * during the wait. When the apply worker sends a feedback message during the
+ * delay, it must not overwrite the flushed and applied LSN positions with
+ * the latest received LSN. See send_feedback() for details.
+ */
+static XLogRecPtr last_received = InvalidXLogRecPtr;
+
/* fields valid only when processing streamed transaction */
static bool in_streamed_transaction = false;
@@ -389,7 +400,8 @@ static void stream_write_change(char action, StringInfo s);
static void stream_open_and_write_change(TransactionId xid, char action, StringInfo s);
static void stream_close_file(void);
-static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply);
+static void send_feedback(bool force, bool requestReply,
+ bool has_unprocessed_change);
static void DisableSubscriptionAndExit(void);
@@ -999,6 +1011,108 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When min_apply_delay parameter is set on the subscriber, we wait long enough
+ * to make sure a transaction is applied at least that period behind the
+ * publisher.
+ *
+ * While physical replication applies the delay at commit time, this
+ * feature applies the delay before the next transaction starts. This is
+ * mainly because keeping a transaction that has performed write operations
+ * open for a long time causes problems such as bloat and long-held
+ * locks.
+ *
+ * The min_apply_delay parameter will take effect only after all tables are in
+ * READY state.
+ *
+ * xid is the transaction id where we apply the delay.
+ *
+ * finish_ts is the commit/prepare time of both regular (non-streamed) and
+ * streamed transactions. Unlike the regular (non-streamed) cases, the delay
+ * is applied in a STREAM COMMIT/STREAM PREPARE message for streamed
+ * transactions. The STREAM START message does not contain a commit/prepare
+ * time (it will be available when the in-progress transaction finishes).
+ * Hence, it's not appropriate to apply a delay at the STREAM START time.
+ */
+static void
+maybe_apply_delay(TransactionId xid, TimestampTz finish_ts)
+{
+ Assert(finish_ts > 0);
+
+ /* Nothing to do if no delay set */
+ if (!MySubscription->minapplydelay)
+ return;
+
+ /*
+ * The min_apply_delay parameter is ignored until all tablesync workers
+ * have reached READY state. This is because if we allowed the delay
+ * during the catchup phase, then once we reached the limit of tablesync
+ * workers it would impose a delay for each subsequent worker. That would
+ * cause initial table synchronization completion to take a long time.
+ */
+ if (!AllTablesyncsReady())
+ return;
+
+ /* Apply the delay by the latch mechanism */
+ while (true)
+ {
+ long diffms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* This might change wal_receiver_status_interval */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /*
+ * Before calculating the time duration, reload the catalog if needed.
+ */
+ if (!in_remote_transaction && !in_streamed_transaction)
+ {
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+ }
+
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(),
+ TimestampTzPlusMilliseconds(finish_ts, MySubscription->minapplydelay));
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "time-delayed replication for txid %u, min_apply_delay = " INT64_FORMAT " ms, remaining wait time: %ld ms",
+ xid, MySubscription->minapplydelay, diffms);
+
+ /*
+ * Call send_feedback() to prevent the publisher from exiting by
+ * timeout during the delay, when wal_receiver_status_interval is
+ * available.
+ */
+ if (wal_receiver_status_interval > 0 &&
+ diffms > wal_receiver_status_interval * 1000L)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ wal_receiver_status_interval * 1000L,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(true, false, true);
+ }
+ else
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ }
+}
+
/*
* Handle BEGIN message.
*/
@@ -1013,6 +1127,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ maybe_apply_delay(begin_data.xid, begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -1070,6 +1187,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ maybe_apply_delay(begin_data.xid, begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1317,7 +1437,8 @@ apply_handle_stream_prepare(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset,
- prepare_data.xid, prepare_data.prepare_lsn);
+ prepare_data.xid, prepare_data.prepare_lsn,
+ prepare_data.prepare_time);
/* Mark the transaction as prepared. */
apply_handle_prepare_internal(&prepare_data);
@@ -2011,10 +2132,13 @@ ensure_last_message(FileSet *stream_fileset, TransactionId xid, int fileno,
/*
* Common spoolfile processing.
+ *
+ * The commit/prepare time (finish_ts) for streamed transactions is required
+ * for time-delayed logical replication.
*/
void
apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn)
+ XLogRecPtr lsn, TimestampTz finish_ts)
{
StringInfoData s2;
int nchanges;
@@ -2025,6 +2149,10 @@ apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
int fileno;
off_t offset;
+ /* Should we delay the current transaction? */
+ if (finish_ts)
+ maybe_apply_delay(xid, finish_ts);
+
if (!am_parallel_apply_worker())
maybe_start_skipping_changes(lsn);
@@ -2174,7 +2302,7 @@ apply_handle_stream_commit(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset, xid,
- commit_data.commit_lsn);
+ commit_data.commit_lsn, commit_data.committime);
apply_handle_commit_internal(&commit_data);
@@ -3447,7 +3575,7 @@ UpdateWorkerStats(XLogRecPtr last_lsn, TimestampTz send_time, bool reply)
* Apply main loop.
*/
static void
-LogicalRepApplyLoop(XLogRecPtr last_received)
+LogicalRepApplyLoop(void)
{
TimestampTz last_recv_timestamp = GetCurrentTimestamp();
bool ping_sent = false;
@@ -3568,7 +3696,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
if (last_received < end_lsn)
last_received = end_lsn;
- send_feedback(last_received, reply_requested, false);
+ send_feedback(reply_requested, false, false);
UpdateWorkerStats(last_received, timestamp, true);
}
/* other message types are purposefully ignored */
@@ -3581,7 +3709,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
/* confirm all writes so far */
- send_feedback(last_received, false, false);
+ send_feedback(false, false, false);
if (!in_remote_transaction && !in_streamed_transaction)
{
@@ -3678,7 +3806,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
}
- send_feedback(last_received, requestReply, requestReply);
+ send_feedback(requestReply, requestReply, false);
/*
* Force reporting to ensure long idle periods don't lead to
@@ -3703,17 +3831,13 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
/*
* Send a Standby Status Update message to server.
- *
- * 'recvpos' is the latest LSN we've received data to, force is set if we need
- * to send a response to avoid timeouts.
*/
static void
-send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
+send_feedback(bool force, bool requestReply, bool has_unprocessed_change)
{
static StringInfo reply_message = NULL;
static TimestampTz send_time = 0;
- static XLogRecPtr last_recvpos = InvalidXLogRecPtr;
static XLogRecPtr last_writepos = InvalidXLogRecPtr;
static XLogRecPtr last_flushpos = InvalidXLogRecPtr;
@@ -3729,18 +3853,21 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
if (!force && wal_receiver_status_interval <= 0)
return;
- /* It's legal to not pass a recvpos */
- if (recvpos < last_recvpos)
- recvpos = last_recvpos;
-
get_flush_position(&writepos, &flushpos, &have_pending_txes);
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * If the logical replication subscription has unprocessed changes, do
+ * not inform the publisher that the latest received LSN is already
+ * applied and flushed; otherwise, the publisher will make a wrong
+ * assumption about the logical replication progress. Instead, just
+ * send a feedback message to avoid a replication timeout during the
+ * delay.
*/
- if (!have_pending_txes)
- flushpos = writepos = recvpos;
+ if (!have_pending_txes && !has_unprocessed_change)
+ flushpos = writepos = last_received;
if (writepos < last_writepos)
writepos = last_writepos;
@@ -3770,23 +3897,22 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
resetStringInfo(reply_message);
pq_sendbyte(reply_message, 'r');
- pq_sendint64(reply_message, recvpos); /* write */
+ pq_sendint64(reply_message, last_received); /* write */
pq_sendint64(reply_message, flushpos); /* flush */
pq_sendint64(reply_message, writepos); /* apply */
pq_sendint64(reply_message, now); /* sendTime */
pq_sendbyte(reply_message, requestReply); /* replyRequested */
- elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X",
+ elog(DEBUG2, "sending feedback (force %d, has_unprocessed_change %s) to recv %X/%X, write %X/%X, flush %X/%X",
force,
- LSN_FORMAT_ARGS(recvpos),
+ has_unprocessed_change ? "yes" : "no",
+ LSN_FORMAT_ARGS(last_received),
LSN_FORMAT_ARGS(writepos),
LSN_FORMAT_ARGS(flushpos));
walrcv_send(LogRepWorkerWalRcvConn,
reply_message->data, reply_message->len);
- if (recvpos > last_recvpos)
- last_recvpos = recvpos;
if (writepos > last_writepos)
last_writepos = writepos;
if (flushpos > last_flushpos)
@@ -4367,11 +4493,11 @@ start_table_sync(XLogRecPtr *origin_startpos, char **myslotname)
* of system resource error and are not repeatable.
*/
static void
-start_apply(XLogRecPtr origin_startpos)
+start_apply(void)
{
PG_TRY();
{
- LogicalRepApplyLoop(origin_startpos);
+ LogicalRepApplyLoop();
}
PG_CATCH();
{
@@ -4661,7 +4787,8 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- start_apply(origin_startpos);
+ last_received = origin_startpos;
+ start_apply();
proc_exit(0);
}
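The timing arithmetic in maybe_apply_delay() can be checked in isolation. The sketch below is not code from the patch: it uses plain millisecond integers in place of TimestampTz, and both helper names are hypothetical. It shows the two decisions the loop makes on each iteration, namely how much of the delay is still outstanding and how long the next sleep should be so that a feedback message can go out once per wal_receiver_status_interval.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: milliseconds still to wait before a transaction that
 * finished on the publisher at finish_ms may be applied. Returns 0 when the
 * decoding/transfer overhead has already used up the requested delay.
 */
static int64_t
remaining_delay_ms(int64_t now_ms, int64_t finish_ms,
                   int64_t min_apply_delay_ms)
{
    int64_t due_ms;

    assert(min_apply_delay_ms >= 0);

    due_ms = finish_ms + min_apply_delay_ms;
    return (due_ms > now_ms) ? due_ms - now_ms : 0;
}

/*
 * Hypothetical helper: length of the next sleep. When the status interval
 * (in seconds) is enabled and shorter than the remaining wait, sleep only
 * for one interval so the caller can send feedback and loop again.
 */
static int64_t
next_sleep_ms(int64_t remaining_ms, int status_interval_s)
{
    int64_t interval_ms = (int64_t) status_interval_s * 1000;

    if (status_interval_s > 0 && remaining_ms > interval_ms)
        return interval_ms;

    return remaining_ms;
}
```

With a 10 s status interval and 25 s left to wait, this gives sleeps of 10 s, 10 s and 5 s, with a feedback message after each of the first two. With the interval set to 0, the worker would sleep for the full 25 s without sending feedback.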
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..c0f69cb43b 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4546,9 +4547,13 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n"
+ " 0 AS subminapplydelay\n",
+ LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4581,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4612,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminapplydelay =
+ strtoi64(PQgetvalue(res, i, i_subminapplydelay), NULL, 10);
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4687,6 +4695,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '" INT64_FORMAT " ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e7cbd8d7ed..e2525f70ab 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int64 subminapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..81d4607a1c 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min apply delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e1882eaea..e8b9a43a47 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..e06f35c037 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -70,6 +70,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
XLogRecPtr subskiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 subminapplydelay; /* Replication apply delay (ms) */
+
NameData subname; /* Name of the subscription */
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
@@ -120,6 +122,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 minapplydelay; /* Replication apply delay (ms) */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
bool enabled; /* Indicates if the subscription is enabled */
diff --git a/src/include/replication/worker_internal.h b/src/include/replication/worker_internal.h
index dc87a4edd1..3dc09d1a4c 100644
--- a/src/include/replication/worker_internal.h
+++ b/src/include/replication/worker_internal.h
@@ -255,7 +255,7 @@ extern void stream_stop_internal(TransactionId xid);
/* Common streaming function to apply all the spooled messages */
extern void apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn);
+ XLogRecPtr lsn, TimestampTz finish_ts);
extern void apply_dispatch(StringInfo s);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 3f99b14394..5ccce39986 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -143,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -163,10 +163,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -175,10 +175,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -210,10 +210,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -247,19 +247,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -271,27 +271,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -306,10 +306,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -324,10 +324,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -363,10 +363,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -375,10 +375,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,10 +388,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -404,20 +404,57 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid value for parameter "min_apply_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_apply_delay" (0 .. 2147483647)
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+ERROR: min_apply_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot enable parallel streaming mode for subscription with min_apply_delay
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ERROR: cannot enable min_apply_delay for subscription in streaming = parallel mode
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7281f5fee2..7317b140f5 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -286,6 +286,30 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+\dRs+
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/meson.build b/src/test/subscription/meson.build
index 3db0fdfd96..a186876eb4 100644
--- a/src/test/subscription/meson.build
+++ b/src/test/subscription/meson.build
@@ -38,6 +38,7 @@ tests += {
't/029_on_error.pl',
't/030_origin.pl',
't/031_column_list.pl',
+ 't/032_apply_delay.pl',
't/100_bugs.pl',
],
},
diff --git a/src/test/subscription/t/032_apply_delay.pl b/src/test/subscription/t/032_apply_delay.pl
new file mode 100644
index 0000000000..7d499ab664
--- /dev/null
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,187 @@
+
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test replication apply delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Confirm from the server log that time-delayed replication took effect, using
+# the message the apply worker emits when it delays applying a transaction.
+# Also verify that the worker's remaining wait time is larger than the expected
+# value, which lets the caller detect changes to min_apply_delay.
+sub check_apply_delay_log
+{
+ my ($node_subscriber, $offset, $expected) = @_;
+
+ my $log_location = $node_subscriber->wait_for_log(
+ qr/time-delayed replication for txid (\d+), min_apply_delay = (\d+) ms, remaining wait time: (\d+) ms/,
+ $offset);
+
+ cmp_ok($log_location, '>', $offset,
+ "logfile contains triggered logical replication apply delay");
+
+ # Get the remaining wait time from the server log
+ my $contents = slurp_file($node_subscriber->logfile, $offset);
+ $contents =~
+ qr/time-delayed replication for txid (\d+), min_apply_delay = (\d+) ms, remaining wait time: (\d+) ms/
+ or die "could not get the apply worker wait time";
+ my $logged_delay = $3;
+
+ # Is it larger than expected?
+ cmp_ok($logged_delay, '>', $expected,
+ "The apply worker wait time has expected duration");
+}
+
+# Compare the insertion time on the publisher with the apply time on the
+# subscriber to confirm the change was applied after the expected delay. Each
+# node generates the time automatically and stores it in table column 'c'.
+sub check_apply_delay_time
+{
+ my ($node_publisher, $node_subscriber, $primary_key, $expected_diffs) =
+ @_;
+
+ my $inserted_time_on_pub = $node_publisher->safe_psql(
+ 'postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ my $inserted_time_on_sub = $node_subscriber->safe_psql(
+ 'postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ cmp_ok(
+ $inserted_time_on_sub - $inserted_time_on_pub,
+ '>',
+ $expected_diffs,
+ "The tuple on the subscriber was modified later than the publisher");
+}
+
+# Create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf('postgresql.conf', "log_min_messages = debug2");
+$node_subscriber->start;
+
+# Setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b varchar, c timestamptz (6) DEFAULT now())"
+);
+
+# Setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b text, c timestamptz (6) DEFAULT now())"
+);
+
+# Setup logical replication
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+
+# The column 'c' must not be published because we want to compare the time
+# difference.
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE test_tab (a, b)");
+
+my $appname = 'tap_sub';
+
+# Create a subscription that applies the transaction after 1 second delay
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (copy_data = off, min_apply_delay = '1s', streaming = 'on')"
+);
+
+# Check log starting now for logical replication apply delay
+my $offset = -s $node_subscriber->logfile;
+
+# New row to trigger apply delay
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (1, 'foo')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (2, 'bar')");
+
+$node_publisher->wait_for_catchup($appname);
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(2|1|2), 'check if the new rows were applied to subscriber');
+
+# Make sure the apply worker knows to wait for more than 500ms
+check_apply_delay_log($node_subscriber, $offset, "500");
+
+# Verify that the subscriber lags the publisher by at least 1 second
+check_apply_delay_time($node_publisher, $node_subscriber, '2', '1');
+
+# Setup for streaming case
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_mode = immediate');
+$node_publisher->reload;
+
+# Run a query to make sure that the reload has taken effect.
+$node_publisher->safe_psql('postgres', q{SELECT 1});
+
+# Check log starting now for logical replication apply delay
+$offset = -s $node_subscriber->logfile;
+
+# Test streamed transaction by insert
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(3, 5) s(i);"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(5|1|5), 'check if the new rows were applied to subscriber');
+
+# Make sure the apply worker knows to wait for more than 500ms
+check_apply_delay_log($node_subscriber, $offset, "0.5");
+
+# Verify that the subscriber lags the publisher by at least 1 second
+check_apply_delay_time($node_publisher, $node_subscriber, '5', '1');
+
+# Test whether ALTER SUBSCRIPTION changes the delayed time of the apply worker
+# (1 day 5 minutes). Note that the extra 5 minutes are to account for any
+# decoding/network overhead.
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86700000)");
+
+# Check log starting now for logical replication apply delay
+$offset = -s $node_subscriber->logfile;
+
+# New row to trigger apply delay
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (0, 'foobar')");
+
+# Make sure the apply worker knows to wait for more than 1 day
+check_apply_delay_log($node_subscriber, $offset, "86400000");
+
+# Disable subscription and the worker should die immediately
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub DISABLE;");
+
+# Wait until worker dies
+my $sub_query =
+ "SELECT count(1) = 0 FROM pg_stat_subscription WHERE subname = 'tap_sub' AND pid IS NOT NULL;";
+$node_subscriber->poll_query_until('postgres', $sub_query)
+ or die "Timed out while waiting for subscriber to die";
+
+# Confirm disabling the subscription by ALTER SUBSCRIPTION DISABLE did not cause
+# the delayed transaction to be applied.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab WHERE a = 0;");
+is($result, qq(0), "check the delayed transaction was not applied");
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
--
2.30.0
Hi, Horiguchi-san
Thank you for checking the patch !
On Wednesday, January 25, 2023 10:17 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
In short, I'd like to propose renaming the parameter in_delayed_apply of
send_feedback to "has_unprocessed_change".
At Tue, 24 Jan 2023 12:27:58 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
send_feedback():
+ * If the subscriber side apply is delayed (because of time-delayed
+ * replication) then do not tell the publisher that the received latest
+ * LSN is already applied and flushed, otherwise, it leads to the
+ * publisher side making a wrong assumption of logical replication
+ * progress. Instead, we just send a feedback message to avoid a publisher
+ * timeout during the delay.
 */
- if (!have_pending_txes)
+ if (!have_pending_txes && !in_delayed_apply)
 flushpos = writepos = recvpos;
Honestly I don't like this wart. The reason for this is the function
assumes recvpos = applypos but we actually call it while holding
unapplied changes, that is, applypos < recvpos.
Couldn't we maintain an additional static variable "last_applied"
along with last_received?
It won't be easy to maintain the meaning of last_applied because there
are cases where we don't apply the change directly. For example, in
case of streaming xacts, we will just keep writing it to the file,
now, say, due to some reason, we have to send the feedback, then it
will not allow you to update the latest write locations. This would
then become different than what we are doing without the patch.
Another point to think about is that we also need to keep the variable
updated for keep-alive ('k') messages even though we don't apply
anything in that case. Still, other cases to consider are where we
have mix of streaming and non-streaming transactions.
Yeah. Even though I named it as "last_applied", its objective is to have
get_flush_position returning the correct have_pending_txes without a hint
from callers, that is, "let g_f_position know if store_flush_position has been
called with the last received data".
Anyway I tried that but didn't find a clean and simple way. However, while on it,
I realized what the code made me confused.
+static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply,
+ bool in_delayed_apply);
The name "in_delayed_apply" doesn't give me an idea of what the
function should do for it. If it is named "has_unprocessed_change", I think it
makes sense that send_feedback should think there may be an outstanding
transaction that is not known to the function.
So, my conclusion here is I'd like to propose changing the parameter name to
"has_unapplied_change".
Renamed the variable name to "has_unprocessed_change".
Also, removed the first argument of the send_feedback() which isn't necessary now.
Kindly have a look at the patch shared in [1].
[1]: /messages/by-id/TYCPR01MB8373193B4331B7EB6276F682EDCE9@TYCPR01MB8373.jpnprd01.prod.outlook.com
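To make the condition under discussion concrete, here is a minimal standalone sketch in C of how the reported positions could be chosen. It is not the patch's code: LsnSketch, FeedbackPositions and choose_feedback_positions are hypothetical names, and the last known write/flush positions are passed in as plain arguments here, which is a simplification of whatever get_flush_position() actually provides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit WAL position (not PostgreSQL's type). */
typedef uint64_t LsnSketch;

typedef struct FeedbackPositions
{
    LsnSketch write_pos;    /* position reported as written */
    LsnSketch flush_pos;    /* position reported as flushed/applied */
} FeedbackPositions;

/*
 * Choose the positions to report in a feedback message.
 *
 * Only when nothing is pending and no received change is still waiting
 * (for example, held back by min_apply_delay) may the latest received
 * position be reported as written and flushed. Otherwise the previously
 * confirmed positions are reported unchanged, so the message serves only
 * to keep the connection alive.
 */
FeedbackPositions
choose_feedback_positions(LsnSketch recv_pos,
                          LsnSketch last_write_pos,
                          LsnSketch last_flush_pos,
                          bool have_pending_txes,
                          bool has_unprocessed_change)
{
    FeedbackPositions result = {last_write_pos, last_flush_pos};

    /* Assumed precondition: confirmed positions never exceed recv_pos. */
    assert(last_write_pos <= recv_pos);
    assert(last_flush_pos <= recv_pos);

    if (!have_pending_txes && !has_unprocessed_change)
    {
        result.write_pos = recv_pos;
        result.flush_pos = recv_pos;
    }

    return result;
}
```

With has_unprocessed_change set, the sketch keeps reporting the older positions, which is the behavior the quoted comment asks for during the delay.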
Best Regards,
Takamichi Osumi
At Tue, 24 Jan 2023 12:19:04 +0000, "Takamichi Osumi (Fujitsu)" <osumi.takamichi@fujitsu.com> wrote in
Attached the patch v20 that has incorporated all comments so far.
Thanks! I looked through the documentation part.
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int8</type>
+ </para>
+ <para>
+ Total time spent delaying the application of changes, in milliseconds.
+ </para></entry>
I was confused because it reads as this column shows the summarized
actual waiting time caused by min_apply_delay. IIUC actually it shows
the min_apply_delay setting for the subscription. Thus shouldn't it be
something like this?
"The minimum amount of time to delay applying changes, in milliseconds"
And it might be better to mention the corresponding subscription parameter.
+ error. If <varname>wal_receiver_status_interval</varname> is set to
+ zero, the apply worker doesn't send any feedback messages during the
+ <literal>min_apply_delay</literal> period.
I took a bit longer time to understand what this sentence means. I'd
like to suggest something like the following.
"Since no status-update messages are sent while delaying, note that
wal_receiver_status_interval is the only source of keepalive messages
during that period."
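The relationship between the two settings mentioned in the quoted paragraph can be sketched as a small standalone C check. This is only an illustration under stated assumptions: FeedbackRisk and classify_feedback_settings are hypothetical names, both values are assumed to be already normalized to milliseconds, and a wal_sender_timeout of zero is treated as "timeout disabled".

```c
#include <assert.h>

/* Hypothetical classification of a settings pair (not a PostgreSQL type). */
typedef enum FeedbackRisk
{
    FEEDBACK_OK,        /* feedback arrives before the sender times out */
    FEEDBACK_DISABLED,  /* status interval is zero: no feedback while delaying */
    FEEDBACK_TOO_SLOW   /* feedback interval is not below the sender timeout */
} FeedbackRisk;

/*
 * Classify a (wal_receiver_status_interval, wal_sender_timeout) pair,
 * both given in milliseconds, for a subscription that delays apply.
 */
FeedbackRisk
classify_feedback_settings(int status_interval_ms, int sender_timeout_ms)
{
    assert(status_interval_ms >= 0);
    assert(sender_timeout_ms >= 0);

    /* Assumption: zero disables the publisher-side timeout entirely. */
    if (sender_timeout_ms == 0)
        return FEEDBACK_OK;

    if (status_interval_ms == 0)
        return FEEDBACK_DISABLED;

    if (status_interval_ms >= sender_timeout_ms)
        return FEEDBACK_TOO_SLOW;

    return FEEDBACK_OK;
}
```

For instance, a 10-second interval against a 60-second timeout is classified as FEEDBACK_OK, while an interval of zero with a non-zero timeout is FEEDBACK_DISABLED, the case the proposed sentence warns about.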
+ <para>
+ A logical replication subscription can delay the application of changes by
+ specifying the <literal>min_apply_delay</literal> subscription parameter.
+ See <xref linkend="sql-createsubscription"/> for details.
+ </para>
I'm not sure "logical replication subscription" is a common term.
Doesn't just "subscription" mean the same, especially in that context?
(Note that 31.2 starts with "A subscription is the downstream..").
+ Any delay occurs only on WAL records for transaction begins after all
+ initial table synchronization has finished. The delay is calculated
There is no "transaction begin" WAL records. Maybe it is "logical
replication transaction begin message". The timestamp is of "commit
time". (I took "transaction begins" as a noun, but that might be
wrong..)
+ may reduce the actual wait time. It is also possible that the overhead
+ already exceeds the requested <literal>min_apply_delay</literal> value,
+ in which case no additional wait is necessary. If the system clocks
I'm not sure it is right to say "necessary" here. IMHO it might be
better be "in which case no delay is applied".
+ in which case no additional wait is necessary. If the system clocks
+ on publisher and subscriber are not synchronized, this may lead to
+ apply changes earlier than expected, but this is not a major issue
+ because this parameter is typically much larger than the time
+ deviations between servers. Note that if this parameter is set to a
This doesn't seem to fit our documentation. It is not our business
whether a certain amount deviation is critical or not. How about
something like the following?
"Note that the delay is measured between the timestamp assigned by
publisher and the system clock on subscriber. You need to manage the
system clocks to be in sync so that the delay works properly."
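As a rough illustration of the calculation being discussed (publisher timestamp versus subscriber clock), here is a minimal standalone C sketch. It is not taken from the patch: remaining_apply_delay_ms and its parameters are hypothetical, and all times are assumed to be milliseconds on a common epoch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return how long the subscriber would still have to wait before applying
 * a transaction, or 0 if the requested delay has already elapsed.
 *
 * publisher_commit_ts_ms: timestamp assigned on the publisher
 * subscriber_now_ms:      current time according to the subscriber's clock
 * min_apply_delay_ms:     requested minimum delay
 */
int64_t
remaining_apply_delay_ms(int64_t publisher_commit_ts_ms,
                         int64_t subscriber_now_ms,
                         int64_t min_apply_delay_ms)
{
    int64_t elapsed_ms;
    int64_t remaining_ms;

    assert(min_apply_delay_ms >= 0);

    /* Decoding and transfer overhead is already part of the elapsed time. */
    elapsed_ms = subscriber_now_ms - publisher_commit_ts_ms;
    remaining_ms = min_apply_delay_ms - elapsed_ms;

    /* If the overhead already exceeds the requested delay, do not wait. */
    return remaining_ms > 0 ? remaining_ms : 0;
}
```

Because the two timestamps come from different machines, a subscriber clock that runs ahead makes the elapsed time look larger and shortens the wait, and a clock that runs behind lengthens it.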
+ Delaying the replication can mean there is a much longer time
+ between making a change on the publisher, and that change being
+ committed on the subscriber. This can impact the performance of
+ synchronous replication. See <xref linkend="guc-synchronous-commit"/>
+ parameter.
Do we need the "can" in "Delaying the replication can mean"? If we
want to say, it might be "Delaying the replication means there can be
a much longer..."?
+ <para>
+ Create a subscription to a remote server that replicates tables in
+ the <literal>mypub</literal> publication and starts replicating immediately
+ on commit. Pre-existing data is not copied. The application of changes is
+ delayed by 4 hours.
+<programlisting>
+CREATE SUBSCRIPTION mysub
+ CONNECTION 'host=192.0.2.4 port=5432 user=foo dbname=foodb'
+ PUBLICATION mypub
+ WITH (copy_data = false, min_apply_delay = '4h');
+</programlisting></para>
I'm not sure we need this additional example. We already have two
examples one of which differs from the above only by actual values for
PUBLICATION and WITH clauses.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Wed, Jan 25, 2023 at 11:23 AM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
Thank you for checking the patch !
On Wednesday, January 25, 2023 10:17 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
In short, I'd like to propose renaming the parameter in_delayed_apply of
send_feedback to "has_unprocessed_change".
At Tue, 24 Jan 2023 12:27:58 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
send_feedback():
+ * If the subscriber side apply is delayed (because of time-delayed
+ * replication) then do not tell the publisher that the received latest
+ * LSN is already applied and flushed, otherwise, it leads to the
+ * publisher side making a wrong assumption of logical replication
+ * progress. Instead, we just send a feedback message to avoid a publisher
+ * timeout during the delay.
 */
- if (!have_pending_txes)
+ if (!have_pending_txes && !in_delayed_apply)
 flushpos = writepos = recvpos;
Honestly I don't like this wart. The reason for this is the function
assumes recvpos = applypos but we actually call it while holding
unapplied changes, that is, applypos < recvpos.
Couldn't we maintain an additional static variable "last_applied"
along with last_received?
It won't be easy to maintain the meaning of last_applied because there
are cases where we don't apply the change directly. For example, in
case of streaming xacts, we will just keep writing it to the file,
now, say, due to some reason, we have to send the feedback, then it
will not allow you to update the latest write locations. This would
then become different than what we are doing without the patch.
Another point to think about is that we also need to keep the variable
updated for keep-alive ('k') messages even though we don't apply
anything in that case. Still, other cases to consider are where we
have mix of streaming and non-streaming transactions.
Yeah. Even though I named it as "last_applied", its objective is to have
get_flush_position returning the correct have_pending_txes without a hint
from callers, that is, "let g_f_position know if store_flush_position has been
called with the last received data".
Anyway I tried that but didn't find a clean and simple way. However, while on it,
I realized what the code made me confused.
+static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply,
+ bool in_delayed_apply);
The name "in_delayed_apply" doesn't give me an idea of what the
function should do for it. If it is named "has_unprocessed_change", I think it
makes sense that send_feedback should think there may be an outstanding
transaction that is not known to the function.
So, my conclusion here is I'd like to propose changing the parameter name to
"has_unapplied_change".
Renamed the variable name to "has_unprocessed_change".
Also, removed the first argument of the send_feedback() which isn't necessary now.
Why did you remove the first argument of the send_feedback() when that
is not added by this patch? If you really think that is an
improvement, feel free to propose that as a separate patch.
Personally, I don't see a value in it.
--
With Regards,
Amit Kapila.
On Wed, Jan 25, 2023 at 11:57 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Tue, 24 Jan 2023 12:19:04 +0000, "Takamichi Osumi (Fujitsu)" <osumi.takamichi@fujitsu.com> wrote in
Attached the patch v20 that has incorporated all comments so far.
...
+ in which case no additional wait is necessary. If the system clocks
+ on publisher and subscriber are not synchronized, this may lead to
+ apply changes earlier than expected, but this is not a major issue
+ because this parameter is typically much larger than the time
+ deviations between servers. Note that if this parameter is set to a
This doesn't seem to fit our documentation. It is not our business
whether a certain amount deviation is critical or not. How about
something like the following?
But we have a similar description for 'recovery_min_apply_delay' [1].
See "...If the system clocks on primary and standby are not
synchronized, this may lead to recovery applying records earlier than
expected; but that is not a major issue because useful settings of
this parameter are much larger than typical time deviations between
servers."
[1]: https://www.postgresql.org/docs/devel/runtime-config-replication.html
--
With Regards,
Amit Kapila.
At Wed, 25 Jan 2023 12:30:19 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
On Wed, Jan 25, 2023 at 11:57 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Tue, 24 Jan 2023 12:19:04 +0000, "Takamichi Osumi (Fujitsu)" <osumi.takamichi@fujitsu.com> wrote in
Attached the patch v20 that has incorporated all comments so far.
...
+ in which case no additional wait is necessary. If the system clocks
+ on publisher and subscriber are not synchronized, this may lead to
+ apply changes earlier than expected, but this is not a major issue
+ because this parameter is typically much larger than the time
+ deviations between servers. Note that if this parameter is set to a
This doesn't seem to fit our documentation. It is not our business
whether a certain amount deviation is critical or not. How about
something like the following?
But we have a similar description for 'recovery_min_apply_delay' [1].
See "...If the system clocks on primary and standby are not
synchronized, this may lead to recovery applying records earlier than
expected; but that is not a major issue because useful settings of
this parameter are much larger than typical time deviations between
servers."
Mmmm. I thought that we might be able to gather the description
(including other common descriptions, if any), but I didn't find an
appropriate place.
Okay. I agree to the current description. Thanks for the kind
explanation.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Wednesday, January 25, 2023 3:27 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
At Tue, 24 Jan 2023 12:19:04 +0000, "Takamichi Osumi (Fujitsu)"
<osumi.takamichi@fujitsu.com> wrote in
Attached the patch v20 that has incorporated all comments so far.
Thanks! I looked through the documentation part.
Thank you for your review !
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int8</type>
+ </para>
+ <para>
+ Total time spent delaying the application of changes, in milliseconds.
+ </para></entry>
I was confused because it reads as this column shows the summarized actual
waiting time caused by min_apply_delay. IIUC actually it shows the
min_apply_delay setting for the subscription. Thus shouldn't it be something
like this?
"The minimum amount of time to delay applying changes, in milliseconds"
And it might be better to mention the corresponding subscription parameter.
This description looks much better to me than the past description. Fixed.
OTOH, other parameters don't mention their subscription parameters.
So, I didn't add the mention.
+ error. If <varname>wal_receiver_status_interval</varname> is set to
+ zero, the apply worker doesn't send any feedback messages during the
+ <literal>min_apply_delay</literal> period.
I took a bit longer time to understand what this sentence means. I'd like to
suggest something like the following.
"Since no status-update messages are sent while delaying, note that
wal_receiver_status_interval is the only source of keepalive messages during
that period."
The current patch's description is precise and I prefer that.
I would say "the only source" would be confusing to readers.
However, I adjusted the description slightly. Could you please check?
+ <para>
+ A logical replication subscription can delay the application of changes by
+ specifying the <literal>min_apply_delay</literal> subscription parameter.
+ See <xref linkend="sql-createsubscription"/> for details.
+ </para>
I'm not sure "logical replication subscription" is a common term.
Doesn't just "subscription" mean the same, especially in that context?
(Note that 31.2 starts with "A subscription is the downstream..").
I think you are right. Fixed.
+ Any delay occurs only on WAL records for transaction begins after all
+ initial table synchronization has finished. The delay is calculated
There is no "transaction begin" WAL records. Maybe it is "logical replication
transaction begin message". The timestamp is of "commit time". (I took
"transaction begins" as a noun, but that might be wrong..)
Yeah, we can improve here. But, we need to include not only
"commit" but also "prepare" as nuance in this part.
In short, I think we should change here to mention
(1) the delay happens after all initial table synchronization
(2) how delay is applied for non-streaming and streaming transactions in general.
By the way, WAL timestamp is a word used in the recovery_min_apply_delay.
So, I'd like to keep it to make the description more aligned with it,
until there is a better description.
Updated the doc. I adjusted the commit message according to this fix.
+ may reduce the actual wait time. It is also possible that the overhead
+ already exceeds the requested <literal>min_apply_delay</literal> value,
+ in which case no additional wait is necessary. If the system clocks
I'm not sure it is right to say "necessary" here. IMHO it might be better be "in
which case no delay is applied".
Agreed. Fixed.
+ in which case no additional wait is necessary. If the system clocks
+ on publisher and subscriber are not synchronized, this may lead to
+ apply changes earlier than expected, but this is not a major issue
+ because this parameter is typically much larger than the time
+ deviations between servers. Note that if this parameter is set to a
This doesn't seem to fit our documentation. It is not our business whether a
certain amount deviation is critical or not. How about something like the
following?
"Note that the delay is measured between the timestamp assigned by
publisher and the system clock on subscriber. You need to manage the
system clocks to be in sync so that the delay works properly."
As discussed, this is aligned with recovery_min_apply_delay. So, I keep it.
+ Delaying the replication can mean there is a much longer time
+ between making a change on the publisher, and that change being
+ committed on the subscriber. This can impact the performance of
+ synchronous replication. See <xref linkend="guc-synchronous-commit"/>
+ parameter.
Do we need the "can" in "Delaying the replication can mean"? If we want to
say, it might be "Delaying the replication means there can be a much longer..."?
The "can" indicates the possibility as the nuance,
while adopting "means" in this case indicates "time delayed LR causes
the long time wait always".
I'm okay with either expression, but
I think you are right in practice and from
the perspective of the purpose of this feature. So, fixed.
+ <para>
+ Create a subscription to a remote server that replicates tables in
+ the <literal>mypub</literal> publication and starts replicating immediately
+ on commit. Pre-existing data is not copied. The application of changes is
+ delayed by 4 hours.
+<programlisting>
+CREATE SUBSCRIPTION mysub
+ CONNECTION 'host=192.0.2.4 port=5432 user=foo dbname=foodb'
+ PUBLICATION mypub
+ WITH (copy_data = false, min_apply_delay = '4h');
+</programlisting></para>
I'm not sure we need this additional example. We already have two examples
one of which differs from the above only by actual values for PUBLICATION and
WITH clauses.
I thought there was no harm in having this example, but
what you say makes sense. Removed.
Attached the updated v22.
Best Regards,
Takamichi Osumi
Attachments:
v22-0001-Time-delayed-logical-replication-subscriber.patch (application/octet-stream)
From c57a0230f443b1d8962343a84b20286cc117ae67 Mon Sep 17 00:00:00 2001
From: Takamichi Osumi <osumi.takamichi@fujitsu.com>
Date: Wed, 25 Jan 2023 13:49:22 +0000
Subject: [PATCH v22] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_apply_delay'.
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction apply for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs before we start to apply the transaction on the
subscriber. The main reason is to avoid keeping a transaction open for
a long time. Regular and prepared transactions are covered. Streamed
transactions are also covered.
The combination of parallel streaming mode and min_apply_delay is not
allowed. This is because we start applying the transaction stream as
soon as the first change arrives without knowing the transaction's
prepare/commit time. This means we cannot calculate the underlying
network/decoding lag between publisher and subscriber, and so always
waiting for the full 'min_apply_delay' period might include
unnecessary delay.
The other possibility is to apply the delay at the end of the parallel
apply transaction but that would cause issues related to resource
bloat and locks being held for a long time.
Note that this feature doesn't interact with skip transaction feature.
The skip transaction feature applies to one transaction with a specific LSN.
So, even if the skipped transaction and non-skipped transaction come
consecutively in a very short time, regardless of the order of which comes
first, the time-delayed feature gets balanced by delayed application
for other transactions before and after the skipped transaction.
Author: Euler Taveira, Takamichi Osumi, Kuroda Hayato
Reviewed-by: Amit Kapila, Peter Smith, Vignesh C, Shveta Malik,
Kyotaro Horiguchi, Dilip Kumar, Melih Mutlu
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/config.sgml | 11 ++
doc/src/sgml/glossary.sgml | 14 ++
doc/src/sgml/logical-replication.sgml | 6 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 48 ++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 117 ++++++++++-
.../replication/logical/applyparallelworker.c | 3 +-
src/backend/replication/logical/worker.c | 165 ++++++++++++++--
src/bin/pg_dump/pg_dump.c | 15 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/replication/worker_internal.h | 2 +-
src/test/regress/expected/subscription.out | 181 ++++++++++-------
src/test/regress/sql/subscription.sql | 24 +++
src/test/subscription/meson.build | 1 +
src/test/subscription/t/032_apply_delay.pl | 187 ++++++++++++++++++
21 files changed, 709 insertions(+), 104 deletions(-)
create mode 100644 src/test/subscription/t/032_apply_delay.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..a0cd21665e 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7873,6 +7873,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int8</type>
+ </para>
+ <para>
+ The minimum amount of time to delay applying changes, in milliseconds.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index f985afc009..317ebeb38f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4787,6 +4787,17 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
the <filename>postgresql.conf</filename> file or on the server
command line.
</para>
+ <para>
+ For time-delayed logical replication, the apply worker sends a feedback
+ message to the publisher every
+ <varname>wal_receiver_status_interval</varname> milliseconds. Make sure
+ to set <varname>wal_receiver_status_interval</varname> less than the
+ <varname>wal_sender_timeout</varname> on the publisher, otherwise, the
+ <literal>walsender</literal> will repeatedly terminate due to timeout
+ error. Note that if <varname>wal_receiver_status_interval</varname> is
+ set to zero, the apply worker sends no feedback messages during the
+ <literal>min_apply_delay</literal> period.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 7c01a541fe..6ed6fa5853 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1729,6 +1729,20 @@
</glossdef>
</glossentry>
+ <glossentry id="glossary-time-delayed-replication">
+ <glossterm>Time-delayed replication</glossterm>
+ <glossdef>
+ <para>
+ Replication setup that applies a time-delayed copy of the data.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="guc-recovery-min-apply-delay"/> for physical replication
+ and <xref linkend="sql-createsubscription"/> for logical replication.
+ </para>
+ </glossdef>
+ </glossentry>
+
<glossentry id="glossary-toast">
<glossterm>TOAST</glossterm>
<glossdef>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 1bd5660c87..6bd5f61e2b 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,12 @@
target table.
</para>
+ <para>
+ A subscription can delay the application of changes by specifying the
+ <literal>min_apply_delay</literal> subscription parameter. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index ad93553a1d..1c6e9dd2d1 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index eba72c6af6..22b4451d4d 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,48 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. This
+ parameter allows the user to delay the application of changes by a
+ given time period. If the value is specified without units, it is
+ taken as milliseconds. The default is zero (no delay). See
+ <xref linkend="config-setting-names-values"/> for details on the
+ available valid time units.
+ </para>
+ <para>
+ Any delay becomes effective after all initial table synchronization
+ has finished and occurs before each transaction starts to get applied
+ on the subscriber. The delay is calculated as the difference between
+ the WAL timestamp as written on the publisher and the current time on
+ the subscriber. Any overhead of time spent in logical decoding and in
+ transferring the transaction may reduce the actual wait time. It is
+ also possible that the overhead already exceeds the requested
+ <literal>min_apply_delay</literal> value, in which case no delay is
+ applied. If the system clocks on publisher and subscriber are not
+ synchronized, this may lead to apply changes earlier than expected,
+ but this is not a major issue because this parameter is typically
+ much larger than the time deviations between servers. Note that if
+ this parameter is set to a long delay, the replication will stop if
+ the replication slot falls behind the current LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication means there is a much longer time between
+ making a change on the publisher, and that change being committed
+ on the subscriber. This can impact the performance of synchronous
+ replication. See <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -413,6 +454,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_apply_delay</literal> parameter is not allowed when
+ streaming in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..c767cc1c3a 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -64,6 +64,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->oid = subid;
sub->dbid = subform->subdbid;
sub->skiplsn = subform->subskiplsn;
+ sub->minapplydelay = subform->subminapplydelay;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
sub->enabled = subform->subenabled;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 8608e3fa5b..317c2010cb 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1299,9 +1299,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 464db6d247..b4d075c931 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -100,7 +102,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int defGetMinApplyDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ opts->min_apply_delay = defGetMinApplyDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +417,31 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed. This is because we start applying the transaction stream as
+ * soon as the first change arrives without knowing the transaction's
+ * prepare/commit time. This means we cannot calculate the underlying
+ * network/decoding lag between publisher and subscriber, and so always
+ * waiting for the full 'min_apply_delay' period might include unnecessary
+ * delay.
+ *
+ * The other possibility is to apply the delay at the end of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0 && opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+
+ /*
+ * - translator: the first %s is a string of the form "option > value"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
}
/*
@@ -560,7 +598,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -625,6 +664,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int64GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1054,7 +1094,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1138,19 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options() for the reasoning.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
+ if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay > 0) ||
+ (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable parallel streaming mode for subscription with %s",
+ "min_apply_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1164,25 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed.
+ */
+ if (opts.min_apply_delay > 0)
+ if ((IsSet(opts.specified_opts, SUBOPT_STREAMING) && opts.streaming == LOGICALREP_STREAM_PARALLEL) ||
+ (!IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s for subscription in %s mode",
+ "min_apply_delay", "streaming = parallel"));
+
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int64GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2195,3 +2267,42 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+/*
+ * Extract the min_apply_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_apply_delay
+ * accepts the same parameter format as recovery_min_apply_delay.
+ */
+static int
+defGetMinApplyDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse the given string as a parameter with millisecond units.
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_apply_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check the lower bound. parse_int() has already confirmed that result
+ * is less than or equal to INT_MAX.
+ */
+ if (result < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_apply_delay",
+ 0, INT_MAX)));
+
+ return result;
+}
diff --git a/src/backend/replication/logical/applyparallelworker.c b/src/backend/replication/logical/applyparallelworker.c
index 3579e704fe..7302bce7a0 100644
--- a/src/backend/replication/logical/applyparallelworker.c
+++ b/src/backend/replication/logical/applyparallelworker.c
@@ -704,7 +704,8 @@ pa_process_spooled_messages_if_required(void)
{
apply_spooled_messages(&MyParallelShared->fileset,
MyParallelShared->xid,
- InvalidXLogRecPtr);
+ InvalidXLogRecPtr,
+ 0);
pa_set_fileset_state(MyParallelShared, FS_EMPTY);
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index cfb2ab6248..dd6a9f7fcf 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -319,6 +319,17 @@ static List *on_commit_wakeup_workers_subids = NIL;
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
+/*
+ * To avoid a walsender timeout during time-delayed logical replication, the
+ * apply worker keeps sending feedback messages during the delay period.
+ * Because the delay is applied before the transaction starts, no WAL
+ * records are written for the suspended changes during the wait. A feedback
+ * message sent during the delay must therefore not overwrite the flushed and
+ * applied LSN positions with the last received LSN. See send_feedback() for
+ * details.
+ */
+static XLogRecPtr last_received = InvalidXLogRecPtr;
+
/* fields valid only when processing streamed transaction */
static bool in_streamed_transaction = false;
@@ -389,7 +400,8 @@ static void stream_write_change(char action, StringInfo s);
static void stream_open_and_write_change(TransactionId xid, char action, StringInfo s);
static void stream_close_file(void);
-static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply);
+static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply,
+ bool has_unprocessed_change);
static void DisableSubscriptionAndExit(void);
@@ -999,6 +1011,108 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When min_apply_delay parameter is set on the subscriber, we wait long enough
+ * to make sure a transaction is applied at least that period behind the
+ * publisher.
+ *
+ * While physical replication applies the delay at commit time, this
+ * feature applies the delay before starting to apply the transaction. This
+ * is mainly because keeping a transaction that has performed write
+ * operations open for a long time causes problems such as bloat and
+ * long-held locks.
+ *
+ * The min_apply_delay parameter will take effect only after all tables are in
+ * READY state.
+ *
+ * xid is the transaction id where we apply the delay.
+ *
+ * finish_ts is the commit/prepare time of both regular (non-streamed) and
+ * streamed transactions. Unlike the regular (non-streamed) cases, the delay
+ * is applied in a STREAM COMMIT/STREAM PREPARE message for streamed
+ * transactions. The STREAM START message does not contain a commit/prepare
+ * time (it will be available when the in-progress transaction finishes).
+ * Hence, it's not appropriate to apply a delay at the STREAM START time.
+ */
+static void
+maybe_apply_delay(TransactionId xid, TimestampTz finish_ts)
+{
+ Assert(finish_ts > 0);
+
+ /* Nothing to do if no delay set */
+ if (!MySubscription->minapplydelay)
+ return;
+
+ /*
+ * The min_apply_delay parameter is ignored until all tablesync workers
+ * have reached READY state. This is because if we allowed the delay
+ * during the catchup phase, then once we reached the limit of tablesync
+ * workers it would impose a delay for each subsequent worker. That would
+ * cause initial table synchronization completion to take a long time.
+ */
+ if (!AllTablesyncsReady())
+ return;
+
+ /* Apply the delay using the latch mechanism */
+ while (true)
+ {
+ long diffms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* This might change wal_receiver_status_interval */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /*
+ * Before calculating the time duration, reload the catalog if needed.
+ */
+ if (!in_remote_transaction && !in_streamed_transaction)
+ {
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+ }
+
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(),
+ TimestampTzPlusMilliseconds(finish_ts, MySubscription->minapplydelay));
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "time-delayed replication for txid %u, min_apply_delay = " INT64_FORMAT " ms, remaining wait time: %ld ms",
+ xid, MySubscription->minapplydelay, diffms);
+
+ /*
+ * Call send_feedback() to keep the publisher from timing out
+ * during the delay, provided wal_receiver_status_interval is
+ * enabled (greater than zero).
+ */
+ if (wal_receiver_status_interval > 0 &&
+ diffms > wal_receiver_status_interval * 1000L)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ wal_receiver_status_interval * 1000L,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(last_received, true, false, true);
+ }
+ else
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ }
+}
+
/*
* Handle BEGIN message.
*/
@@ -1013,6 +1127,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ maybe_apply_delay(begin_data.xid, begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -1070,6 +1187,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ maybe_apply_delay(begin_data.xid, begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1317,7 +1437,8 @@ apply_handle_stream_prepare(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset,
- prepare_data.xid, prepare_data.prepare_lsn);
+ prepare_data.xid, prepare_data.prepare_lsn,
+ prepare_data.prepare_time);
/* Mark the transaction as prepared. */
apply_handle_prepare_internal(&prepare_data);
@@ -2011,10 +2132,13 @@ ensure_last_message(FileSet *stream_fileset, TransactionId xid, int fileno,
/*
* Common spoolfile processing.
+ *
+ * The commit/prepare time (finish_ts) for streamed transactions is required
+ * for time-delayed logical replication.
*/
void
apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn)
+ XLogRecPtr lsn, TimestampTz finish_ts)
{
StringInfoData s2;
int nchanges;
@@ -2025,6 +2149,10 @@ apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
int fileno;
off_t offset;
+ /* Should we delay the current transaction? */
+ if (finish_ts)
+ maybe_apply_delay(xid, finish_ts);
+
if (!am_parallel_apply_worker())
maybe_start_skipping_changes(lsn);
@@ -2174,7 +2302,7 @@ apply_handle_stream_commit(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset, xid,
- commit_data.commit_lsn);
+ commit_data.commit_lsn, commit_data.committime);
apply_handle_commit_internal(&commit_data);
@@ -3447,7 +3575,7 @@ UpdateWorkerStats(XLogRecPtr last_lsn, TimestampTz send_time, bool reply)
* Apply main loop.
*/
static void
-LogicalRepApplyLoop(XLogRecPtr last_received)
+LogicalRepApplyLoop(void)
{
TimestampTz last_recv_timestamp = GetCurrentTimestamp();
bool ping_sent = false;
@@ -3568,7 +3696,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
if (last_received < end_lsn)
last_received = end_lsn;
- send_feedback(last_received, reply_requested, false);
+ send_feedback(last_received, reply_requested, false, false);
UpdateWorkerStats(last_received, timestamp, true);
}
/* other message types are purposefully ignored */
@@ -3581,7 +3709,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
/* confirm all writes so far */
- send_feedback(last_received, false, false);
+ send_feedback(last_received, false, false, false);
if (!in_remote_transaction && !in_streamed_transaction)
{
@@ -3678,7 +3806,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
}
- send_feedback(last_received, requestReply, requestReply);
+ send_feedback(last_received, requestReply, requestReply, false);
/*
* Force reporting to ensure long idle periods don't lead to
@@ -3708,7 +3836,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
* to send a response to avoid timeouts.
*/
static void
-send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
+send_feedback(XLogRecPtr recvpos, bool force, bool requestReply, bool has_unprocessed_change)
{
static StringInfo reply_message = NULL;
static TimestampTz send_time = 0;
@@ -3738,8 +3866,15 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * If the subscription has unprocessed changes, do not report the latest
+ * received LSN as already applied and flushed. Otherwise, the publisher
+ * would make a wrong assumption about logical replication progress.
+ * In that case, the feedback message is sent only to avoid a
+ * replication timeout while the apply worker is waiting out the
+ * delay.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && !has_unprocessed_change)
flushpos = writepos = recvpos;
if (writepos < last_writepos)
@@ -3776,8 +3911,9 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
pq_sendint64(reply_message, now); /* sendTime */
pq_sendbyte(reply_message, requestReply); /* replyRequested */
- elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X",
+ elog(DEBUG2, "sending feedback (force %d, has_unprocessed_change %s) to recv %X/%X, write %X/%X, flush %X/%X",
force,
+ has_unprocessed_change ? "yes" : "no",
LSN_FORMAT_ARGS(recvpos),
LSN_FORMAT_ARGS(writepos),
LSN_FORMAT_ARGS(flushpos));
@@ -4367,11 +4503,11 @@ start_table_sync(XLogRecPtr *origin_startpos, char **myslotname)
* of system resource error and are not repeatable.
*/
static void
-start_apply(XLogRecPtr origin_startpos)
+start_apply(void)
{
PG_TRY();
{
- LogicalRepApplyLoop(origin_startpos);
+ LogicalRepApplyLoop();
}
PG_CATCH();
{
@@ -4661,7 +4797,8 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- start_apply(origin_startpos);
+ last_received = origin_startpos;
+ start_apply();
proc_exit(0);
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..c0f69cb43b 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4546,9 +4547,13 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n"
+ " 0 AS subminapplydelay\n",
+ LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4581,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4612,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminapplydelay =
+ strtoi64(PQgetvalue(res, i, i_subminapplydelay), NULL, 10);
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4687,6 +4695,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '" INT64_FORMAT " ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e7cbd8d7ed..e2525f70ab 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int64 subminapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..81d4607a1c 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min apply delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e1882eaea..e8b9a43a47 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..e06f35c037 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -70,6 +70,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
XLogRecPtr subskiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 subminapplydelay; /* Replication apply delay (ms) */
+
NameData subname; /* Name of the subscription */
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
@@ -120,6 +122,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int64 minapplydelay; /* Replication apply delay (ms) */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
bool enabled; /* Indicates if the subscription is enabled */
diff --git a/src/include/replication/worker_internal.h b/src/include/replication/worker_internal.h
index dc87a4edd1..3dc09d1a4c 100644
--- a/src/include/replication/worker_internal.h
+++ b/src/include/replication/worker_internal.h
@@ -255,7 +255,7 @@ extern void stream_stop_internal(TransactionId xid);
/* Common streaming function to apply all the spooled messages */
extern void apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn);
+ XLogRecPtr lsn, TimestampTz finish_ts);
extern void apply_dispatch(StringInfo s);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 3f99b14394..5ccce39986 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -143,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -163,10 +163,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -175,10 +175,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -210,10 +210,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -247,19 +247,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -271,27 +271,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -306,10 +306,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -324,10 +324,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -363,10 +363,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -375,10 +375,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,10 +388,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -404,20 +404,57 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid value for parameter "min_apply_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_apply_delay" (0 .. 2147483647)
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+ERROR: min_apply_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot enable parallel streaming mode for subscription with min_apply_delay
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ERROR: cannot enable min_apply_delay for subscription in streaming = parallel mode
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7281f5fee2..7317b140f5 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -286,6 +286,30 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+\dRs+
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/meson.build b/src/test/subscription/meson.build
index 3db0fdfd96..a186876eb4 100644
--- a/src/test/subscription/meson.build
+++ b/src/test/subscription/meson.build
@@ -38,6 +38,7 @@ tests += {
't/029_on_error.pl',
't/030_origin.pl',
't/031_column_list.pl',
+ 't/032_apply_delay.pl',
't/100_bugs.pl',
],
},
diff --git a/src/test/subscription/t/032_apply_delay.pl b/src/test/subscription/t/032_apply_delay.pl
new file mode 100644
index 0000000000..7d499ab664
--- /dev/null
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,187 @@
+
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test replication apply delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Confirm the time-delayed replication has been effective from the server log
+# message where the apply worker emits for applying delay. Moreover, verifies
+# that the current worker's remaining wait time is sufficiently bigger than the
+# expected value, in order to check any update of the min_apply_delay.
+sub check_apply_delay_log
+{
+ my ($node_subscriber, $offset, $expected) = @_;
+
+ my $log_location = $node_subscriber->wait_for_log(
+ qr/time-delayed replication for txid (\d+), min_apply_delay = (\d+) ms, remaining wait time: (\d+) ms/,
+ $offset);
+
+ cmp_ok($log_location, '>', $offset,
+ "logfile contains triggered logical replication apply delay");
+
+ # Get the remaining wait time from the server log
+ my $contents = slurp_file($node_subscriber->logfile, $offset);
+ $contents =~
+ qr/time-delayed replication for txid (\d+), min_apply_delay = (\d+) ms, remaining wait time: (\d+) ms/,
+ or die "could not get the apply worker wait time";
+ my $logged_delay = $3;
+
+ # Is it larger than expected?
+ cmp_ok($logged_delay, '>', $expected,
+ "The apply worker wait time has expected duration");
+}
+
+# Compare inserted time on the publisher with applied time on the subscriber to
+# confirm the latter is applied after expected time. The time is automatically
+# generated and stored in the table column 'c'.
+sub check_apply_delay_time
+{
+ my ($node_publisher, $node_subscriber, $primary_key, $expected_diffs) =
+ @_;
+
+ my $inserted_time_on_pub = $node_publisher->safe_psql(
+ 'postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ my $inserted_time_on_sub = $node_subscriber->safe_psql(
+ 'postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ cmp_ok(
+ $inserted_time_on_sub - $inserted_time_on_pub,
+ '>',
+ $expected_diffs,
+ "The tuple on the subscriber was modified later than the publisher");
+}
+
+# Create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf('postgresql.conf', "log_min_messages = debug2");
+$node_subscriber->start;
+
+# Setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b varchar, c timestamptz (6) DEFAULT now())"
+);
+
+# Setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b text, c timestamptz (6) DEFAULT now())"
+);
+
+# Setup logical replication
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+
+# The column 'c' must not be published because we want to compare the time
+# difference.
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE test_tab (a, b)");
+
+my $appname = 'tap_sub';
+
+# Create a subscription that applies the transaction after 1 second delay
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (copy_data = off, min_apply_delay = '1s', streaming = 'on')"
+);
+
+# Check log starting now for logical replication apply delay
+my $offset = -s $node_subscriber->logfile;
+
+# New row to trigger apply delay
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (1, 'foo')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (2, 'bar')");
+
+$node_publisher->wait_for_catchup($appname);
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(2|1|2), 'check if the new rows were applied to subscriber');
+
+# Make sure the apply worker knows to wait for more than 500ms
+check_apply_delay_log($node_subscriber, $offset, "0.5");
+
+# Verify that the subscriber lags the publisher by at least 1 second
+check_apply_delay_time($node_publisher, $node_subscriber, '2', '1');
+
+# Setup for streaming case
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_mode = immediate');
+$node_publisher->reload;
+
+# Run a query to make sure that the reload has taken effect.
+$node_publisher->safe_psql('postgres', q{SELECT 1});
+
+# Check log starting now for logical replication apply delay
+$offset = -s $node_subscriber->logfile;
+
+# Test streamed transaction by insert
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(3, 5) s(i);"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(5|1|5), 'check if the new rows were applied to subscriber');
+
+# Make sure the apply worker knows to wait for more than 500ms
+check_apply_delay_log($node_subscriber, $offset, "0.5");
+
+# Verify that the subscriber lags the publisher by at least 1 second
+check_apply_delay_time($node_publisher, $node_subscriber, '5', '1');
+
+# Test whether ALTER SUBSCRIPTION changes the delayed time of the apply worker
+# (1 day 5 minutes). Note that the extra 5 minute is to account for any
+# decoding/network overhead.
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86700000)");
+
+# Check log starting now for logical replication apply delay
+$offset = -s $node_subscriber->logfile;
+
+# New row to trigger apply delay
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (0, 'foobar')");
+
+# Make sure the apply worker knows to wait for more than 1 day
+check_apply_delay_log($node_subscriber, $offset, "86400000");
+
+# Disable subscription and the worker should die immediately
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub DISABLE;");
+
+# Wait until worker dies
+my $sub_query =
+ "SELECT count(1) = 0 FROM pg_stat_subscription WHERE subname = 'tap_sub' AND pid IS NOT NULL;";
+$node_subscriber->poll_query_until('postgres', $sub_query)
+ or die "Timed out while waiting for subscriber to die";
+
+# Confirm disabling the subscription by ALTER SUBSCRIPTION DISABLE did not cause
+# the delayed transaction to be applied.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab WHERE a = 0;");
+is($result, qq(0), "check the delayed transaction was not applied");
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
--
2.30.0
On Wednesday, January 25, 2023 3:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Jan 25, 2023 at 11:23 AM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
Thank you for checking the patch !
On Wednesday, January 25, 2023 10:17 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
In short, I'd like to propose renaming the parameter
in_delayed_apply of send_feedback to "has_unprocessed_change".
At Tue, 24 Jan 2023 12:27:58 +0530, Amit Kapila
<amit.kapila16@gmail.com> wrote in
send_feedback():
+ * If the subscriber side apply is delayed (because of time-delayed
+ * replication) then do not tell the publisher that the received latest
+ * LSN is already applied and flushed, otherwise, it leads to the
+ * publisher side making a wrong assumption of logical replication
+ * progress. Instead, we just send a feedback message to avoid a publisher
+ * timeout during the delay.
+ */
- if (!have_pending_txes)
+ if (!have_pending_txes && !in_delayed_apply)
flushpos = writepos = recvpos;
Honestly I don't like this wart. The reason for this is the
function assumes recvpos = applypos but we actually call it
while holding unapplied changes, that is, applypos < recvpos.
Couldn't we maintain an additional static variable "last_applied"
along with last_received?
It won't be easy to maintain the meaning of last_applied because
there are cases where we don't apply the change directly. For
example, in case of streaming xacts, we will just keep writing it
to the file, now, say, due to some reason, we have to send the
feedback, then it will not allow you to update the latest write
locations. This would then become different than what we are doing
without the patch.
Another point to think about is that we also need to keep the
variable updated for keep-alive ('k') messages even though we
don't apply anything in that case. Still, other cases to consider
are where we have a mix of streaming and non-streaming transactions.
Yeah. Even though I named it as "last_applied", its objective is to
have get_flush_position returning the correct have_pending_txes
without a hint from callers, that is, "let g_f_position know if
store_flush_position has been called with the last received data".
Anyway I tried that but didn't find a clean and simple way. However,
while on it, I realized what in the code confused me.
+static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply,
+ bool in_delayed_apply);
The name "in_delayed_apply" doesn't give me an idea of what
the function should do for it. If it is named
"has_unprocessed_change", I think it makes sense that send_feedback
should think there may be an outstanding transaction that is not known to
the function.
So, my conclusion here is I'd like to propose changing the parameter
name to "has_unapplied_change".
Renamed the variable name to "has_unprocessed_change".
Also, removed the first argument of the send_feedback() which isn't
necessary now.
Why did you remove the first argument of the send_feedback() when that is not
added by this patch? If you really think that is an improvement, feel free to
propose that as a separate patch.
Personally, I don't see a value in it.
Oh, sorry for that. I have made the change back.
Kindly have a look at the v22 shared in [1].
[1]: /messages/by-id/TYCPR01MB837305BD31FA317256BC7B1FEDCE9@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
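
For readers following the naming discussion above, the effect of the extra send_feedback() flag can be sketched in standalone C. This is only a simplified model under stated assumptions, not the patch's code: `lsn_t` and `choose_flush_position` are hypothetical names introduced here, and the real function also tracks the write position and decides whether a message needs to be sent at all. The sketch shows just the one decision being debated: the last received LSN is reported as flushed only when nothing is pending and the caller is not holding back a delayed change.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for XLogRecPtr; not the PostgreSQL typedef. */
typedef uint64_t lsn_t;

/*
 * Hypothetical helper modelling the position choice in send_feedback().
 *
 * recvpos                 last LSN received from the publisher
 * flushpos                LSN known to be flushed locally
 * have_pending_txes       local transactions are still waiting for a flush
 * has_unprocessed_change  the caller holds a change it has not applied yet
 *                         (for example, while waiting out min_apply_delay)
 *
 * Only when both flags are false may the worker claim that everything
 * received so far is flushed.
 */
static lsn_t
choose_flush_position(lsn_t recvpos, lsn_t flushpos,
                      bool have_pending_txes, bool has_unprocessed_change)
{
    if (!have_pending_txes && !has_unprocessed_change)
        return recvpos;

    return flushpos;
}
```

With recvpos = 0x3000 and flushpos = 0x1000, an idle worker would report 0x3000, while a worker sitting in the delay loop (has_unprocessed_change = true) keeps reporting 0x1000, so the publisher does not conclude that the delayed transaction has been applied.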
On Wednesday, January 25, 2023 11:24 PM I wrote:
Attached the updated v22.
Hi,
During self-review, I noticed that some variable types related to the
'min_apply_delay' value needed to change, so I have adjusted them.
Additionally, I improved some translator comments and the TAP test.
Note that I ran pgindent and pgperltidy on the patch.
The updated patch should now be more refined.
Best Regards,
Takamichi Osumi
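
The attached patch computes the wait as the gap between the transaction's commit/prepare timestamp plus min_apply_delay and the subscriber's current time, and it sleeps in slices of wal_receiver_status_interval so that feedback can be sent in between. The arithmetic can be sketched in standalone C as below. This is a minimal sketch under assumptions: `ts_usec_t`, `remaining_delay_ms` and `next_sleep_ms` are hypothetical names, timestamps are modelled as plain microsecond counts, and rounding up to whole milliseconds is a choice made for this model.

```c
#include <stdint.h>

/* Microsecond timestamp; an illustrative stand-in for TimestampTz. */
typedef int64_t ts_usec_t;

/*
 * Hypothetical helper: milliseconds still to wait before a transaction
 * that finished at finish_ts (publisher clock) may be applied, given the
 * subscriber's current time and min_apply_delay in milliseconds.
 * Returns 0 when the deadline has already passed, for example because
 * decoding and network transfer already took longer than the delay.
 */
static int64_t
remaining_delay_ms(ts_usec_t now, ts_usec_t finish_ts,
                   int32_t min_apply_delay_ms)
{
    ts_usec_t deadline = finish_ts + (ts_usec_t) min_apply_delay_ms * 1000;
    ts_usec_t diff = deadline - now;

    if (diff <= 0)
        return 0;

    /* Round up so the worker never wakes before the deadline. */
    return (diff + 999) / 1000;
}

/*
 * Hypothetical helper: length of the next sleep. When a status interval
 * (in seconds) is configured and the remaining wait is longer than it,
 * sleep for one interval only, so feedback can be sent before waiting
 * again. Otherwise sleep for the whole remainder.
 */
static int64_t
next_sleep_ms(int64_t remaining_ms, int status_interval_s)
{
    int64_t interval_ms = (int64_t) status_interval_s * 1000;

    if (status_interval_s > 0 && remaining_ms > interval_ms)
        return interval_ms;

    return remaining_ms;
}
```

For example, with 25 seconds left and a 10 second status interval, the model sleeps 10 seconds, sends feedback, and recomputes; with the interval set to zero it sleeps the full 25 seconds without any feedback in between.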
Attachments:
v23-0001-Time-delayed-logical-replication-subscriber.patch (application/octet-stream)
From 6b5a51fccbfcc40136ed41949d71799bdb4276c1 Mon Sep 17 00:00:00 2001
From: Takamichi Osumi <osumi.takamichi@fujitsu.com>
Date: Fri, 27 Jan 2023 07:31:41 +0000
Subject: [PATCH v23] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_apply_delay'.
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction apply for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs before we start to apply the transaction on the
subscriber. The main reason is to avoid keeping a transaction open for
a long time. Regular and prepared transactions are covered. Streamed
transactions are also covered.
The combination of parallel streaming mode and min_apply_delay is not
allowed. This is because we start applying the transaction stream as
soon as the first change arrives without knowing the transaction's
prepare/commit time. This means we cannot calculate the underlying
network/decoding lag between publisher and subscriber, and so always
waiting for the full 'min_apply_delay' period might include
unnecessary delay.
The other possibility is to apply the delay at the end of the parallel
apply transaction but that would cause issues related to resource
bloat and locks being held for a long time.
Note that this feature doesn't interact with skip transaction feature.
The skip transaction feature applies to one transaction with a specific LSN.
So, even if the skipped transaction and non-skipped transaction come
consecutively in a very short time, regardless of the order of which comes
first, the time-delayed feature gets balanced by delayed application
for other transactions before and after the skipped transaction.
Author: Euler Taveira, Takamichi Osumi, Kuroda Hayato
Reviewed-by: Amit Kapila, Peter Smith, Vignesh C, Shveta Malik,
Kyotaro Horiguchi, Dilip Kumar, Melih Mutlu
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/config.sgml | 11 ++
doc/src/sgml/glossary.sgml | 14 ++
doc/src/sgml/logical-replication.sgml | 6 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 48 ++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 117 ++++++++++-
.../replication/logical/applyparallelworker.c | 3 +-
src/backend/replication/logical/worker.c | 165 ++++++++++++++--
src/bin/pg_dump/pg_dump.c | 15 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/replication/worker_internal.h | 2 +-
src/test/regress/expected/subscription.out | 181 ++++++++++-------
src/test/regress/sql/subscription.sql | 24 +++
src/test/subscription/meson.build | 1 +
src/test/subscription/t/032_apply_delay.pl | 187 ++++++++++++++++++
21 files changed, 709 insertions(+), 104 deletions(-)
create mode 100644 src/test/subscription/t/032_apply_delay.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..925a6ebb12 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7873,6 +7873,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum amount of time to delay applying changes, in milliseconds.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index f985afc009..317ebeb38f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4787,6 +4787,17 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
the <filename>postgresql.conf</filename> file or on the server
command line.
</para>
+ <para>
+ For time-delayed logical replication, the apply worker sends a feedback
+ message to the publisher every
+ <varname>wal_receiver_status_interval</varname> seconds. Make sure
+ to set <varname>wal_receiver_status_interval</varname> less than the
+ <varname>wal_sender_timeout</varname> on the publisher, otherwise, the
+ <literal>walsender</literal> will repeatedly terminate due to timeout
+ error. Note that if <varname>wal_receiver_status_interval</varname> is
+ set to zero, the apply worker sends no feedback messages during the
+ <literal>min_apply_delay</literal> period.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 7c01a541fe..6ed6fa5853 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1729,6 +1729,20 @@
</glossdef>
</glossentry>
+ <glossentry id="glossary-time-delayed-replication">
+ <glossterm>Time-delayed replication</glossterm>
+ <glossdef>
+ <para>
+ Replication setup that applies a time-delayed copy of the data.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="guc-recovery-min-apply-delay"/> for physical replication
+ and <xref linkend="sql-createsubscription"/> for logical replication.
+ </para>
+ </glossdef>
+ </glossentry>
+
<glossentry id="glossary-toast">
<glossterm>TOAST</glossterm>
<glossdef>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 1bd5660c87..6bd5f61e2b 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,12 @@
target table.
</para>
+ <para>
+ A subscription can delay the application of changes by specifying the
+ <literal>min_apply_delay</literal> subscription parameter. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index ad93553a1d..1c6e9dd2d1 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index eba72c6af6..22b4451d4d 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,48 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. This
+ parameter allows the user to delay the application of changes by a
+ given time period. If the value is specified without units, it is
+ taken as milliseconds. The default is zero (no delay). See
+ <xref linkend="config-setting-names-values"/> for details on the
+ available valid time units.
+ </para>
+ <para>
+ Any delay becomes effective after all initial table synchronization
+ has finished and occurs before each transaction starts to get applied
+ on the subscriber. The delay is calculated as the difference between
+ the WAL timestamp as written on the publisher and the current time on
+ the subscriber. Any overhead of time spent in logical decoding and in
+ transferring the transaction may reduce the actual wait time. It is
+ also possible that the overhead already exceeds the requested
+ <literal>min_apply_delay</literal> value, in which case no delay is
+ applied. If the system clocks on publisher and subscriber are not
+ synchronized, this may lead to changes being applied earlier than expected,
+ but this is not a major issue because this parameter is typically
+ much larger than the time deviations between servers. Note that if
+ this parameter is set to a long delay, the replication will stop if
+ the replication slot falls behind the current LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication means there is a much longer time between
+ making a change on the publisher, and that change being committed
+ on the subscriber. This can impact the performance of synchronous
+ replication. See <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -413,6 +454,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_apply_delay</literal> parameter is not allowed when
+ streaming in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..c767cc1c3a 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -64,6 +64,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->oid = subid;
sub->dbid = subform->subdbid;
sub->skiplsn = subform->subskiplsn;
+ sub->minapplydelay = subform->subminapplydelay;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
sub->enabled = subform->subenabled;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 8608e3fa5b..317c2010cb 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1299,9 +1299,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 464db6d247..489eae85ee 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int32 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -100,7 +102,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int defGetMinApplyDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ opts->min_apply_delay = defGetMinApplyDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +417,31 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed. This is because we start applying the transaction stream as
+ * soon as the first change arrives without knowing the transaction's
+ * prepare/commit time. This means we cannot calculate the underlying
+ * network/decoding lag between publisher and subscriber, and so always
+ * waiting for the full 'min_apply_delay' period might include unnecessary
+ * delay.
+ *
+ * The other possibility is to apply the delay at the end of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0 && opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+
+ /*
+ * - translator: the first %s is a string of the form "parameter > 0"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
}
/*
@@ -560,7 +598,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -625,6 +664,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int32GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1054,7 +1094,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1138,19 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options for details of the reason.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
+ if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay > 0) ||
+ (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable parallel streaming mode for subscription with %s",
+ "min_apply_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1164,25 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed.
+ */
+ if (opts.min_apply_delay > 0)
+ if ((IsSet(opts.specified_opts, SUBOPT_STREAMING) && opts.streaming == LOGICALREP_STREAM_PARALLEL) ||
+ (!IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s for subscription in %s mode",
+ "min_apply_delay", "streaming = parallel"));
+
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int32GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2195,3 +2267,42 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+/*
+ * Extract the min_apply_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_apply_delay
+ * accepts the same parameter format as recovery_min_apply_delay.
+ */
+static int
+defGetMinApplyDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse given string as parameter which has millisecond unit
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_apply_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check lower bound. parse_int() has already confirmed that result
+ * is less than or equal to INT_MAX.
+ */
+ if (result < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_apply_delay",
+ 0, INT_MAX)));
+
+ return result;
+}
diff --git a/src/backend/replication/logical/applyparallelworker.c b/src/backend/replication/logical/applyparallelworker.c
index 3579e704fe..7302bce7a0 100644
--- a/src/backend/replication/logical/applyparallelworker.c
+++ b/src/backend/replication/logical/applyparallelworker.c
@@ -704,7 +704,8 @@ pa_process_spooled_messages_if_required(void)
{
apply_spooled_messages(&MyParallelShared->fileset,
MyParallelShared->xid,
- InvalidXLogRecPtr);
+ InvalidXLogRecPtr,
+ 0);
pa_set_fileset_state(MyParallelShared, FS_EMPTY);
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index cfb2ab6248..add7aca078 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -319,6 +319,17 @@ static List *on_commit_wakeup_workers_subids = NIL;
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
+/*
+ * In order to avoid walsender timeout for time-delayed logical replication the
+ * apply worker keeps sending feedback messages during the delay period.
+ * Meanwhile, the feature delays the apply before the start of the
+ * transaction and thus we don't write WAL records for the suspended changes
+ * during the wait. When the apply worker sends a feedback message during the
+ * delay, the flushed and applied LSN positions must not be overwritten by the
+ * last received LSN. See send_feedback() for details.
+ */
+static XLogRecPtr last_received = InvalidXLogRecPtr;
+
/* fields valid only when processing streamed transaction */
static bool in_streamed_transaction = false;
@@ -389,7 +400,8 @@ static void stream_write_change(char action, StringInfo s);
static void stream_open_and_write_change(TransactionId xid, char action, StringInfo s);
static void stream_close_file(void);
-static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply);
+static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply,
+ bool has_unprocessed_change);
static void DisableSubscriptionAndExit(void);
@@ -999,6 +1011,108 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When min_apply_delay parameter is set on the subscriber, we wait long enough
+ * to make sure a transaction is applied at least that period behind the
+ * publisher.
+ *
+ * While the physical replication applies the delay at commit time, this
+ * feature applies the delay for the next transaction but before starting the
+ * transaction. This is mainly because keeping a transaction that conducted
+ * write operations open for a long time results in some issues such as bloat
+ * and locks.
+ *
+ * The min_apply_delay parameter will take effect only after all tables are in
+ * READY state.
+ *
+ * xid is the transaction id where we apply the delay.
+ *
+ * finish_ts is the commit/prepare time of both regular (non-streamed) and
+ * streamed transactions. Unlike the regular (non-streamed) cases, the delay
+ * is applied in a STREAM COMMIT/STREAM PREPARE message for streamed
+ * transactions. The STREAM START message does not contain a commit/prepare
+ * time (it will be available when the in-progress transaction finishes).
+ * Hence, it's not appropriate to apply a delay at the STREAM START time.
+ */
+static void
+maybe_apply_delay(TransactionId xid, TimestampTz finish_ts)
+{
+ Assert(finish_ts > 0);
+
+ /* Nothing to do if no delay set */
+ if (!MySubscription->minapplydelay)
+ return;
+
+ /*
+ * The min_apply_delay parameter is ignored until all tablesync workers
+ * have reached READY state. This is because if we allowed the delay
+ * during the catchup phase, then once we reached the limit of tablesync
+ * workers it would impose a delay for each subsequent worker. That would
+ * cause initial table synchronization completion to take a long time.
+ */
+ if (!AllTablesyncsReady())
+ return;
+
+ /* Apply the delay by the latch mechanism */
+ while (true)
+ {
+ long diffms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* This might change wal_receiver_status_interval */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /*
+ * Before calculating the time duration, reload the catalog if needed.
+ */
+ if (!in_remote_transaction && !in_streamed_transaction)
+ {
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+ }
+
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(),
+ TimestampTzPlusMilliseconds(finish_ts, MySubscription->minapplydelay));
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "time-delayed replication for txid %u, min_apply_delay = %d ms, remaining wait time: %ld ms",
+ xid, MySubscription->minapplydelay, diffms);
+
+ /*
+ * Call send_feedback() to prevent the publisher from exiting by
+ * timeout during the delay, when wal_receiver_status_interval is
+ * available.
+ */
+ if (wal_receiver_status_interval > 0 &&
+ diffms > wal_receiver_status_interval * 1000L)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ wal_receiver_status_interval * 1000L,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(last_received, true, false, true);
+ }
+ else
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ }
+}
+
/*
* Handle BEGIN message.
*/
@@ -1013,6 +1127,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ maybe_apply_delay(begin_data.xid, begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -1070,6 +1187,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ maybe_apply_delay(begin_data.xid, begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1317,7 +1437,8 @@ apply_handle_stream_prepare(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset,
- prepare_data.xid, prepare_data.prepare_lsn);
+ prepare_data.xid, prepare_data.prepare_lsn,
+ prepare_data.prepare_time);
/* Mark the transaction as prepared. */
apply_handle_prepare_internal(&prepare_data);
@@ -2011,10 +2132,13 @@ ensure_last_message(FileSet *stream_fileset, TransactionId xid, int fileno,
/*
* Common spoolfile processing.
+ *
+ * The commit/prepare time (finish_ts) for streamed transactions is required
+ * for time-delayed logical replication.
*/
void
apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn)
+ XLogRecPtr lsn, TimestampTz finish_ts)
{
StringInfoData s2;
int nchanges;
@@ -2025,6 +2149,10 @@ apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
int fileno;
off_t offset;
+ /* Should we delay the current transaction? */
+ if (finish_ts)
+ maybe_apply_delay(xid, finish_ts);
+
if (!am_parallel_apply_worker())
maybe_start_skipping_changes(lsn);
@@ -2174,7 +2302,7 @@ apply_handle_stream_commit(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset, xid,
- commit_data.commit_lsn);
+ commit_data.commit_lsn, commit_data.committime);
apply_handle_commit_internal(&commit_data);
@@ -3447,7 +3575,7 @@ UpdateWorkerStats(XLogRecPtr last_lsn, TimestampTz send_time, bool reply)
* Apply main loop.
*/
static void
-LogicalRepApplyLoop(XLogRecPtr last_received)
+LogicalRepApplyLoop(void)
{
TimestampTz last_recv_timestamp = GetCurrentTimestamp();
bool ping_sent = false;
@@ -3568,7 +3696,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
if (last_received < end_lsn)
last_received = end_lsn;
- send_feedback(last_received, reply_requested, false);
+ send_feedback(last_received, reply_requested, false, false);
UpdateWorkerStats(last_received, timestamp, true);
}
/* other message types are purposefully ignored */
@@ -3581,7 +3709,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
/* confirm all writes so far */
- send_feedback(last_received, false, false);
+ send_feedback(last_received, false, false, false);
if (!in_remote_transaction && !in_streamed_transaction)
{
@@ -3678,7 +3806,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
}
- send_feedback(last_received, requestReply, requestReply);
+ send_feedback(last_received, requestReply, requestReply, false);
/*
* Force reporting to ensure long idle periods don't lead to
@@ -3708,7 +3836,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
* to send a response to avoid timeouts.
*/
static void
-send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
+send_feedback(XLogRecPtr recvpos, bool force, bool requestReply, bool has_unprocessed_change)
{
static StringInfo reply_message = NULL;
static TimestampTz send_time = 0;
@@ -3738,8 +3866,15 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * If the logical replication subscription has unprocessed changes then do
+ * not inform the publisher that the received latest LSN is already
+ * applied and flushed, otherwise, the publisher will make a wrong
+ * assumption about the logical replication progress. Instead, it just
+ * sends a feedback message to avoid a replication timeout during the
+ * delay.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && !has_unprocessed_change)
flushpos = writepos = recvpos;
if (writepos < last_writepos)
@@ -3776,8 +3911,9 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
pq_sendint64(reply_message, now); /* sendTime */
pq_sendbyte(reply_message, requestReply); /* replyRequested */
- elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X",
+ elog(DEBUG2, "sending feedback (force %d, has_unprocessed_change %s) to recv %X/%X, write %X/%X, flush %X/%X",
force,
+ has_unprocessed_change ? "yes" : "no",
LSN_FORMAT_ARGS(recvpos),
LSN_FORMAT_ARGS(writepos),
LSN_FORMAT_ARGS(flushpos));
@@ -4367,11 +4503,11 @@ start_table_sync(XLogRecPtr *origin_startpos, char **myslotname)
* of system resource error and are not repeatable.
*/
static void
-start_apply(XLogRecPtr origin_startpos)
+start_apply(void)
{
PG_TRY();
{
- LogicalRepApplyLoop(origin_startpos);
+ LogicalRepApplyLoop();
}
PG_CATCH();
{
@@ -4661,7 +4797,8 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- start_apply(origin_startpos);
+ last_received = origin_startpos;
+ start_apply();
proc_exit(0);
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..1e87f0124e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4546,9 +4547,13 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n"
+ " 0 AS subminapplydelay\n",
+ LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4581,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4612,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminapplydelay =
+ atoi(PQgetvalue(res, i, i_subminapplydelay));
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4687,6 +4695,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '%d ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e7cbd8d7ed..b8831c3ed3 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int subminapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..81d4607a1c 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min apply delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e1882eaea..e8b9a43a47 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..9787bb75ea 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -70,6 +70,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
XLogRecPtr subskiplsn; /* All changes finished at this LSN are
* skipped */
+ int32 subminapplydelay; /* Minimum apply delay, in milliseconds */
+
NameData subname; /* Name of the subscription */
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
@@ -120,6 +122,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int32 minapplydelay; /* Minimum apply delay, in milliseconds */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
bool enabled; /* Indicates if the subscription is enabled */
diff --git a/src/include/replication/worker_internal.h b/src/include/replication/worker_internal.h
index dc87a4edd1..3dc09d1a4c 100644
--- a/src/include/replication/worker_internal.h
+++ b/src/include/replication/worker_internal.h
@@ -255,7 +255,7 @@ extern void stream_stop_internal(TransactionId xid);
/* Common streaming function to apply all the spooled messages */
extern void apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn);
+ XLogRecPtr lsn, TimestampTz finish_ts);
extern void apply_dispatch(StringInfo s);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 3f99b14394..5ccce39986 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -143,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -163,10 +163,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -175,10 +175,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -210,10 +210,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -247,19 +247,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -271,27 +271,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -306,10 +306,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -324,10 +324,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -363,10 +363,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -375,10 +375,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,10 +388,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -404,20 +404,57 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid value for parameter "min_apply_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_apply_delay" (0 .. 2147483647)
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+ERROR: min_apply_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot enable parallel streaming mode for subscription with min_apply_delay
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ERROR: cannot enable min_apply_delay for subscription in streaming = parallel mode
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7281f5fee2..7317b140f5 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -286,6 +286,30 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+\dRs+
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/meson.build b/src/test/subscription/meson.build
index 3db0fdfd96..a186876eb4 100644
--- a/src/test/subscription/meson.build
+++ b/src/test/subscription/meson.build
@@ -38,6 +38,7 @@ tests += {
't/029_on_error.pl',
't/030_origin.pl',
't/031_column_list.pl',
+ 't/032_apply_delay.pl',
't/100_bugs.pl',
],
},
diff --git a/src/test/subscription/t/032_apply_delay.pl b/src/test/subscription/t/032_apply_delay.pl
new file mode 100644
index 0000000000..2d63919525
--- /dev/null
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,187 @@
+
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test replication apply delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Confirm from the server log that time-delayed replication took effect, using
+# the message the apply worker emits when it applies the delay. Also verify
+# that the worker's remaining wait time is sufficiently larger than the
+# expected value, in order to catch any update of min_apply_delay.
+sub check_apply_delay_log
+{
+ my ($node_subscriber, $offset, $expected) = @_;
+
+ my $log_location = $node_subscriber->wait_for_log(
+ qr/time-delayed replication for txid (\d+), min_apply_delay = (\d+) ms, remaining wait time: (\d+) ms/,
+ $offset);
+
+ cmp_ok($log_location, '>', $offset,
+ "logfile contains triggered logical replication apply delay");
+
+ # Get the remaining wait time from the server log
+ my $contents = slurp_file($node_subscriber->logfile, $offset);
+ $contents =~
+ qr/time-delayed replication for txid (\d+), min_apply_delay = (\d+) ms, remaining wait time: (\d+) ms/,
+ or die "could not get the apply worker wait time";
+ my $logged_delay = $3;
+
+ # Is it larger than expected?
+ cmp_ok($logged_delay, '>', $expected,
+ "The apply worker wait time has expected duration");
+}
+
+# Compare inserted time on the publisher with applied time on the subscriber to
+# confirm the latter is applied after expected time. The time is automatically
+# generated and stored in the table column 'c'.
+sub check_apply_delay_time
+{
+ my ($node_publisher, $node_subscriber, $primary_key, $expected_diffs) =
+ @_;
+
+ my $inserted_time_on_pub = $node_publisher->safe_psql(
+ 'postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ my $inserted_time_on_sub = $node_subscriber->safe_psql(
+ 'postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ cmp_ok(
+ $inserted_time_on_sub - $inserted_time_on_pub,
+ '>',
+ $expected_diffs,
+ "The tuple on the subscriber was modified later than the publisher");
+}
+
+# Create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf('postgresql.conf', "log_min_messages = debug2");
+$node_subscriber->start;
+
+# Setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b varchar, c timestamptz (6) DEFAULT now())"
+);
+
+# Setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b text, c timestamptz (6) DEFAULT now())"
+);
+
+# Setup logical replication
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+
+# The column 'c' must not be published because we want to compare the time
+# difference.
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE test_tab (a, b)");
+
+my $appname = 'tap_sub';
+
+# Create a subscription that applies the transaction after 1 second delay
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (copy_data = off, min_apply_delay = '1s', streaming = 'on')"
+);
+
+# Check log starting now for logical replication apply delay
+my $offset = -s $node_subscriber->logfile;
+
+# New row to trigger apply delay
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (1, 'foo')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (2, 'bar')");
+
+$node_publisher->wait_for_catchup($appname);
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(2|1|2), 'check if the new rows were applied to subscriber');
+
+# Make sure the apply worker knows to wait for more than 500ms
+check_apply_delay_log($node_subscriber, $offset, "0.5");
+
+# Verify that the subscriber lags the publisher by at least 1 second
+check_apply_delay_time($node_publisher, $node_subscriber, '2', '1');
+
+# Setup for streaming case
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_mode = immediate');
+$node_publisher->reload;
+
+# Run a query to make sure that the reload has taken effect.
+$node_publisher->safe_psql('postgres', q{SELECT 1});
+
+# Check log starting now for logical replication apply delay
+$offset = -s $node_subscriber->logfile;
+
+# Test streamed transaction by insert
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(3, 5) s(i);"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(5|1|5), 'check if the new rows were applied to subscriber');
+
+# Make sure the apply worker knows to wait for more than 500ms
+check_apply_delay_log($node_subscriber, $offset, "0.5");
+
+# Verify that the subscriber lags the publisher by at least 1 second
+check_apply_delay_time($node_publisher, $node_subscriber, '5', '1');
+
+# Test whether ALTER SUBSCRIPTION changes the delayed time of the apply worker
+# (1 day 5 minutes). Note that the extra 5 minutes account for any
+# decoding/network overhead.
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86700000)");
+
+# Check log starting now for logical replication apply delay
+$offset = -s $node_subscriber->logfile;
+
+# New row to trigger apply delay
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (0, 'foobar')");
+
+# Make sure the apply worker knows to wait for more than 1 day
+check_apply_delay_log($node_subscriber, $offset, "86400000");
+
+# Disable subscription and the worker should die immediately
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub DISABLE;");
+
+# Wait until worker dies
+my $sub_query =
+ "SELECT count(1) = 0 FROM pg_stat_subscription WHERE subname = 'tap_sub' AND pid IS NOT NULL;";
+$node_subscriber->poll_query_until('postgres', $sub_query)
+ or die "Timed out while waiting for subscriber to die";
+
+# Confirm disabling the subscription by ALTER SUBSCRIPTION DISABLE did not cause
+# the delayed transaction to be applied.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab WHERE a = 0;");
+is($result, qq(0), "check the delayed transaction was not applied");
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
--
2.30.0
On Fri, Jan 27, 2023 at 1:39 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
On Wednesday, January 25, 2023 11:24 PM I wrote:
Attached the updated v22.
Hi,
During self-review, I noticed that some variable types related to the
'min_apply_delay' value needed to change, so I have adjusted them.
So, you have changed min_apply_delay from int64 to int32, but you
haven't mentioned the reason for it. We use 'int' for the similar
parameter recovery_min_apply_delay, so it makes sense, but it is still
better to state your reason explicitly.
Few comments
=============
1.
@@ -70,6 +70,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId)
BKI_SHARED_RELATION BKI_ROW
XLogRecPtr subskiplsn; /* All changes finished at this LSN are
* skipped */
+ int32 subminapplydelay; /* Replication apply delay (ms) */
+
NameData subname; /* Name of the subscription */
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
Why are you placing this after subskiplsn? Earlier it was okay because
we wanted the 64-bit value to be aligned, but now isn't it better to
keep it after subowner?
2.
+
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(),
+ TimestampTzPlusMilliseconds(finish_ts, MySubscription->minapplydelay));
The above code appears a bit unreadable. Can we store the result of
TimestampTzPlusMilliseconds() in a separate variable say "TimestampTz
delayUntil;"?
--
With Regards,
Amit Kapila.
Hi,
On Friday, January 27, 2023 8:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Jan 27, 2023 at 1:39 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
On Wednesday, January 25, 2023 11:24 PM I wrote:
Attached the updated v22.
Hi,
During self-review, I noticed that some variable types related to the
'min_apply_delay' value needed to change, so I have adjusted them.
So, you have changed min_apply_delay from int64 to int32, but you haven't
mentioned the reason for it. We use 'int' for the similar parameter
recovery_min_apply_delay, so it makes sense, but it is still better to state
your reason explicitly.
Yes. I changed it to make this feature consistent with recovery_min_apply_delay.
That parameter accepts values from 0 to INT_MAX, so min_apply_delay now
handles the same range.
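To make that range concrete, here is a minimal standalone sketch (not code from the patch) of what a 0 to INT_MAX millisecond range implies. The helper names min_apply_delay_in_range and max_min_apply_delay_days are hypothetical and exist only for illustration; the sketch also assumes INT_MAX is 2^31 - 1, as on common platforms.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): report whether a delay given
 * in milliseconds fits the 0 .. INT_MAX range described above.
 */
bool
min_apply_delay_in_range(int64_t delay_ms)
{
	return delay_ms >= 0 && delay_ms <= INT_MAX;
}

/*
 * Hypothetical helper: the largest representable delay, expressed in days.
 * Assuming INT_MAX is 2^31 - 1, this is a little under 25 days.
 */
double
max_min_apply_delay_days(void)
{
	const double ms_per_day = 24.0 * 60.0 * 60.0 * 1000.0;

	return (double) INT_MAX / ms_per_day;
}
```

Under that assumption, a one-day delay (86400000 ms) is well within range, while anything much beyond three weeks would not fit in an int32 millisecond count.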
Few comments
=============
1.
@@ -70,6 +70,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId)
BKI_SHARED_RELATION BKI_ROW
XLogRecPtr subskiplsn; /* All changes finished at this LSN are
* skipped */
+ int32 subminapplydelay; /* Replication apply delay (ms) */
+
NameData subname; /* Name of the subscription */
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
Why are you placing this after subskiplsn? Earlier it was okay because we wanted
the 64-bit value to be aligned, but now isn't it better to keep it after subowner?
Moved it after subowner.
2.
+
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(),
+ TimestampTzPlusMilliseconds(finish_ts,
+ MySubscription->minapplydelay));
The above code appears a bit unreadable. Can we store the result of
TimestampTzPlusMilliseconds() in a separate variable say "TimestampTz
delayUntil;"?
Agreed. Fixed.
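For readers following along, the shape of that refactoring can be sketched as below. This is a standalone illustration, not the code in the attached patch: TimestampUs, ts_plus_ms, ts_diff_ms and remaining_apply_delay_ms are hypothetical stand-ins for the PostgreSQL timestamp type and helpers, and the rounding and clamping behaviour shown is an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a timestamp: microseconds since some epoch (assumption). */
typedef int64_t TimestampUs;

/* Sketch-only helper: add a millisecond interval to a microsecond timestamp. */
TimestampUs
ts_plus_ms(TimestampUs ts, int32_t ms)
{
	return ts + (int64_t) ms * 1000;
}

/*
 * Sketch-only helper: milliseconds from start to stop, rounded up, and
 * clamped to zero when stop is not after start.
 */
int64_t
ts_diff_ms(TimestampUs start, TimestampUs stop)
{
	if (stop <= start)
		return 0;
	return (stop - start + 999) / 1000;
}

/*
 * Remaining wait before a transaction that finished at finish_ts on the
 * publisher may be applied, given the current time on the subscriber.
 */
int64_t
remaining_apply_delay_ms(TimestampUs now, TimestampUs finish_ts,
						 int32_t min_apply_delay_ms)
{
	TimestampUs delay_until;

	assert(min_apply_delay_ms >= 0);

	/* Naming the intermediate value is the readability change requested. */
	delay_until = ts_plus_ms(finish_ts, min_apply_delay_ms);

	return ts_diff_ms(now, delay_until);
}
```

With a 1000 ms delay and a transaction that finished 500 ms ago, the sketch reports 500 ms left to wait; once decoding and transfer overhead already exceed the delay, it reports zero, so no further wait is applied.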
Attached the updated patch v24.
Best Regards,
Takamichi Osumi
Attachments:
v24-0001-Time-delayed-logical-replication-subscriber.patch (application/octet-stream)
From 850ed11254a60c259919ab2eb01eaf1a7f65733d Mon Sep 17 00:00:00 2001
From: Takamichi Osumi <osumi.takamichi@fujitsu.com>
Date: Sat, 28 Jan 2023 03:51:06 +0000
Subject: [PATCH v24] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_apply_delay'.
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction apply for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs before we start to apply the transaction on the
subscriber. The main reason is to avoid keeping a transaction open for
a long time. Regular and prepared transactions are covered. Streamed
transactions are also covered.
The combination of parallel streaming mode and min_apply_delay is not
allowed. This is because we start applying the transaction stream as
soon as the first change arrives without knowing the transaction's
prepare/commit time. This means we cannot calculate the underlying
network/decoding lag between publisher and subscriber, and so always
waiting for the full 'min_apply_delay' period might include
unnecessary delay.
The other possibility is to apply the delay at the end of the parallel
apply transaction but that would cause issues related to resource
bloat and locks being held for a long time.
Note that this feature doesn't interact with skip transaction feature.
The skip transaction feature applies to one transaction with a specific LSN.
So, even if the skipped transaction and non-skipped transaction come
consecutively in a very short time, regardless of the order of which comes
first, the time-delayed feature gets balanced by delayed application
for other transactions before and after the skipped transaction.
Author: Euler Taveira, Takamichi Osumi, Kuroda Hayato
Reviewed-by: Amit Kapila, Peter Smith, Vignesh C, Shveta Malik,
Kyotaro Horiguchi, Dilip Kumar, Melih Mutlu
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/config.sgml | 11 ++
doc/src/sgml/glossary.sgml | 14 ++
doc/src/sgml/logical-replication.sgml | 6 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 48 ++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 117 ++++++++++-
.../replication/logical/applyparallelworker.c | 3 +-
src/backend/replication/logical/worker.c | 166 ++++++++++++++--
src/bin/pg_dump/pg_dump.c | 15 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/replication/worker_internal.h | 2 +-
src/test/regress/expected/subscription.out | 181 ++++++++++-------
src/test/regress/sql/subscription.sql | 24 +++
src/test/subscription/meson.build | 1 +
src/test/subscription/t/032_apply_delay.pl | 187 ++++++++++++++++++
21 files changed, 710 insertions(+), 104 deletions(-)
create mode 100644 src/test/subscription/t/032_apply_delay.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..925a6ebb12 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7873,6 +7873,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum amount of time to delay applying changes, in milliseconds.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index f985afc009..317ebeb38f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4787,6 +4787,17 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
the <filename>postgresql.conf</filename> file or on the server
command line.
</para>
+ <para>
+ For time-delayed logical replication, the apply worker sends a feedback
+ message to the publisher every
+ <varname>wal_receiver_status_interval</varname> milliseconds. Make sure
+ to set <varname>wal_receiver_status_interval</varname> less than the
+ <varname>wal_sender_timeout</varname> on the publisher, otherwise, the
+ <literal>walsender</literal> will repeatedly terminate due to timeout
+ error. Note that if <varname>wal_receiver_status_interval</varname> is
+ set to zero, the apply worker sends no feedback messages during the
+ <literal>min_apply_delay</literal> period.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 7c01a541fe..6ed6fa5853 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1729,6 +1729,20 @@
</glossdef>
</glossentry>
+ <glossentry id="glossary-time-delayed-replication">
+ <glossterm>Time-delayed replication</glossterm>
+ <glossdef>
+ <para>
+ Replication setup that applies a time-delayed copy of the data.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="guc-recovery-min-apply-delay"/> for physical replication
+ and <xref linkend="sql-createsubscription"/> for logical replication.
+ </para>
+ </glossdef>
+ </glossentry>
+
<glossentry id="glossary-toast">
<glossterm>TOAST</glossterm>
<glossdef>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 1bd5660c87..6bd5f61e2b 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,12 @@
target table.
</para>
+ <para>
+ A subscription can delay the application of changes by specifying the
+ <literal>min_apply_delay</literal> subscription parameter. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index ad93553a1d..1c6e9dd2d1 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index eba72c6af6..22b4451d4d 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,48 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. This
+ parameter allows the user to delay the application of changes by a
+ given time period. If the value is specified without units, it is
+ taken as milliseconds. The default is zero (no delay). See
+ <xref linkend="config-setting-names-values"/> for details on the
+ available valid time units.
+ </para>
+ <para>
+ Any delay becomes effective after all initial table synchronization
+ has finished and occurs before each transaction starts to get applied
+ on the subscriber. The delay is calculated as the difference between
+ the WAL timestamp as written on the publisher and the current time on
+ the subscriber. Any overhead of time spent in logical decoding and in
+ transferring the transaction may reduce the actual wait time. It is
+ also possible that the overhead already exceeds the requested
+ <literal>min_apply_delay</literal> value, in which case no delay is
+ applied. If the system clocks on publisher and subscriber are not
+ synchronized, this may lead to apply changes earlier than expected,
+ but this is not a major issue because this parameter is typically
+ much larger than the time deviations between servers. Note that if
+ this parameter is set to a long delay, the replication will stop if
+ the replication slot falls behind the current LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication means there is a much longer time between
+ making a change on the publisher, and that change being committed
+ on the subscriber. This can impact the performance of synchronous
+ replication. See <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -413,6 +454,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_apply_delay</literal> parameter is not allowed when
+ streaming in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..c767cc1c3a 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -64,6 +64,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->oid = subid;
sub->dbid = subform->subdbid;
sub->skiplsn = subform->subskiplsn;
+ sub->minapplydelay = subform->subminapplydelay;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
sub->enabled = subform->subenabled;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 8608e3fa5b..317c2010cb 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1299,9 +1299,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 464db6d247..489eae85ee 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int32 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -100,7 +102,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int defGetMinApplyDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ opts->min_apply_delay = defGetMinApplyDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +417,31 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed. This is because we start applying the transaction stream as
+ * soon as the first change arrives without knowing the transaction's
+ * prepare/commit time. This means we cannot calculate the underlying
+ * network/decoding lag between publisher and subscriber, and so always
+ * waiting for the full 'min_apply_delay' period might include unnecessary
+ * delay.
+ *
+ * The other possibility is to apply the delay at the end of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0 && opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+
+ /*
+ * - translator: the first %s is a string of the form "parameter > 0"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
}
/*
@@ -560,7 +598,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -625,6 +664,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int32GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1054,7 +1094,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1138,19 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options for details of the reason.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
+ if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay > 0) ||
+ (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable parallel streaming mode for subscription with %s",
+ "min_apply_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1164,25 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed.
+ */
+ if (opts.min_apply_delay > 0)
+ if ((IsSet(opts.specified_opts, SUBOPT_STREAMING) && opts.streaming == LOGICALREP_STREAM_PARALLEL) ||
+ (!IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s for subscription in %s mode",
+ "min_apply_delay", "streaming = parallel"));
+
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int32GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2195,3 +2267,42 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+/*
+ * Extract the min_apply_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_apply_delay
+ * accepts the same parameter format as recovery_min_apply_delay.
+ */
+static int
+defGetMinApplyDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse given string as parameter which has millisecond unit
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_apply_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check lower bound. parse_int() has already confirmed that result
+ * is less than or equal to INT_MAX.
+ */
+ if (result < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_apply_delay",
+ 0, INT_MAX)));
+
+ return result;
+}
diff --git a/src/backend/replication/logical/applyparallelworker.c b/src/backend/replication/logical/applyparallelworker.c
index 3579e704fe..7302bce7a0 100644
--- a/src/backend/replication/logical/applyparallelworker.c
+++ b/src/backend/replication/logical/applyparallelworker.c
@@ -704,7 +704,8 @@ pa_process_spooled_messages_if_required(void)
{
apply_spooled_messages(&MyParallelShared->fileset,
MyParallelShared->xid,
- InvalidXLogRecPtr);
+ InvalidXLogRecPtr,
+ 0);
pa_set_fileset_state(MyParallelShared, FS_EMPTY);
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index cfb2ab6248..cd380fb350 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -319,6 +319,17 @@ static List *on_commit_wakeup_workers_subids = NIL;
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
+/*
+ * To avoid a walsender timeout during time-delayed logical replication, the
+ * apply worker keeps sending feedback messages during the delay period.
+ * Because the delay is applied before the start of the transaction, no WAL
+ * records are written for the suspended changes during the wait. Therefore,
+ * when the apply worker sends a feedback message during the delay, the
+ * flushed and applied LSN positions must not be overwritten by the last
+ * received LSN. See send_feedback() for details.
+ */
+static XLogRecPtr last_received = InvalidXLogRecPtr;
+
/* fields valid only when processing streamed transaction */
static bool in_streamed_transaction = false;
@@ -389,7 +400,8 @@ static void stream_write_change(char action, StringInfo s);
static void stream_open_and_write_change(TransactionId xid, char action, StringInfo s);
static void stream_close_file(void);
-static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply);
+static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply,
+ bool has_unprocessed_change);
static void DisableSubscriptionAndExit(void);
@@ -999,6 +1011,109 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When min_apply_delay parameter is set on the subscriber, we wait long enough
+ * to make sure a transaction is applied at least that period behind the
+ * publisher.
+ *
+ * While physical replication applies the delay at commit time, this feature
+ * applies the delay before starting the next transaction. This is mainly
+ * because keeping a transaction that has performed write operations open for
+ * a long time causes problems such as table bloat and long-held
+ * locks.
+ *
+ * The min_apply_delay parameter will take effect only after all tables are in
+ * READY state.
+ *
+ * xid is the ID of the transaction to which the delay is applied.
+ *
+ * finish_ts is the commit/prepare time of both regular (non-streamed) and
+ * streamed transactions. Unlike the regular (non-streamed) cases, the delay
+ * is applied in a STREAM COMMIT/STREAM PREPARE message for streamed
+ * transactions. The STREAM START message does not contain a commit/prepare
+ * time (it will be available when the in-progress transaction finishes).
+ * Hence, it's not appropriate to apply a delay at the STREAM START time.
+ */
+static void
+maybe_apply_delay(TransactionId xid, TimestampTz finish_ts)
+{
+ Assert(finish_ts > 0);
+
+ /* Nothing to do if no delay set */
+ if (!MySubscription->minapplydelay)
+ return;
+
+ /*
+ * The min_apply_delay parameter is ignored until all tablesync workers
+ * have reached READY state. This is because if we allowed the delay
+ * during the catchup phase, then once we reached the limit of tablesync
+ * workers it would impose a delay for each subsequent worker. That would
+ * cause initial table synchronization completion to take a long time.
+ */
+ if (!AllTablesyncsReady())
+ return;
+
+ /* Apply the delay using the latch mechanism */
+ while (true)
+ {
+ TimestampTz delayUntil;
+ long diffms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* This might change wal_receiver_status_interval */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /*
+ * Before calculating the time duration, reload the catalog if needed.
+ */
+ if (!in_remote_transaction && !in_streamed_transaction)
+ {
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+ }
+
+ delayUntil = TimestampTzPlusMilliseconds(finish_ts, MySubscription->minapplydelay);
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delayUntil);
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "time-delayed replication for txid %u, min_apply_delay = %d ms, remaining wait time: %ld ms",
+ xid, MySubscription->minapplydelay, diffms);
+
+ /*
+ * If wal_receiver_status_interval is enabled (greater than zero), wake
+ * up at that interval and call send_feedback() so that the publisher
+ * does not time out during the delay.
+ */
+ if (wal_receiver_status_interval > 0 &&
+ diffms > wal_receiver_status_interval * 1000L)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ wal_receiver_status_interval * 1000L,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(last_received, true, false, true);
+ }
+ else
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ }
+}
+
/*
* Handle BEGIN message.
*/
@@ -1013,6 +1128,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ maybe_apply_delay(begin_data.xid, begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -1070,6 +1188,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ maybe_apply_delay(begin_data.xid, begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1317,7 +1438,8 @@ apply_handle_stream_prepare(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset,
- prepare_data.xid, prepare_data.prepare_lsn);
+ prepare_data.xid, prepare_data.prepare_lsn,
+ prepare_data.prepare_time);
/* Mark the transaction as prepared. */
apply_handle_prepare_internal(&prepare_data);
@@ -2011,10 +2133,13 @@ ensure_last_message(FileSet *stream_fileset, TransactionId xid, int fileno,
/*
* Common spoolfile processing.
+ *
+ * The commit/prepare time (finish_ts) for streamed transactions is required
+ * for time-delayed logical replication.
*/
void
apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn)
+ XLogRecPtr lsn, TimestampTz finish_ts)
{
StringInfoData s2;
int nchanges;
@@ -2025,6 +2150,10 @@ apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
int fileno;
off_t offset;
+ /* Should we delay the current transaction? */
+ if (finish_ts)
+ maybe_apply_delay(xid, finish_ts);
+
if (!am_parallel_apply_worker())
maybe_start_skipping_changes(lsn);
@@ -2174,7 +2303,7 @@ apply_handle_stream_commit(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset, xid,
- commit_data.commit_lsn);
+ commit_data.commit_lsn, commit_data.committime);
apply_handle_commit_internal(&commit_data);
@@ -3447,7 +3576,7 @@ UpdateWorkerStats(XLogRecPtr last_lsn, TimestampTz send_time, bool reply)
* Apply main loop.
*/
static void
-LogicalRepApplyLoop(XLogRecPtr last_received)
+LogicalRepApplyLoop(void)
{
TimestampTz last_recv_timestamp = GetCurrentTimestamp();
bool ping_sent = false;
@@ -3568,7 +3697,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
if (last_received < end_lsn)
last_received = end_lsn;
- send_feedback(last_received, reply_requested, false);
+ send_feedback(last_received, reply_requested, false, false);
UpdateWorkerStats(last_received, timestamp, true);
}
/* other message types are purposefully ignored */
@@ -3581,7 +3710,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
/* confirm all writes so far */
- send_feedback(last_received, false, false);
+ send_feedback(last_received, false, false, false);
if (!in_remote_transaction && !in_streamed_transaction)
{
@@ -3678,7 +3807,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
}
- send_feedback(last_received, requestReply, requestReply);
+ send_feedback(last_received, requestReply, requestReply, false);
/*
* Force reporting to ensure long idle periods don't lead to
@@ -3708,7 +3837,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
* to send a response to avoid timeouts.
*/
static void
-send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
+send_feedback(XLogRecPtr recvpos, bool force, bool requestReply, bool has_unprocessed_change)
{
static StringInfo reply_message = NULL;
static TimestampTz send_time = 0;
@@ -3738,8 +3867,15 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * If the logical replication subscription has unprocessed changes, do
+ * not tell the publisher that the latest received LSN has already been
+ * applied and flushed; otherwise, the publisher would make a wrong
+ * assumption about the logical replication progress. Instead, just
+ * send a feedback message to avoid a replication timeout during the
+ * delay.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && !has_unprocessed_change)
flushpos = writepos = recvpos;
if (writepos < last_writepos)
@@ -3776,8 +3912,9 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
pq_sendint64(reply_message, now); /* sendTime */
pq_sendbyte(reply_message, requestReply); /* replyRequested */
- elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X",
+ elog(DEBUG2, "sending feedback (force %d, has_unprocessed_change %s) to recv %X/%X, write %X/%X, flush %X/%X",
force,
+ has_unprocessed_change ? "yes" : "no",
LSN_FORMAT_ARGS(recvpos),
LSN_FORMAT_ARGS(writepos),
LSN_FORMAT_ARGS(flushpos));
@@ -4367,11 +4504,11 @@ start_table_sync(XLogRecPtr *origin_startpos, char **myslotname)
* of system resource error and are not repeatable.
*/
static void
-start_apply(XLogRecPtr origin_startpos)
+start_apply(void)
{
PG_TRY();
{
- LogicalRepApplyLoop(origin_startpos);
+ LogicalRepApplyLoop();
}
PG_CATCH();
{
@@ -4661,7 +4798,8 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- start_apply(origin_startpos);
+ last_received = origin_startpos;
+ start_apply();
proc_exit(0);
}
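
For readers following the worker.c changes above, the timing arithmetic in maybe_apply_delay() can be sketched in isolation. This is only an illustration under stated assumptions, not code from the patch: remaining_delay_ms() and next_sleep_ms() are hypothetical helper names, timestamps are plain milliseconds rather than TimestampTz, and the latch, interrupt, and catalog-reload handling is left out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): how long is left before a
 * transaction that finished at finish_ms on the publisher may be applied.
 * All values are in milliseconds. A result <= 0 means "apply now".
 */
int64_t
remaining_delay_ms(int64_t finish_ms, int32_t min_apply_delay_ms,
				   int64_t now_ms)
{
	assert(finish_ms > 0);
	return (finish_ms + min_apply_delay_ms) - now_ms;
}

/*
 * Hypothetical helper (not in the patch): choose the timeout for the next
 * sleep. If the status interval is enabled and shorter than the remaining
 * wait, sleep for only one interval and ask the caller to send feedback
 * afterwards; otherwise sleep for the whole remaining wait.
 */
int64_t
next_sleep_ms(int64_t diff_ms, int status_interval_s, int *send_feedback)
{
	int64_t		interval_ms = (int64_t) status_interval_s * 1000;

	*send_feedback = (status_interval_s > 0 && diff_ms > interval_ms);
	return *send_feedback ? interval_ms : diff_ms;
}
```

With a 10 second status interval and 25 seconds still to wait, the sketch sleeps for 10 seconds and requests a feedback message. Once less than one interval remains, it sleeps for exactly the remaining time and sends nothing extra, which mirrors the two WaitLatch() branches in the patch.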
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..1e87f0124e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4546,9 +4547,13 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n"
+ " 0 AS subminapplydelay\n",
+ LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4581,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4612,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminapplydelay =
+ atoi(PQgetvalue(res, i, i_subminapplydelay));
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4687,6 +4695,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '%d ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e7cbd8d7ed..b8831c3ed3 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int subminapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..81d4607a1c 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min apply delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e1882eaea..e8b9a43a47 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..b8fe47ef6e 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -74,6 +74,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
+ int32 subminapplydelay; /* Replication apply delay (ms) */
+
bool subenabled; /* True if the subscription is enabled (the
* worker should be running) */
@@ -120,6 +122,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int32 minapplydelay; /* Replication apply delay (ms) */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
bool enabled; /* Indicates if the subscription is enabled */
diff --git a/src/include/replication/worker_internal.h b/src/include/replication/worker_internal.h
index dc87a4edd1..3dc09d1a4c 100644
--- a/src/include/replication/worker_internal.h
+++ b/src/include/replication/worker_internal.h
@@ -255,7 +255,7 @@ extern void stream_stop_internal(TransactionId xid);
/* Common streaming function to apply all the spooled messages */
extern void apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn);
+ XLogRecPtr lsn, TimestampTz finish_ts);
extern void apply_dispatch(StringInfo s);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 3f99b14394..5ccce39986 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -143,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -163,10 +163,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -175,10 +175,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -210,10 +210,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -247,19 +247,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -271,27 +271,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -306,10 +306,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -324,10 +324,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -363,10 +363,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -375,10 +375,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,10 +388,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -404,20 +404,57 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid value for parameter "min_apply_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_apply_delay" (0 .. 2147483647)
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+ERROR: min_apply_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot enable parallel streaming mode for subscription with min_apply_delay
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ERROR: cannot enable min_apply_delay for subscription in streaming = parallel mode
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7281f5fee2..7317b140f5 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -286,6 +286,30 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+\dRs+
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/meson.build b/src/test/subscription/meson.build
index 3db0fdfd96..a186876eb4 100644
--- a/src/test/subscription/meson.build
+++ b/src/test/subscription/meson.build
@@ -38,6 +38,7 @@ tests += {
't/029_on_error.pl',
't/030_origin.pl',
't/031_column_list.pl',
+ 't/032_apply_delay.pl',
't/100_bugs.pl',
],
},
diff --git a/src/test/subscription/t/032_apply_delay.pl b/src/test/subscription/t/032_apply_delay.pl
new file mode 100644
index 0000000000..2d63919525
--- /dev/null
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,187 @@
+
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test replication apply delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Confirm the time-delayed replication has been effective from the server log
+# message where the apply worker emits for applying delay. Moreover, verify
+# that the current worker's remaining wait time is sufficiently bigger than the
+# expected value, in order to check any update of the min_apply_delay.
+sub check_apply_delay_log
+{
+ my ($node_subscriber, $offset, $expected) = @_;
+
+ my $log_location = $node_subscriber->wait_for_log(
+ qr/time-delayed replication for txid (\d+), min_apply_delay = (\d+) ms, remaining wait time: (\d+) ms/,
+ $offset);
+
+ cmp_ok($log_location, '>', $offset,
+ "logfile contains triggered logical replication apply delay");
+
+ # Get the remaining wait time from the server log
+ my $contents = slurp_file($node_subscriber->logfile, $offset);
+ $contents =~
+ qr/time-delayed replication for txid (\d+), min_apply_delay = (\d+) ms, remaining wait time: (\d+) ms/,
+ or die "could not get the apply worker wait time";
+ my $logged_delay = $3;
+
+ # Is it larger than expected?
+ cmp_ok($logged_delay, '>', $expected,
+ "The apply worker wait time has expected duration");
+}
+
+# Compare inserted time on the publisher with applied time on the subscriber to
+# confirm the latter is applied after expected time. The time is automatically
+# generated and stored in the table column 'c'.
+sub check_apply_delay_time
+{
+ my ($node_publisher, $node_subscriber, $primary_key, $expected_diffs) =
+ @_;
+
+ my $inserted_time_on_pub = $node_publisher->safe_psql(
+ 'postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ my $inserted_time_on_sub = $node_subscriber->safe_psql(
+ 'postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ cmp_ok(
+ $inserted_time_on_sub - $inserted_time_on_pub,
+ '>',
+ $expected_diffs,
+ "The tuple on the subscriber was modified later than the publisher");
+}
+
+# Create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf('postgresql.conf', "log_min_messages = debug2");
+$node_subscriber->start;
+
+# Setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b varchar, c timestamptz (6) DEFAULT now())"
+);
+
+# Setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b text, c timestamptz (6) DEFAULT now())"
+);
+
+# Setup logical replication
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+
+# The column 'c' must not be published because we want to compare the time
+# difference.
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE test_tab (a, b)");
+
+my $appname = 'tap_sub';
+
+# Create a subscription that applies the transaction after 1 second delay
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (copy_data = off, min_apply_delay = '1s', streaming = 'on')"
+);
+
+# Check log starting now for logical replication apply delay
+my $offset = -s $node_subscriber->logfile;
+
+# New row to trigger apply delay
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (1, 'foo')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (2, 'bar')");
+
+$node_publisher->wait_for_catchup($appname);
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(2|1|2), 'check if the new rows were applied to subscriber');
+
+# Make sure the apply worker knows to wait for more than 500ms
+check_apply_delay_log($node_subscriber, $offset, "0.5");
+
+# Verify that the subscriber lags the publisher by at least 1 second
+check_apply_delay_time($node_publisher, $node_subscriber, '2', '1');
+
+# Setup for streaming case
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_mode = immediate');
+$node_publisher->reload;
+
+# Run a query to make sure that the reload has taken effect.
+$node_publisher->safe_psql('postgres', q{SELECT 1});
+
+# Check log starting now for logical replication apply delay
+$offset = -s $node_subscriber->logfile;
+
+# Test streamed transaction by insert
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(3, 5) s(i);"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(5|1|5), 'check if the new rows were applied to subscriber');
+
+# Make sure the apply worker knows to wait for more than 500ms
+check_apply_delay_log($node_subscriber, $offset, "0.5");
+
+# Verify that the subscriber lags the publisher by at least 1 second
+check_apply_delay_time($node_publisher, $node_subscriber, '5', '1');
+
+# Test whether ALTER SUBSCRIPTION changes the delayed time of the apply worker
+# (1 day 5 minutes). Note that the extra 5 minute is to account for any
+# decoding/network overhead.
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86700000)");
+
+# Check log starting now for logical replication apply delay
+$offset = -s $node_subscriber->logfile;
+
+# New row to trigger apply delay
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (0, 'foobar')");
+
+# Make sure the apply worker knows to wait for more than 1 day
+check_apply_delay_log($node_subscriber, $offset, "86400000");
+
+# Disable subscription and the worker should die immediately
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub DISABLE;");
+
+# Wait until worker dies
+my $sub_query =
+ "SELECT count(1) = 0 FROM pg_stat_subscription WHERE subname = 'tap_sub' AND pid IS NOT NULL;";
+$node_subscriber->poll_query_until('postgres', $sub_query)
+ or die "Timed out while waiting for subscriber to die";
+
+# Confirm disabling the subscription by ALTER SUBSCRIPTION DISABLE did not cause
+# the delayed transaction to be applied.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab WHERE a = 0;");
+is($result, qq(0), "check the delayed transaction was not applied");
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
--
2.30.0
At Sat, 28 Jan 2023 04:28:29 +0000, "Takamichi Osumi (Fujitsu)" <osumi.takamichi@fujitsu.com> wrote in
On Friday, January 27, 2023 8:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
So, you have changed min_apply_delay from int64 to int32, but you haven't
mentioned the reason for the same? We use 'int' for the similar parameter
recovery_min_apply_delay, so, ideally, it makes sense but still better to tell your
reason explicitly.
Yes. It's because I thought I needed to make this feature consistent with recovery_min_apply_delay.
This feature now handles the same range as recovery_min_apply_delay, from 0 to INT_MAX,
so it should be adjusted to match it.
INT_MAX can stick out of int32 on some platforms. (I'm not sure where
that actually happens, though.) We can use PG_INT32_MAX instead.
IMHO, I think we don't use int as a catalog column and I agree that
int32 is sufficient since I don't think more than 49 days delay is
practical. On the other hand, maybe I wouldn't want to use int32 for
intermediate calculations.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
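For reference, the range under discussion can be checked with a small C sketch. It is not part of the patch: the helper name max_delay_days is made up for illustration, and it assumes the delay is counted in milliseconds, as the regression test comments in the patch state. Under that assumption a signed 32-bit counter tops out just under 25 days, while roughly 49 days corresponds to the unsigned 32-bit range.

```c
#include <assert.h>
#include <stdint.h>

/* Milliseconds in one day. */
#define MS_PER_DAY INT64_C(86400000)

/*
 * Hypothetical helper (not from the patch): number of whole days that a
 * millisecond counter with the given maximum value can represent.
 */
static int64_t
max_delay_days(int64_t max_ms)
{
    assert(max_ms >= 0);
    return max_ms / MS_PER_DAY;
}
```

Calling max_delay_days(INT32_MAX) gives 24 whole days, and max_delay_days(UINT32_MAX) gives 49.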
On Mon, Jan 30, 2023 at 8:32 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
INT_MAX can stick out of int32 on some platforms. (I'm not sure where
that actually happens, though.) We can use PG_INT32_MAX instead.
But in other integer GUCs including recovery_min_apply_delay, we use
INT_MAX, so not sure if it is a good idea to do something different
here.
--
With Regards,
Amit Kapila.
At Mon, 30 Jan 2023 08:51:05 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
But in other integer GUCs including recovery_min_apply_delay, we use
INT_MAX, so not sure if it is a good idea to do something different
here.
The GUC is not stored in a catalog, but.. oh... it is multiplied by
1000. So if it is larger than (INT_MAX / 1000), it overflows... If we
officially accept that (I don't think great) behavior (even only for
impractical values), I don't object further.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Mon, Jan 30, 2023 at 9:43 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
The GUC is not stored in a catalog, but.. oh... it is multiplied by
1000.
Which part of the patch you are referring to here? Isn't the check in
the function defGetMinApplyDelay() sufficient to ensure that the
'delay' value stored in the catalog will always be lesser than
INT_MAX?
--
With Regards,
Amit Kapila.
At Mon, 30 Jan 2023 11:56:33 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
The GUC is not stored in a catalog, but.. oh... it is multiplied by
1000.
Which part of the patch you are referring to here? Isn't the check in
Where recovery_min_apply_delay is used. It is allowed to be set up to
INT_MAX but it is used as:
delayUntil = TimestampTzPlusMilliseconds(xtime, recovery_min_apply_delay);
Where the macro is defined as:
#define TimestampTzPlusMilliseconds(tz,ms) ((tz) + ((ms) * (int64) 1000))
Which can lead to overflow, which is practically harmless.
the function defGetMinApplyDelay() sufficient to ensure that the
'delay' value stored in the catalog will always be lesser than
INT_MAX?
I'm concerned about cases where INT_MAX is wider than int32. If we
don't assume such cases, I'm fine with INT_MAX there.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Mon, Jan 30, 2023 at 12:38 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Mon, 30 Jan 2023 11:56:33 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
The GUC is not stored in a catalog, but.. oh... it is multiplied by
1000.
Which part of the patch you are referring to here? Isn't the check in
Where recovery_min_apply_delay is used. It is allowed to be set up to
INT_MAX but it is used as:
delayUntil = TimestampTzPlusMilliseconds(xtime, recovery_min_apply_delay);
Where the macro is defined as:
#define TimestampTzPlusMilliseconds(tz,ms) ((tz) + ((ms) * (int64) 1000))
Which can lead to overflow, which is practically harmless.
But here tz is always TimestampTz (which is int64), so do we need to worry?
the function defGetMinApplyDelay() sufficient to ensure that the
'delay' value stored in the catalog will always be lesser than
INT_MAX?
I'm concerned about cases where INT_MAX is wider than int32. If we
don't assume such cases, I'm fine with INT_MAX there.
I am not aware of such cases. Anyway, if any such case is discovered
then we need to change the checks in defGetMinApplyDelay(), right? If
so, then I think it is better to keep it as it is unless we know that
this could be an issue on some platform.
--
With Regards,
Amit Kapila.
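The exchange above turns on where the multiplication by 1000 happens. Below is a minimal C sketch, assuming TimestampTz is a 64-bit microsecond count as noted in the reply, and using a hypothetical wrapper delay_until that does not appear in the patch. Because the quoted macro casts 1000 to int64 before multiplying, the int delay is promoted to 64 bits first, so even INT_MAX milliseconds should convert without wrapping on platforms where int is 32 bits.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Assumed to match PostgreSQL's definitions: 64-bit microsecond timestamps. */
typedef int64_t int64;
typedef int64 TimestampTz;

/* Macro as quoted in the thread. */
#define TimestampTzPlusMilliseconds(tz,ms) ((tz) + ((ms) * (int64) 1000))

/*
 * Hypothetical wrapper (not from the patch): the time until which apply
 * should be delayed, given a start time in microseconds and a delay in
 * milliseconds held in a plain int.
 */
static TimestampTz
delay_until(TimestampTz xtime, int delay_ms)
{
    assert(delay_ms >= 0);
    /* delay_ms is converted to int64 before the multiply, so no int overflow */
    return TimestampTzPlusMilliseconds(xtime, delay_ms);
}
```

With 32-bit int, delay_until(0, INT_MAX) is 2147483647000 microseconds, well inside the int64 range. The addition could only overflow for timestamps close to the int64 limit, which is far beyond realistic values.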
On Saturday, January 28, 2023 1:28 PM I wrote:
Attached the updated patch v24.
Hi,
I've rebased the patch on top of commit 1e8b61735c, renaming the GUC to
logical_replication_mode accordingly, because it's used in the TAP test
of this time-delayed LR feature.
There is no other change for this version.
Kindly have a look at the attached v25.
Best Regards,
Takamichi Osumi
Attachments:
v25-0001-Time-delayed-logical-replication-subscriber.patch
From 3d0ce918adacaa86de9d4476544a19f83f9cabc3 Mon Sep 17 00:00:00 2001
From: Takamichi Osumi <osumi.takamichi@fujitsu.com>
Date: Mon, 30 Jan 2023 09:29:41 +0000
Subject: [PATCH v25] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_apply_delay'.
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction apply for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs before we start to apply the transaction on the
subscriber. The main reason is to avoid keeping a transaction open for
a long time. Regular and prepared transactions are covered. Streamed
transactions are also covered.
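
The calculation described above can be sketched roughly as follows. This is only an illustration, not the code in worker.c: the function remaining_delay_ms and its millisecond-based parameters are assumptions made for the example. It takes the transaction's timestamp from the publisher, adds min_apply_delay, and compares the result with the subscriber's current time.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): how many milliseconds the apply
 * worker would still have to wait before starting to apply a transaction.
 * All times are in milliseconds since an arbitrary common epoch.
 */
static int64_t
remaining_delay_ms(int64_t commit_time_ms, int64_t now_ms,
                   int32_t min_apply_delay_ms)
{
    int64_t delay_until;

    assert(min_apply_delay_ms >= 0);

    /* Earliest time at which the transaction may be applied. */
    delay_until = commit_time_ms + (int64_t) min_apply_delay_ms;

    /* Time already spent on decoding/network counts toward the delay. */
    return (delay_until > now_ms) ? delay_until - now_ms : 0;
}
```

For example, with a 1000 ms delay and a transaction that reaches the subscriber 200 ms after its commit time, the sketch reports 800 ms left to wait; once the delay has fully elapsed it reports 0.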
The combination of parallel streaming mode and min_apply_delay is not
allowed. This is because we start applying the transaction stream as
soon as the first change arrives without knowing the transaction's
prepare/commit time. This means we cannot calculate the underlying
network/decoding lag between publisher and subscriber, and so always
waiting for the full 'min_apply_delay' period might include
unnecessary delay.
The other possibility is to apply the delay at the end of the parallel
apply transaction but that would cause issues related to resource
bloat and locks being held for a long time.
Note that this feature doesn't interact with skip transaction feature.
The skip transaction feature applies to one transaction with a specific LSN.
So, even if the skipped transaction and non-skipped transaction come
consecutively in a very short time, regardless of the order of which comes
first, the time-delayed feature gets balanced by delayed application
for other transactions before and after the skipped transaction.
Author: Euler Taveira, Takamichi Osumi, Kuroda Hayato
Reviewed-by: Amit Kapila, Peter Smith, Vignesh C, Shveta Malik,
Kyotaro Horiguchi, Dilip Kumar, Melih Mutlu
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/config.sgml | 11 ++
doc/src/sgml/glossary.sgml | 14 ++
doc/src/sgml/logical-replication.sgml | 6 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 48 ++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 117 ++++++++++-
.../replication/logical/applyparallelworker.c | 3 +-
src/backend/replication/logical/worker.c | 166 ++++++++++++++--
src/bin/pg_dump/pg_dump.c | 15 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/replication/worker_internal.h | 2 +-
src/test/regress/expected/subscription.out | 181 ++++++++++-------
src/test/regress/sql/subscription.sql | 24 +++
src/test/subscription/meson.build | 1 +
src/test/subscription/t/032_apply_delay.pl | 187 ++++++++++++++++++
21 files changed, 710 insertions(+), 104 deletions(-)
create mode 100644 src/test/subscription/t/032_apply_delay.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..925a6ebb12 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7873,6 +7873,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum amount of time to delay applying changes, in milliseconds.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 1cf53c74ea..7bb0be09b2 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4787,6 +4787,17 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
the <filename>postgresql.conf</filename> file or on the server
command line.
</para>
+ <para>
+ For time-delayed logical replication, the apply worker sends a feedback
+ message to the publisher every
+ <varname>wal_receiver_status_interval</varname> seconds. Make sure
+ to set <varname>wal_receiver_status_interval</varname> less than the
+ <varname>wal_sender_timeout</varname> on the publisher; otherwise, the
+ <literal>walsender</literal> will repeatedly terminate due to a timeout
+ error. Note that if <varname>wal_receiver_status_interval</varname> is
+ set to zero, the apply worker sends no feedback messages during the
+ <literal>min_apply_delay</literal> period.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 7c01a541fe..6ed6fa5853 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1729,6 +1729,20 @@
</glossdef>
</glossentry>
+ <glossentry id="glossary-time-delayed-replication">
+ <glossterm>Time-delayed replication</glossterm>
+ <glossdef>
+ <para>
+ A replication setup that applies a time-delayed copy of the data.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="guc-recovery-min-apply-delay"/> for physical replication
+ and <xref linkend="sql-createsubscription"/> for logical replication.
+ </para>
+ </glossdef>
+ </glossentry>
+
<glossentry id="glossary-toast">
<glossterm>TOAST</glossterm>
<glossdef>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 1bd5660c87..6bd5f61e2b 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,12 @@
target table.
</para>
+ <para>
+ A subscription can delay the application of changes by specifying the
+ <literal>min_apply_delay</literal> subscription parameter. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index ad93553a1d..1c6e9dd2d1 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index eba72c6af6..22b4451d4d 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,48 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. This
+ parameter allows the user to delay the application of changes by a
+ given time period. If the value is specified without units, it is
+ taken as milliseconds. The default is zero (no delay). See
+ <xref linkend="config-setting-names-values"/> for details on the
+ available valid time units.
+ </para>
+ <para>
+ Any delay becomes effective after all initial table synchronization
+ has finished and occurs before each transaction starts to get applied
+ on the subscriber. The delay is calculated as the difference between
+ the WAL timestamp as written on the publisher and the current time on
+ the subscriber. Any overhead of time spent in logical decoding and in
+ transferring the transaction may reduce the actual wait time. It is
+ also possible that the overhead already exceeds the requested
+ <literal>min_apply_delay</literal> value, in which case no delay is
+ applied. If the system clocks on publisher and subscriber are not
+ synchronized, changes may be applied earlier than expected,
+ but this is not a major issue because this parameter is typically
+ much larger than the time deviations between servers. Note that if
+ this parameter is set to a long delay, the replication will stop if
+ the replication slot falls behind the current LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication means there is a much longer time between
+ making a change on the publisher, and that change being committed
+ on the subscriber. This can impact the performance of synchronous
+ replication. See the <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -413,6 +454,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_apply_delay</literal> parameter is not allowed when
+ streaming in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..c767cc1c3a 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -64,6 +64,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->oid = subid;
sub->dbid = subform->subdbid;
sub->skiplsn = subform->subskiplsn;
+ sub->minapplydelay = subform->subminapplydelay;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
sub->enabled = subform->subenabled;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 8608e3fa5b..317c2010cb 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1299,9 +1299,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 464db6d247..489eae85ee 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int32 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -100,7 +102,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int defGetMinApplyDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ opts->min_apply_delay = defGetMinApplyDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +417,31 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed. This is because we start applying the transaction stream as
+ * soon as the first change arrives without knowing the transaction's
+ * prepare/commit time. This means we cannot calculate the underlying
+ * network/decoding lag between publisher and subscriber, and so always
+ * waiting for the full 'min_apply_delay' period might include unnecessary
+ * delay.
+ *
+ * The other possibility is to apply the delay at the end of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0 && opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+
+ /*
+ * - translator: the first %s is a string of the form "parameter > 0"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
}
/*
@@ -560,7 +598,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -625,6 +664,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int32GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1054,7 +1094,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1138,19 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options for details of the reason.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
+ if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay > 0) ||
+ (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable parallel streaming mode for subscription with %s",
+ "min_apply_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1164,25 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed.
+ */
+ if (opts.min_apply_delay > 0)
+ if ((IsSet(opts.specified_opts, SUBOPT_STREAMING) && opts.streaming == LOGICALREP_STREAM_PARALLEL) ||
+ (!IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s for subscription in %s mode",
+ "min_apply_delay", "streaming = parallel"));
+
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int32GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2195,3 +2267,42 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+/*
+ * Extract the min_apply_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_apply_delay
+ * accepts the same parameter format as recovery_min_apply_delay.
+ */
+static int
+defGetMinApplyDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse the given string as a parameter with millisecond units
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_apply_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check lower bound. parse_int() has already confirmed that result
+ * is less than or equal to INT_MAX.
+ */
+ if (result < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_apply_delay",
+ 0, INT_MAX)));
+
+ return result;
+}
diff --git a/src/backend/replication/logical/applyparallelworker.c b/src/backend/replication/logical/applyparallelworker.c
index 3579e704fe..7302bce7a0 100644
--- a/src/backend/replication/logical/applyparallelworker.c
+++ b/src/backend/replication/logical/applyparallelworker.c
@@ -704,7 +704,8 @@ pa_process_spooled_messages_if_required(void)
{
apply_spooled_messages(&MyParallelShared->fileset,
MyParallelShared->xid,
- InvalidXLogRecPtr);
+ InvalidXLogRecPtr,
+ 0);
pa_set_fileset_state(MyParallelShared, FS_EMPTY);
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index cfb2ab6248..cd380fb350 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -319,6 +319,17 @@ static List *on_commit_wakeup_workers_subids = NIL;
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
+/*
+ * In order to avoid walsender timeout for time-delayed logical replication the
+ * apply worker keeps sending feedback messages during the delay period.
+ * Meanwhile, the feature delays the apply before the start of the
+ * transaction and thus we don't write WAL records for the suspended changes
+ * during the wait. When the apply worker sends a feedback message during the
+ * delay, we must not overwrite the flushed and applied LSN positions with the
+ * latest received LSN. See send_feedback() for details.
+ */
+static XLogRecPtr last_received = InvalidXLogRecPtr;
+
/* fields valid only when processing streamed transaction */
static bool in_streamed_transaction = false;
@@ -389,7 +400,8 @@ static void stream_write_change(char action, StringInfo s);
static void stream_open_and_write_change(TransactionId xid, char action, StringInfo s);
static void stream_close_file(void);
-static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply);
+static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply,
+ bool has_unprocessed_change);
static void DisableSubscriptionAndExit(void);
@@ -999,6 +1011,109 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When min_apply_delay parameter is set on the subscriber, we wait long enough
+ * to make sure a transaction is applied at least that period behind the
+ * publisher.
+ *
+ * While physical replication applies the delay at commit time, this
+ * feature applies the delay for the next transaction but before starting the
+ * transaction. This is mainly because keeping a transaction that conducted
+ * write operations open for a long time results in some issues such as bloat
+ * and locks.
+ *
+ * The min_apply_delay parameter will take effect only after all tables are in
+ * READY state.
+ *
+ * xid is the transaction id where we apply the delay.
+ *
+ * finish_ts is the commit/prepare time of both regular (non-streamed) and
+ * streamed transactions. Unlike the regular (non-streamed) cases, the delay
+ * is applied in a STREAM COMMIT/STREAM PREPARE message for streamed
+ * transactions. The STREAM START message does not contain a commit/prepare
+ * time (it will be available when the in-progress transaction finishes).
+ * Hence, it's not appropriate to apply a delay at the STREAM START time.
+ */
+static void
+maybe_apply_delay(TransactionId xid, TimestampTz finish_ts)
+{
+ Assert(finish_ts > 0);
+
+ /* Nothing to do if no delay set */
+ if (!MySubscription->minapplydelay)
+ return;
+
+ /*
+ * The min_apply_delay parameter is ignored until all tablesync workers
+ * have reached READY state. This is because if we allowed the delay
+ * during the catchup phase, then once we reached the limit of tablesync
+ * workers it would impose a delay for each subsequent worker. That would
+ * cause initial table synchronization completion to take a long time.
+ */
+ if (!AllTablesyncsReady())
+ return;
+
+ /* Apply the delay by the latch mechanism */
+ while (true)
+ {
+ TimestampTz delayUntil;
+ long diffms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* This might change wal_receiver_status_interval */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /*
+ * Before calculating the time duration, reload the catalog if needed.
+ */
+ if (!in_remote_transaction && !in_streamed_transaction)
+ {
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+ }
+
+ delayUntil = TimestampTzPlusMilliseconds(finish_ts, MySubscription->minapplydelay);
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delayUntil);
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "time-delayed replication for txid %u, min_apply_delay = %d ms, remaining wait time: %ld ms",
+ xid, MySubscription->minapplydelay, diffms);
+
+ /*
+ * Call send_feedback() to prevent the publisher from exiting by
+ * timeout during the delay, when wal_receiver_status_interval is
+ * available.
+ */
+ if (wal_receiver_status_interval > 0 &&
+ diffms > wal_receiver_status_interval * 1000L)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ wal_receiver_status_interval * 1000L,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(last_received, true, false, true);
+ }
+ else
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ }
+}
+
/*
* Handle BEGIN message.
*/
@@ -1013,6 +1128,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ maybe_apply_delay(begin_data.xid, begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -1070,6 +1188,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ maybe_apply_delay(begin_data.xid, begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1317,7 +1438,8 @@ apply_handle_stream_prepare(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset,
- prepare_data.xid, prepare_data.prepare_lsn);
+ prepare_data.xid, prepare_data.prepare_lsn,
+ prepare_data.prepare_time);
/* Mark the transaction as prepared. */
apply_handle_prepare_internal(&prepare_data);
@@ -2011,10 +2133,13 @@ ensure_last_message(FileSet *stream_fileset, TransactionId xid, int fileno,
/*
* Common spoolfile processing.
+ *
+ * The commit/prepare time (finish_ts) for streamed transactions is required
+ * for time-delayed logical replication.
*/
void
apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn)
+ XLogRecPtr lsn, TimestampTz finish_ts)
{
StringInfoData s2;
int nchanges;
@@ -2025,6 +2150,10 @@ apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
int fileno;
off_t offset;
+ /* Should we delay the current transaction? */
+ if (finish_ts)
+ maybe_apply_delay(xid, finish_ts);
+
if (!am_parallel_apply_worker())
maybe_start_skipping_changes(lsn);
@@ -2174,7 +2303,7 @@ apply_handle_stream_commit(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset, xid,
- commit_data.commit_lsn);
+ commit_data.commit_lsn, commit_data.committime);
apply_handle_commit_internal(&commit_data);
@@ -3447,7 +3576,7 @@ UpdateWorkerStats(XLogRecPtr last_lsn, TimestampTz send_time, bool reply)
* Apply main loop.
*/
static void
-LogicalRepApplyLoop(XLogRecPtr last_received)
+LogicalRepApplyLoop(void)
{
TimestampTz last_recv_timestamp = GetCurrentTimestamp();
bool ping_sent = false;
@@ -3568,7 +3697,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
if (last_received < end_lsn)
last_received = end_lsn;
- send_feedback(last_received, reply_requested, false);
+ send_feedback(last_received, reply_requested, false, false);
UpdateWorkerStats(last_received, timestamp, true);
}
/* other message types are purposefully ignored */
@@ -3581,7 +3710,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
/* confirm all writes so far */
- send_feedback(last_received, false, false);
+ send_feedback(last_received, false, false, false);
if (!in_remote_transaction && !in_streamed_transaction)
{
@@ -3678,7 +3807,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
}
- send_feedback(last_received, requestReply, requestReply);
+ send_feedback(last_received, requestReply, requestReply, false);
/*
* Force reporting to ensure long idle periods don't lead to
@@ -3708,7 +3837,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
* to send a response to avoid timeouts.
*/
static void
-send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
+send_feedback(XLogRecPtr recvpos, bool force, bool requestReply, bool has_unprocessed_change)
{
static StringInfo reply_message = NULL;
static TimestampTz send_time = 0;
@@ -3738,8 +3867,15 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * If the logical replication subscription has unprocessed changes then do
+ * not inform the publisher that the received latest LSN is already
+ * applied and flushed, otherwise, the publisher will make a wrong
+ * assumption about the logical replication progress. Instead, it just
+ * sends a feedback message to avoid a replication timeout during the
+ * delay.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && !has_unprocessed_change)
flushpos = writepos = recvpos;
if (writepos < last_writepos)
@@ -3776,8 +3912,9 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
pq_sendint64(reply_message, now); /* sendTime */
pq_sendbyte(reply_message, requestReply); /* replyRequested */
- elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X",
+ elog(DEBUG2, "sending feedback (force %d, has_unprocessed_change %s) to recv %X/%X, write %X/%X, flush %X/%X",
force,
+ has_unprocessed_change ? "yes" : "no",
LSN_FORMAT_ARGS(recvpos),
LSN_FORMAT_ARGS(writepos),
LSN_FORMAT_ARGS(flushpos));
@@ -4367,11 +4504,11 @@ start_table_sync(XLogRecPtr *origin_startpos, char **myslotname)
* of system resource error and are not repeatable.
*/
static void
-start_apply(XLogRecPtr origin_startpos)
+start_apply(void)
{
PG_TRY();
{
- LogicalRepApplyLoop(origin_startpos);
+ LogicalRepApplyLoop();
}
PG_CATCH();
{
@@ -4661,7 +4798,8 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- start_apply(origin_startpos);
+ last_received = origin_startpos;
+ start_apply();
proc_exit(0);
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..1e87f0124e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4546,9 +4547,13 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n"
+ " 0 AS subminapplydelay\n",
+ LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4581,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4612,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminapplydelay =
+ atoi(PQgetvalue(res, i, i_subminapplydelay));
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4687,6 +4695,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '%d ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e7cbd8d7ed..b8831c3ed3 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int subminapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..81d4607a1c 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min apply delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e1882eaea..e8b9a43a47 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..b8fe47ef6e 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -74,6 +74,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
+ int32 subminapplydelay; /* Replication apply delay (ms) */
+
bool subenabled; /* True if the subscription is enabled (the
* worker should be running) */
@@ -120,6 +122,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int32 minapplydelay; /* Replication apply delay (ms) */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
bool enabled; /* Indicates if the subscription is enabled */
diff --git a/src/include/replication/worker_internal.h b/src/include/replication/worker_internal.h
index dc87a4edd1..3dc09d1a4c 100644
--- a/src/include/replication/worker_internal.h
+++ b/src/include/replication/worker_internal.h
@@ -255,7 +255,7 @@ extern void stream_stop_internal(TransactionId xid);
/* Common streaming function to apply all the spooled messages */
extern void apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn);
+ XLogRecPtr lsn, TimestampTz finish_ts);
extern void apply_dispatch(StringInfo s);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 3f99b14394..5ccce39986 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -143,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -163,10 +163,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -175,10 +175,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -210,10 +210,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -247,19 +247,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -271,27 +271,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -306,10 +306,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -324,10 +324,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -363,10 +363,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -375,10 +375,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,10 +388,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -404,20 +404,57 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid value for parameter "min_apply_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_apply_delay" (0 .. 2147483647)
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+ERROR: min_apply_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot enable parallel streaming mode for subscription with min_apply_delay
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ERROR: cannot enable min_apply_delay for subscription in streaming = parallel mode
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
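The expected output above exercises three rules for min_apply_delay: a bare number is taken as milliseconds, a value with a unit is converted to milliseconds and stored as an integer, and a positive delay cannot be combined with streaming = parallel. A minimal sketch of those rules in C follows. It is not the patch's implementation: the helper names delay_to_ms, delay_in_range and options_compatible are hypothetical, and the accepted unit list (ms, s, min, h, d) is an assumption modelled on the units seen in the tests ('1 d', '1s').

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): parse "<number>[ unit]" and
 * convert it to milliseconds. Returns false if the text is not a number
 * or the unit is not one of the assumed units.
 */
static bool
delay_to_ms(const char *text, int64_t *ms)
{
    char       *end;
    long long   value;
    int64_t     factor;

    errno = 0;
    value = strtoll(text, &end, 10);
    if (end == text || errno == ERANGE)
        return false;               /* e.g. "foo" */

    /* Allow whitespace between the number and the unit, as in '1 d' */
    while (isspace((unsigned char) *end))
        end++;

    if (*end == '\0' || strcmp(end, "ms") == 0)
        factor = 1;                 /* no unit means milliseconds */
    else if (strcmp(end, "s") == 0)
        factor = 1000;
    else if (strcmp(end, "min") == 0)
        factor = 60 * 1000;
    else if (strcmp(end, "h") == 0)
        factor = 60 * 60 * 1000;
    else if (strcmp(end, "d") == 0)
        factor = 24 * 60 * 60 * 1000;
    else
        return false;

    /* Guard against overflow in the multiplication */
    if (value > INT64_MAX / factor || value < INT64_MIN / factor)
        return false;

    *ms = (int64_t) value * factor;
    return true;
}

/* The tests report the valid range as 0 .. 2147483647 (an int32 in ms). */
static bool
delay_in_range(int64_t ms)
{
    return ms >= 0 && ms <= INT32_MAX;
}

/* A positive delay and streaming = parallel are mutually exclusive. */
static bool
options_compatible(int64_t ms, bool streaming_parallel)
{
    return !(ms > 0 && streaming_parallel);
}

/* Mirrors the cases from the regression test above. */
static void
run_examples(void)
{
    int64_t     ms = 0;

    assert(delay_to_ms("123", &ms) && ms == 123);
    assert(delay_to_ms("1 d", &ms) && ms == 86400000);
    assert(!delay_to_ms("foo", &ms));
    assert(delay_to_ms("-1", &ms) && !delay_in_range(ms));
    assert(!options_compatible(123, true));
    assert(options_compatible(0, true));
}
```

Calling run_examples() from a main function checks the same cases as the regression test: 123 stays 123 ms, '1 d' becomes 86400000 ms, 'foo' and -1 are rejected, and a delay of 123 ms conflicts with parallel streaming while a delay of 0 does not.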
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7281f5fee2..7317b140f5 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -286,6 +286,30 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+\dRs+
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/meson.build b/src/test/subscription/meson.build
index 3db0fdfd96..a186876eb4 100644
--- a/src/test/subscription/meson.build
+++ b/src/test/subscription/meson.build
@@ -38,6 +38,7 @@ tests += {
't/029_on_error.pl',
't/030_origin.pl',
't/031_column_list.pl',
+ 't/032_apply_delay.pl',
't/100_bugs.pl',
],
},
diff --git a/src/test/subscription/t/032_apply_delay.pl b/src/test/subscription/t/032_apply_delay.pl
new file mode 100644
index 0000000000..c1b3c7b101
--- /dev/null
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,187 @@
+
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test replication apply delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Confirm from the server log that time-delayed replication took effect, using
+# the message the apply worker emits when it delays applying a transaction.
+# Also verify that the worker's remaining wait time is larger than the
+# expected value, which lets us check that an updated min_apply_delay is used.
+sub check_apply_delay_log
+{
+ my ($node_subscriber, $offset, $expected) = @_;
+
+ my $log_location = $node_subscriber->wait_for_log(
+ qr/time-delayed replication for txid (\d+), min_apply_delay = (\d+) ms, remaining wait time: (\d+) ms/,
+ $offset);
+
+ cmp_ok($log_location, '>', $offset,
+ "logfile contains triggered logical replication apply delay");
+
+ # Get the remaining wait time from the server log
+ my $contents = slurp_file($node_subscriber->logfile, $offset);
+ $contents =~
+ qr/time-delayed replication for txid (\d+), min_apply_delay = (\d+) ms, remaining wait time: (\d+) ms/
+ or die "could not get the apply worker wait time";
+ my $logged_delay = $3;
+
+ # Is it larger than expected?
+ cmp_ok($logged_delay, '>', $expected,
+ "The apply worker wait time has expected duration");
+}
+
+# Compare the insert time on the publisher with the apply time on the
+# subscriber to confirm the row was applied only after the expected delay. The
+# time is generated automatically and stored in the table column 'c'.
+sub check_apply_delay_time
+{
+ my ($node_publisher, $node_subscriber, $primary_key, $expected_diffs) =
+ @_;
+
+ my $inserted_time_on_pub = $node_publisher->safe_psql(
+ 'postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ my $inserted_time_on_sub = $node_subscriber->safe_psql(
+ 'postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ cmp_ok(
+ $inserted_time_on_sub - $inserted_time_on_pub,
+ '>',
+ $expected_diffs,
+ "The tuple on the subscriber was modified later than the publisher");
+}
+
+# Create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf('postgresql.conf', "log_min_messages = debug2");
+$node_subscriber->start;
+
+# Setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b varchar, c timestamptz (6) DEFAULT now())"
+);
+
+# Setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b text, c timestamptz (6) DEFAULT now())"
+);
+
+# Setup logical replication
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+
+# The column 'c' must not be published because we want to compare the time
+# difference.
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE test_tab (a, b)");
+
+my $appname = 'tap_sub';
+
+# Create a subscription that applies the transaction after 1 second delay
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (copy_data = off, min_apply_delay = '1s', streaming = 'on')"
+);
+
+# Check log starting now for logical replication apply delay
+my $offset = -s $node_subscriber->logfile;
+
+# New row to trigger apply delay
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (1, 'foo')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (2, 'bar')");
+
+$node_publisher->wait_for_catchup($appname);
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(2|1|2), 'check if the new rows were applied to subscriber');
+
+# Make sure the apply worker knows to wait for more than 500ms
+check_apply_delay_log($node_subscriber, $offset, "0.5");
+
+# Verify that the subscriber lags the publisher by at least 1 second
+check_apply_delay_time($node_publisher, $node_subscriber, '2', '1');
+
+# Setup for streaming case
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_replication_mode = immediate');
+$node_publisher->reload;
+
+# Run a query to make sure that the reload has taken effect.
+$node_publisher->safe_psql('postgres', q{SELECT 1});
+
+# Check log starting now for logical replication apply delay
+$offset = -s $node_subscriber->logfile;
+
+# Test streamed transaction by insert
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(3, 5) s(i);"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(5|1|5), 'check if the new rows were applied to subscriber');
+
+# Make sure the apply worker knows to wait for more than 500ms
+check_apply_delay_log($node_subscriber, $offset, "0.5");
+
+# Verify that the subscriber lags the publisher by at least 1 second
+check_apply_delay_time($node_publisher, $node_subscriber, '5', '1');
+
+# Test whether ALTER SUBSCRIPTION changes the delayed time of the apply worker
+# (1 day 5 minutes). Note that the extra 5 minute is to account for any
+# decoding/network overhead.
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86700000)");
+
+# Check log starting now for logical replication apply delay
+$offset = -s $node_subscriber->logfile;
+
+# New row to trigger apply delay
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (0, 'foobar')");
+
+# Make sure the apply worker knows to wait for more than 1 day
+check_apply_delay_log($node_subscriber, $offset, "86400000");
+
+# Disable subscription and the worker should die immediately
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub DISABLE;");
+
+# Wait until worker dies
+my $sub_query =
+ "SELECT count(1) = 0 FROM pg_stat_subscription WHERE subname = 'tap_sub' AND pid IS NOT NULL;";
+$node_subscriber->poll_query_until('postgres', $sub_query)
+ or die "Timed out while waiting for subscriber to die";
+
+# Confirm disabling the subscription by ALTER SUBSCRIPTION DISABLE did not cause
+# the delayed transaction to be applied.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab WHERE a = 0;");
+is($result, qq(0), "check the delayed transaction was not applied");
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
--
2.30.0
On Monday, January 30, 2023 12:02 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
At Sat, 28 Jan 2023 04:28:29 +0000, "Takamichi Osumi (Fujitsu)"
<osumi.takamichi@fujitsu.com> wrote in
On Friday, January 27, 2023 8:00 PM Amit Kapila
<amit.kapila16@gmail.com> wrote:
So, you have changed min_apply_delay from int64 to int32, but you
haven't mentioned the reason for the same? We use 'int' for the
similar parameter recovery_min_apply_delay, so, ideally, it makes
sense but still better to tell your reason explicitly.
Yes. It's because I thought I need to make this feature consistent with the
recovery_min_apply_delay.
This feature handles the range same as the recovery_min_apply_delay
from 0 to INT_MAX now so should be adjusted to match it.
INT_MAX can stick out of int32 on some platforms. (I'm not sure where that
actually happens, though.) We can use PG_INT32_MAX instead.
IMHO, I think we don't use int as a catalog column and I agree that
int32 is sufficient since I don't think more than 49 days delay is practical. On
the other hand, maybe I wouldn't want to use int32 for intermediate
calculations.
Hi, Horiguchi-san. Thanks for your comments !
IIUC, in the last sentence, you proposed that the type of
SubOpts min_apply_delay should be changed to "int". But
I couldn't find any actual harm in the current code, because
we insert the value into the catalog after holding it in SubOpts anyway.
Also, there seems to be no explicit rule that we should use "int" local variables
for "int32" system catalog values internally. I had a look at other
variables for int32 system catalog members, and either approach looked fine.
So, I'd like to keep the current code as it is until actual harm is found.
The latest patch can be seen in [1].
[1]: /messages/by-id/TYCPR01MB8373E26884C385EFFFB8965FEDD39@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
On Monday, January 30, 2023 7:05 PM I wrote:
On Saturday, January 28, 2023 1:28 PM I wrote:
Attached the updated patch v24.
I've conducted the rebase affected by the commit(1e8b61735c) by renaming
the GUC to logical_replication_mode accordingly, because it's utilized in the
TAP test of this time-delayed LR feature.
There is no other change for this version.
Kindly have a look at the attached v25.
Hi,
The v25 caused a failure on Windows in cfbot [1].
But the failure happened in the tests of pg_upgrade,
and the failure message looks the same as the one reported in the ongoing discussion in [2].
So, it's an issue independent of v25.
[1]: https://cirrus-ci.com/task/5484559622471680
[2]: /messages/by-id/20220919213217.ptqfdlcc5idk5xup@awork3.anarazel.de
Best Regards,
Takamichi Osumi
At Mon, 30 Jan 2023 14:24:31 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
On Mon, Jan 30, 2023 at 12:38 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Mon, 30 Jan 2023 11:56:33 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
#define TimestampTzPlusMilliseconds(tz,ms) ((tz) + ((ms) * (int64) 1000))
Which can lead to overflow, which is practically harmless.
But here tz is always TimestampTz (which is int64), so do we need to worry?
Sorry, I was assuming that int was int64 here.
the function defGetMinApplyDelay() sufficient to ensure that the
'delay' value stored in the catalog will always be lesser than
INT_MAX?
I'm concerned about cases where INT_MAX is wider than int32. If we
don't assume such cases, I'm fine with INT_MAX there.
I am not aware of such cases. Anyway, if any such case is discovered
then we need to change the checks in defGetMinApplyDelay(), right? If
so, then I think it is better to keep it as it is unless we know that
this could be an issue on some platform.
I'm not sure. I think that int is generally understood to be an integer
type that may have any size. min_apply_delay is tightly bound
to a catalog column of int32, thus I thought that (PG_)INT32_MAX is
the right limit. So, as I expressed before, if we assume sizeof(int)
<= sizeof(int32), I'm fine with using INT_MAX there.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
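On the overflow question discussed above, here is a minimal sketch of why the quoted macro stays in 64-bit arithmetic even when the delay is held in an int32. Only the macro shape comes from the thread; the typedef and the helper delay_until() are illustrative stand-ins (assumptions), not PostgreSQL code.

```c
#include <stdint.h>

/* Stand-in for PostgreSQL's TimestampTz: microseconds stored in 64 bits. */
typedef int64_t TimestampTz;

/* Same shape as the macro quoted in the thread, with int64_t for int64. */
#define TimestampTzPlusMilliseconds(tz, ms) ((tz) + ((ms) * (int64_t) 1000))

/*
 * Hypothetical helper: earliest time at which a transaction committed at
 * commit_ts may be applied. The multiplication is done in 64 bits because
 * one operand is cast to int64_t, so an int32 delay cannot overflow here.
 */
static TimestampTz
delay_until(TimestampTz commit_ts, int32_t min_apply_delay_ms)
{
    return TimestampTzPlusMilliseconds(commit_ts, min_apply_delay_ms);
}
```

Even the largest int32 delay, scaled to microseconds, is far below the int64 limit, so the only remaining overflow risk is an extreme timestamp value.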
Dear Horiguchi-san,
I'm not sure. I think that int is generally understood to be an integer
type that may have any size. min_apply_delay is tightly bound
to a catalog column of int32, thus I thought that (PG_)INT32_MAX is
the right limit. So, as I expressed before, if we assume sizeof(int)
<= sizeof(int32), I'm fine with using INT_MAX there.
I have checked some articles, and I think the platforms supported by postgres regard
int as a 32-bit integer.
According to the definition of C99, the actual values of INT_MAX/INT_MIN depend on the
implementation; INT_MAX must be greater than or equal to 2^15 - 1 [1].
So theoretically there is a possibility that int is wider than int32, as you worried.
Next, I checked some data models, and found ILP64, which regards int as a 64-bit integer.
In this case INT_MAX may be 2^63-1, which exceeds PG_INT32_MAX.
I cannot find a proper document about the type, but I can cite a table from the doc [2].
```
Datatype    LP64  ILP64  LLP64  ILP32  LP32
char        8     8      8      8      8
short       16    16     16     16     16
_int32            32
int         32    64     32     32     16
long        64    64     32     32     32
long long                64
pointer     64    64     64     32     32
```
I'm not sure whether such systems survive or not. According to [2], a few such systems
were released, but I have never heard of them. Modern systems use LP64 or LLP64.
There have been a few examples of ILP64 systems that have shipped
(Cray and ETA come to mind).
In another paper [3], HAL Computer's SPARC64 from 1995
seems to use the ILP64 model, but it may be an ancient system.
1995 Sun UltraSPARC: 64/32-bit hardware, 32-bit-only operating system. HAL Computer’s SPARC64: uses ILP64 model for C.
Also, I checked buildfarm animals that have the Sparc64 architecture,
but their alignment of int seems to be 4 bytes [4].
checking alignment of int... 4
Therefore, I think we can say that modern platforms that are supported by PostgreSQL define int as 32-bit.
It satisfies the condition sizeof(int) <= sizeof(int32), so we can keep using INT_MAX.
[1]: https://www.dii.uchile.cl/~daespino/files/Iso_C_1999_definition.pdf
[2]: https://unix.org/version2/whatsnew/lp64_wp.html
[3]: https://queue.acm.org/detail.cfm?id=1165766
[4]: https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=castoroides&dt=2023-01-30%2012%3A00%3A07&stg=configure#:~:text=checking%20alignment%20of%20int...%204
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
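The condition reached above could also be checked at build time instead of by surveying platforms. The following is only a sketch, assuming a C11 compiler; max_delay_days() is a hypothetical helper added to show how long the largest int32 millisecond delay actually is.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Refuses to compile on a data model (such as ILP64) where INT_MAX exceeds
 * what an int32 catalog column can store.
 */
_Static_assert(INT_MAX <= INT32_MAX, "int is wider than int32");

/* Hypothetical helper: whole days covered by an INT32_MAX millisecond delay. */
static int
max_delay_days(void)
{
    const int64_t ms_per_day = (int64_t) 1000 * 60 * 60 * 24;

    return (int) (INT32_MAX / ms_per_day);
}
```

With a 32-bit int the assertion holds and the largest delay comes out at a little under 25 days.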
Hi, Kuroda-san, Thanks for the detailed study.
At Tue, 31 Jan 2023 07:06:40 +0000, "Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com> wrote in
Therefore, I think we can say that modern platforms that are supported by PostgreSQL define int as 32-bit.
It satisfies the condition sizeof(int) <= sizeof(int32), so we can keep using INT_MAX.
Yeah, I know that that's practically correct. What I wanted to make
clear is whether we (always) assume int == int32. I don't want to do
that just because that works. Even though we cannot be perfect, in
this particular case the destination space is explicitly made as
int32.
It's a similar discussion to the recent commit 3b4ac33254. We chose
to use the "correct" symbols, refusing to rely on an implicit assumption
about the actual values. (In that sense, it is a compromise to take it as
a premise that int32 is no wider than int, but the code would get
uselessly complex without that assumption :p)
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Tue, Jan 31, 2023 at 1:40 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
Hi, Kuroda-san, Thanks for the detailed study.
At Tue, 31 Jan 2023 07:06:40 +0000, "Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com> wrote in
Therefore, I think we can say that modern platforms that are supported by PostgreSQL define int as 32-bit.
It satisfies the condition sizeof(int) <= sizeof(int32), so we can keep using INT_MAX.
Yeah, I know that that's practically correct. What I wanted to make
clear is whether we (always) assume int == int32. I don't want to do
that just because that works. Even though we cannot be perfect, in
this particular case the destination space is explicitly made as
int32.
So, shall we check if the result of parse_int is in the range 0 and
PG_INT32_MAX to ameliorate this concern? If this works then we need to
probably change the return value of defGetMinApplyDelay() to int32.
--
With Regards,
Amit Kapila.
At Tue, 31 Jan 2023 15:12:14 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
On Tue, Jan 31, 2023 at 1:40 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
Hi, Kuroda-san, Thanks for the detailed study.
At Tue, 31 Jan 2023 07:06:40 +0000, "Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com> wrote in
Therefore, I think we can say that modern platforms that are supported by PostgreSQL define int as 32-bit.
It satisfies the condition sizeof(int) <= sizeof(int32), so we can keep using INT_MAX.
Yeah, I know that that's practically correct. What I wanted to make
clear is whether we (always) assume int == int32. I don't want to do
that just because that works. Even though we cannot be perfect, in
this particular case the destination space is explicitly made as
int32.
So, shall we check if the result of parse_int is in the range 0 and
PG_INT32_MAX to ameliorate this concern?
Yeah, it is exactly what I wanted to suggest.
If this works then we need to
probably change the return value of defGetMinApplyDelay() to int32.
I didn't think of doing that; int can store all values in the valid
range (I'm assuming we implicitly assume int >= int32 in bit width)
and it is the natural integer in C. Either will do for me but I
slightly prefer to use int there.
As a result, I'd like to propose the following change.
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 489eae85ee..9de2745623 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -2293,16 +2293,16 @@ defGetMinApplyDelay(DefElem *def)
hintmsg ? errhint("%s", _(hintmsg)) : 0));
/*
- * Check lower bound. parse_int() has already been confirmed that result
- * is less than or equal to INT_MAX.
+ * Check the both boundary. Although parse_int() checked the result against
+ * INT_MAX, this value is to be stored in a catalog column of int32.
*/
- if (result < 0)
+ if (result < 0 || result > PG_INT32_MAX)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
result,
"min_apply_delay",
- 0, INT_MAX)));
+ 0, PG_INT32_MAX)));
return result;
}
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
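The range check proposed in the diff above can be sketched outside the server as follows. This is not the actual defGetMinApplyDelay(); checked_min_apply_delay() is a hypothetical stand-in that returns -1 where the real code would raise an error through ereport().

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the proposed check: take the int produced by the
 * parser and narrow it to int32 only if it lies in 0 .. INT32_MAX.
 * Returns -1 for an out-of-range value (the server would raise an ERROR).
 */
static int32_t
checked_min_apply_delay(int parsed)
{
    /* On platforms with 32-bit int the upper-bound test can never fail. */
    if (parsed < 0 || parsed > INT32_MAX)
        return -1;

    return (int32_t) parsed;
}
```

Checking both bounds keeps the function correct even where int is wider than 32 bits, which is the concern raised in this subthread.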
On Wed, Feb 1, 2023 at 8:13 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Tue, 31 Jan 2023 15:12:14 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
On Tue, Jan 31, 2023 at 1:40 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
Hi, Kuroda-san, Thanks for the detailed study.
At Tue, 31 Jan 2023 07:06:40 +0000, "Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com> wrote in
Therefore, I think we can say that modern platforms that are supported by PostgreSQL define int as 32-bit.
It satisfies the condition sizeof(int) <= sizeof(int32), so we can keep using INT_MAX.
Yeah, I know that that's practically correct. What I wanted to make
clear is whether we (always) assume int == int32. I don't want to do
that just because that works. Even though we cannot be perfect, in
this particular case the destination space is explicitly made as
int32.
So, shall we check if the result of parse_int is in the range 0 and
PG_INT32_MAX to ameliorate this concern?
Yeah, it is exactly what I wanted to suggest.
If this works then we need to
probably change the return value of defGetMinApplyDelay() to int32.
I didn't think of doing that; int can store all values in the valid
range (I'm assuming we implicitly assume int >= int32 in bit width)
and it is the natural integer in C. Either will do for me but I
slightly prefer to use int there.
I think it would be clearer to use int32 because the parameter where we
store the return value is also int32.
--
With Regards,
Amit Kapila.
Here are my review comments for the patch v25-0001.
======
Commit Message
1.
The other possibility is to apply the delay at the end of the parallel
apply transaction but that would cause issues related to resource
bloat and locks being held for a long time.
~
SUGGESTION
We chose not to apply the delay at the end of the parallel apply
transaction because that would cause issues related to resource bloat
and locks being held for a long time.
======
doc/src/sgml/config.sgml
2.
+ <para>
+ For time-delayed logical replication, the apply worker sends a feedback
+ message to the publisher every
+ <varname>wal_receiver_status_interval</varname> milliseconds. Make sure
+ to set <varname>wal_receiver_status_interval</varname> less than the
+ <varname>wal_sender_timeout</varname> on the publisher, otherwise, the
+ <literal>walsender</literal> will repeatedly terminate due to timeout
+ error. Note that if <varname>wal_receiver_status_interval</varname> is
+ set to zero, the apply worker sends no feedback messages during the
+ <literal>min_apply_delay</literal> period.
+ </para>
2a.
"due to timeout error." --> "due to timeout errors."
~
2b.
Shouldn't this also cross-ref to CREATE SUBSCRIPTION docs? Because the
above mentions 'min_apply_delay' but that is not defined on this page.
======
doc/src/sgml/ref/create_subscription.sgml
3.
+ <para>
+ By default, the subscriber applies changes as soon as possible. This
+ parameter allows the user to delay the application of changes by a
+ given time period. If the value is specified without units, it is
+ taken as milliseconds. The default is zero (no delay). See
+ <xref linkend="config-setting-names-values"/> for details on the
+ available valid time unites.
+ </para>
Typo: "unites"
~~~
4.
+ <para>
+ Any delay becomes effective after all initial table synchronization
+ has finished and occurs before each transaction starts to get applied
+ on the subscriber. The delay is calculated as the difference between
+ the WAL timestamp as written on the publisher and the current time on
+ the subscriber. Any overhead of time spent in logical decoding and in
+ transferring the transaction may reduce the actual wait time. It is
+ also possible that the overhead already exceeds the requested
+ <literal>min_apply_delay</literal> value, in which case no delay is
+ applied. If the system clocks on publisher and subscriber are not
+ synchronized, this may lead to apply changes earlier than expected,
+ but this is not a major issue because this parameter is typically
+ much larger than the time deviations between servers. Note that if
+ this parameter is set to a long delay, the replication will stop if
+ the replication slot falls behind the current LSN by more than
+ <link
linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
"Any delay becomes effective after all initial table
synchronization..." --> "Any delay becomes effective only after all
initial table synchronization..."
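The delay calculation described in the paragraph above can be illustrated with a small sketch. The names are hypothetical and the timestamps are plain microsecond counters; this only shows the arithmetic, not the apply worker's actual loop.

```c
#include <stdint.h>

/* Stand-in for PostgreSQL's TimestampTz: microseconds stored in 64 bits. */
typedef int64_t TimestampTz;

/*
 * Hypothetical helper: milliseconds the apply worker still has to wait.
 * commit_ts is the WAL timestamp written on the publisher, now is the
 * current time on the subscriber. Decoding and transfer overhead is already
 * contained in (now - commit_ts), so it shortens the wait; once it exceeds
 * the requested delay, no wait remains.
 */
static int64_t
remaining_wait_ms(TimestampTz commit_ts, TimestampTz now,
                  int32_t min_apply_delay_ms)
{
    TimestampTz apply_at = commit_ts + (int64_t) min_apply_delay_ms * 1000;
    int64_t     diff_us = apply_at - now;

    return diff_us > 0 ? diff_us / 1000 : 0;
}
```

For a 1000 ms delay, a transaction that arrives 400 ms after its commit timestamp waits another 600 ms, while one arriving 2 s later (or seen by a subscriber whose clock runs 2 s ahead) is applied immediately.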
~~~
5.
+ <warning>
+ <para>
+ Delaying the replication means there is a much longer time between
+ making a change on the publisher, and that change being committed
+ on the subscriber. This can impact the performance of synchronous
+ replication. See <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
I'm not sure why this text was changed to say "means there is a much
longer time" instead of "can mean there is a much longer time".
IMO the previous wording was better because this current text makes an
assumption about what the user has configured -- e.g. if they
configured only 1ms delay then the warning text is not really
relevant.
~~~
6.
Why was the example (it existed when I last looked at patch v19)
removed? Personally, I found that example to be a useful reminder that
the min_apply_delay can specify units other than just 'ms'.
======
src/backend/commands/subscriptioncmds.c
7. parse_subscription_options
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed. This is because we start applying the transaction stream as
+ * soon as the first change arrives without knowing the transaction's
+ * prepare/commit time. This means we cannot calculate the underlying
+ * network/decoding lag between publisher and subscriber, and so always
+ * waiting for the full 'min_apply_delay' period might include unnecessary
+ * delay.
+ *
+ * The other possibility is to apply the delay at the end of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
I think the 2nd paragraph should be changed slightly as follows (like
review comment #1)
SUGGESTION
Note - we chose not to apply the delay at the end of the parallel
apply transaction because that would cause issues related to resource
bloat and locks being held for a long time.
~~~
8.
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0 && opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
Saying "> 0" (in the condition) is not strictly necessary here, since
it is never < 0.
~~~
9. AlterSubscription
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options for details of the reason.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
+ if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) &&
opts.min_apply_delay > 0) ||
+ (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) &&
sub->minapplydelay > 0))
Saying "> 0" (in the condition) is not strictly necessary here, since
it is never < 0.
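The nested condition quoted in comment 9 boils down to picking the delay that will be in effect after the command. A reduced sketch, with hypothetical names and plain parameters instead of the SubOpts and catalog structures:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical reduction of the ALTER SUBSCRIPTION check made when streaming
 * is switched to parallel: the delay that matters is the newly specified one
 * if the command sets min_apply_delay, otherwise the value already stored in
 * the catalog.
 */
static bool
parallel_conflicts_with_delay(bool delay_specified, int32_t new_delay_ms,
                              int32_t catalog_delay_ms)
{
    int32_t effective = delay_specified ? new_delay_ms : catalog_delay_ms;

    /* The delay is never negative, so "> 0" and "!= 0" are equivalent. */
    return effective > 0;
}
```

Written this way, setting min_apply_delay to zero in the same command clears the conflict even if the catalog still holds a non-zero delay.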
~~~
10.
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed.
+ */
+ if (opts.min_apply_delay > 0)
Saying "> 0" (in the condition) is not strictly necessary here, since
it is never < 0.
~~~
11. defGetMinApplyDelay
+ /*
+ * Check lower bound. parse_int() has already been confirmed that result
+ * is less than or equal to INT_MAX.
+ */
parse_int() already checks that the result is <= INT_MAX. But on return from that
function, don’t you need to check again that it is <= PG_INT32_MAX (in
case those are different)?
(I think Kuroda-san already suggested the same.)
======
src/backend/replication/logical/worker.c
12.
+/*
+ * In order to avoid walsender timeout for time-delayed logical replication the
+ * apply worker keeps sending feedback messages during the delay period.
+ * Meanwhile, the feature delays the apply before the start of the
+ * transaction and thus we don't write WAL records for the suspended changes
+ * during the wait. When the apply worker sends a feedback message during the
+ * delay, we should not make positions of the flushed and apply LSN overwritten
+ * by the last received latest LSN. See send_feedback() for details.
+ */
"we should not make positions of the flushed and apply LSN
overwritten" --> "we should not overwrite positions of the flushed and
apply LSN"
~~~
14. send_feedback
@@ -3738,8 +3867,15 @@ send_feedback(XLogRecPtr recvpos, bool force,
bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * If the logical replication subscription has unprocessed changes then do
+ * not inform the publisher that the received latest LSN is already
+ * applied and flushed, otherwise, the publisher will make a wrong
+ * assumption about the logical replication progress. Instead, it just
+ * sends a feedback message to avoid a replication timeout during the
+ * delay.
*/
"Instead, it just sends" --> "Instead, just send"
======
src/bin/pg_dump/pg_dump.h
15. SubscriptionInfo
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int subminapplydelay;
char *subpublications;
} SubscriptionInfo;
Should this also be "int32" to match the other member type changes?
======
src/test/subscription/t/032_apply_delay.pl
16.
+# Make sure the apply worker knows to wait for more than 500ms
+check_apply_delay_log($node_subscriber, $offset, "0.5");
"knows to wait for more than" --> "waits for more than"
(this occurs in a couple of places)
------
Kind Regards,
Peter Smith.
Fujitsu Australia
At Wed, 1 Feb 2023 08:38:11 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
On Wed, Feb 1, 2023 at 8:13 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Tue, 31 Jan 2023 15:12:14 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
So, shall we check if the result of parse_int is in the range 0 and
PG_INT32_MAX to ameliorate this concern?
Yeah, it is exactly what I wanted to suggest.
If this works then we need to
probably change the return value of defGetMinApplyDelay() to int32.
I didn't think of doing that; int can store all values in the valid
range (I'm assuming we implicitly assume int >= int32 in bit width)
and it is the natural integer in C. Either will do for me but I
slightly prefer to use int there.
I think it would be clearer to use int32 because the parameter where we
store the return value is also int32.
I'm fine with that.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Mon, Jan 30, 2023 6:05 PM Takamichi Osumi (Fujitsu) <osumi.takamichi@fujitsu.com> wrote:
On Saturday, January 28, 2023 1:28 PM I wrote:
Attached the updated patch v24.
Hi,
I've conducted the rebase affected by the commit(1e8b61735c)
by renaming the GUC to logical_replication_mode accordingly,
because it's utilized in the TAP test of this time-delayed LR feature.
There is no other change for this version.
Kindly have a look at the attached v25.
Thanks for your patch. Here are some comments.
1.
+ /*
+ * The min_apply_delay parameter is ignored until all tablesync workers
+ * have reached READY state. This is because if we allowed the delay
+ * during the catchup phase, then once we reached the limit of tablesync
+ * workers it would impose a delay for each subsequent worker. That would
+ * cause initial table synchronization completion to take a long time.
+ */
+ if (!AllTablesyncsReady())
+ return;
I saw that the new parameter becomes effective after all tables are in ready
state, because the apply worker can't set the state to catchup during the delay.
But can we call process_syncing_tables() in the while-loop of
maybe_apply_delay()? Then the tablesync can finish without delay. If we can't do
so, it might be better to add some comments for it.
2.
+# Make sure the apply worker knows to wait for more than 500ms
+check_apply_delay_log($node_subscriber, $offset, "0.5");
I think the last parameter should be 500.
Besides, I am not sure it's a stable test to check the log. Is it possible that
there's no such log on a slow machine? I modified the code to sleep 1s at the
beginning of apply_dispatch(), then the new added test failed because the server
log cannot match.
Regards,
Shi yu
On Wed, Feb 1, 2023 at 3:10 PM shiy.fnst@fujitsu.com
<shiy.fnst@fujitsu.com> wrote:
On Mon, Jan 30, 2023 6:05 PM Takamichi Osumi (Fujitsu) <osumi.takamichi@fujitsu.com> wrote:
Kindly have a look at the attached v25.
Thanks for your patch. Here are some comments.
1.
+ /*
+ * The min_apply_delay parameter is ignored until all tablesync workers
+ * have reached READY state. This is because if we allowed the delay
+ * during the catchup phase, then once we reached the limit of tablesync
+ * workers it would impose a delay for each subsequent worker. That would
+ * cause initial table synchronization completion to take a long time.
+ */
+ if (!AllTablesyncsReady())
+ return;
I saw that the new parameter becomes effective after all tables are in ready
state, because the apply worker can't set the state to catchup during the delay.
But can we call process_syncing_tables() in the while-loop of
maybe_apply_delay()? Then the tablesync can finish without delay. If we can't do
so, it might be better to add some comments for it.
I think the point here is that if the apply worker is ahead of
tablesync worker then to complete the catch-up, tablesync worker needs
to apply additional transactions, and delaying during that time will
cause initial table synchronization completion to take a long time. I
am not sure how much more detail can be added to the existing
comments.
--
With Regards,
Amit Kapila.
Hi,
On Wednesday, February 1, 2023 5:40 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
At Wed, 1 Feb 2023 08:38:11 +0530, Amit Kapila <amit.kapila16@gmail.com>
wrote in
On Wed, Feb 1, 2023 at 8:13 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Tue, 31 Jan 2023 15:12:14 +0530, Amit Kapila
<amit.kapila16@gmail.com> wrote in
So, shall we check if the result of parse_int is in the range 0
and PG_INT32_MAX to ameliorate this concern?
Yeah, it is exactly what I wanted to suggest.
If this works then we need to
probably change the return value of defGetMinApplyDelay() to int32.
I didn't think of doing that; int can store all values in the valid
range (I'm assuming we implicitly assume int >= int32 in bit width)
and it is the natural integer in C. Either will do for me but I
slightly prefer to use int there.
I think it would be clearer to use int32 because the parameter where we
store the return value is also int32.
I'm fine with that.
Thank you for confirming.
Attached the updated patch v26 accordingly.
I slightly adjusted the comments in defGetMinApplyDelay
on this point as well.
Best Regards,
Takamichi Osumi
Attachments:
v26-0001-Time-delayed-logical-replication-subscriber.patch (application/octet-stream)
From 7515da2fba7b7ce411f112e51e4b2bc76292b4c1 Mon Sep 17 00:00:00 2001
From: Takamichi Osumi <osumi.takamichi@fujitsu.com>
Date: Thu, 2 Feb 2023 07:35:04 +0000
Subject: [PATCH v26] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_apply_delay'.
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction apply for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs before we start to apply the transaction on the
subscriber. The main reason is to avoid keeping a transaction open for
a long time. Regular and prepared transactions are covered. Streamed
transactions are also covered.
The combination of parallel streaming mode and min_apply_delay is not
allowed. This is because we start applying the transaction stream as
soon as the first change arrives without knowing the transaction's
prepare/commit time. This means we cannot calculate the underlying
network/decoding lag between publisher and subscriber, and so always
waiting for the full 'min_apply_delay' period might include
unnecessary delay.
The other possibility was to apply the delay at the end of the parallel
apply transaction but that would cause issues related to resource
bloat and locks being held for a long time.
Note that this feature doesn't interact with the skip transaction feature.
The skip transaction feature applies to one transaction with a specific LSN.
So, even if a skipped transaction and a non-skipped transaction arrive
consecutively within a very short time, regardless of which comes first,
the delay is still applied to the other transactions before and after the
skipped transaction.
Author: Euler Taveira, Takamichi Osumi, Kuroda Hayato
Reviewed-by: Amit Kapila, Peter Smith, Vignesh C, Shveta Malik,
Kyotaro Horiguchi, Shi Yu, Dilip Kumar, Melih Mutlu
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/config.sgml | 12 ++
doc/src/sgml/glossary.sgml | 14 ++
doc/src/sgml/logical-replication.sgml | 6 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 49 ++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 120 ++++++++++-
.../replication/logical/applyparallelworker.c | 3 +-
src/backend/replication/logical/worker.c | 165 ++++++++++++++--
src/bin/pg_dump/pg_dump.c | 15 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/replication/worker_internal.h | 2 +-
src/test/regress/expected/subscription.out | 181 ++++++++++-------
src/test/regress/sql/subscription.sql | 24 +++
src/test/subscription/meson.build | 1 +
src/test/subscription/t/032_apply_delay.pl | 187 ++++++++++++++++++
21 files changed, 714 insertions(+), 104 deletions(-)
create mode 100644 src/test/subscription/t/032_apply_delay.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..925a6ebb12 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7873,6 +7873,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum amount of time to delay applying changes, in milliseconds.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 1cf53c74ea..e25f6497e8 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4787,6 +4787,18 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
the <filename>postgresql.conf</filename> file or on the server
command line.
</para>
+ <para>
+ For time-delayed logical replication, the apply worker sends a feedback
+ message to the publisher every
+ <varname>wal_receiver_status_interval</varname> milliseconds. Make sure
+ to set <varname>wal_receiver_status_interval</varname> less than the
+ <varname>wal_sender_timeout</varname> on the publisher, otherwise, the
+ <literal>walsender</literal> will repeatedly terminate due to timeout
+ errors. Note that if <varname>wal_receiver_status_interval</varname> is
+ set to zero, the apply worker sends no feedback messages during the
+ <literal>min_apply_delay</literal> period. Refer to
+ <xref linkend="sql-createsubscription"/> for more information.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 7c01a541fe..6ed6fa5853 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1729,6 +1729,20 @@
</glossdef>
</glossentry>
+ <glossentry id="glossary-time-delayed-replication">
+ <glossterm>Time-delayed replication</glossterm>
+ <glossdef>
+ <para>
+ Replication setup that applies time-delayed copy of the data.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="guc-recovery-min-apply-delay"/> for physical replication
+ and <xref linkend="sql-createsubscription"/> for logical replication.
+ </para>
+ </glossdef>
+ </glossentry>
+
<glossentry id="glossary-toast">
<glossterm>TOAST</glossterm>
<glossdef>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 1bd5660c87..6bd5f61e2b 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,12 @@
target table.
</para>
+ <para>
+ A subscription can delay the application of changes by specifying the
+ <literal>min_apply_delay</literal> subscription parameter. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 964fcbb8ff..8b7eb28e54 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 51c45f17c7..1b4b8390af 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,49 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. This
+ parameter allows the user to delay the application of changes by a
+ given time period. If the value is specified without units, it is
+ taken as milliseconds. The default is zero (no delay). See
+ <xref linkend="config-setting-names-values"/> for details on the
+ available valid time units.
+ </para>
+ <para>
+ Any delay becomes effective only after all initial table
+ synchronization has finished and occurs before each transaction starts
+ to get applied on the subscriber. The delay is calculated as the
+ difference between the WAL timestamp as written on the publisher and
+ the current time on the subscriber. Any overhead of time spent in
+ logical decoding and in transferring the transaction may reduce the
+ actual wait time. It is also possible that the overhead already
+ exceeds the requested <literal>min_apply_delay</literal> value, in
+ which case no delay is applied. If the system clocks on publisher and
+ subscriber are not synchronized, this may lead to apply changes
+ earlier than expected, but this is not a major issue because this
+ parameter is typically much larger than the time deviations between
+ servers. Note that if this parameter is set to a long delay, the
+ replication will stop if the replication slot falls behind the current
+ LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication means there is a much longer time between
+ making a change on the publisher, and that change being committed
+ on the subscriber. This can impact the performance of synchronous
+ replication. See <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -420,6 +462,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_apply_delay</literal> parameter is not allowed when
+ streaming in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..c767cc1c3a 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -64,6 +64,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->oid = subid;
sub->dbid = subform->subdbid;
sub->skiplsn = subform->subskiplsn;
+ sub->minapplydelay = subform->subminapplydelay;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
sub->enabled = subform->subenabled;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 8608e3fa5b..317c2010cb 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1299,9 +1299,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 464db6d247..34655f4219 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int32 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -100,7 +102,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int32 defGetMinApplyDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ opts->min_apply_delay = defGetMinApplyDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +417,31 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed. This is because we start applying the transaction stream as
+ * soon as the first change arrives without knowing the transaction's
+ * prepare/commit time. This means we cannot calculate the underlying
+ * network/decoding lag between publisher and subscriber, and so always
+ * waiting for the full 'min_apply_delay' period might include unnecessary
+ * delay.
+ *
+ * The other possibility was to apply the delay at the end of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0 && opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+
+ /*
+ * - translator: the first %s is a string of the form "parameter > 0"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
}
/*
@@ -560,7 +598,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -625,6 +664,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int32GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1054,7 +1094,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1138,19 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options for details of the reason.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
+ if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay > 0) ||
+ (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable parallel streaming mode for subscription with %s",
+ "min_apply_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1164,25 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed.
+ */
+ if (opts.min_apply_delay > 0)
+ if ((IsSet(opts.specified_opts, SUBOPT_STREAMING) && opts.streaming == LOGICALREP_STREAM_PARALLEL) ||
+ (!IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s for subscription in %s mode",
+ "min_apply_delay", "streaming = parallel"));
+
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int32GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2195,3 +2267,45 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+/*
+ * Extract the min_apply_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_apply_delay
+ * accepts the same parameter format as recovery_min_apply_delay.
+ */
+static int32
+defGetMinApplyDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse given string as parameter which has millisecond unit
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_apply_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check both the lower boundary for the valid min_apply_delay range and
+ * the upper boundary as the safeguard for some platforms where INT_MAX is
+ * wider than int32 respectively. Although parse_int() has confirmed that
+ * the result is less than or equal to INT_MAX, the value will be stored
+ * in a catalog column of int32.
+ */
+ if (result < 0 || result > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_apply_delay",
+ 0, PG_INT32_MAX)));
+
+ return result;
+}
diff --git a/src/backend/replication/logical/applyparallelworker.c b/src/backend/replication/logical/applyparallelworker.c
index 3579e704fe..7302bce7a0 100644
--- a/src/backend/replication/logical/applyparallelworker.c
+++ b/src/backend/replication/logical/applyparallelworker.c
@@ -704,7 +704,8 @@ pa_process_spooled_messages_if_required(void)
{
apply_spooled_messages(&MyParallelShared->fileset,
MyParallelShared->xid,
- InvalidXLogRecPtr);
+ InvalidXLogRecPtr,
+ 0);
pa_set_fileset_state(MyParallelShared, FS_EMPTY);
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index cfb2ab6248..9b3de65de7 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -319,6 +319,17 @@ static List *on_commit_wakeup_workers_subids = NIL;
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
+/*
+ * In order to avoid walsender timeout for time-delayed logical replication the
+ * apply worker keeps sending feedback messages during the delay period.
+ * Meanwhile, the feature delays the apply before the start of the
+ * transaction and thus we don't write WAL records for the suspended changes
+ * during the wait. When the apply worker sends a feedback message during the
+ * delay, we should not overwrite positions of the flushed and apply LSN by the
+ * last received latest LSN. See send_feedback() for details.
+ */
+static XLogRecPtr last_received = InvalidXLogRecPtr;
+
/* fields valid only when processing streamed transaction */
static bool in_streamed_transaction = false;
@@ -389,7 +400,8 @@ static void stream_write_change(char action, StringInfo s);
static void stream_open_and_write_change(TransactionId xid, char action, StringInfo s);
static void stream_close_file(void);
-static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply);
+static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply,
+ bool has_unprocessed_change);
static void DisableSubscriptionAndExit(void);
@@ -999,6 +1011,109 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When min_apply_delay parameter is set on the subscriber, we wait long enough
+ * to make sure a transaction is applied at least that period behind the
+ * publisher.
+ *
+ * While the physical replication applies the delay at commit time, this
+ * feature applies the delay for the next transaction but before starting the
+ * transaction. This is mainly because keeping a transaction that conducted
+ * write operations open for a long time results in some issues such as bloat
+ * and locks.
+ *
+ * The min_apply_delay parameter will take effect only after all tables are in
+ * READY state.
+ *
+ * xid is the transaction id where we apply the delay.
+ *
+ * finish_ts is the commit/prepare time of both regular (non-streamed) and
+ * streamed transactions. Unlike the regular (non-streamed) cases, the delay
+ * is applied in a STREAM COMMIT/STREAM PREPARE message for streamed
+ * transactions. The STREAM START message does not contain a commit/prepare
+ * time (it will be available when the in-progress transaction finishes).
+ * Hence, it's not appropriate to apply a delay at the STREAM START time.
+ */
+static void
+maybe_apply_delay(TransactionId xid, TimestampTz finish_ts)
+{
+ Assert(finish_ts > 0);
+
+ /* Nothing to do if no delay set */
+ if (!MySubscription->minapplydelay)
+ return;
+
+ /*
+ * The min_apply_delay parameter is ignored until all tablesync workers
+ * have reached READY state. This is because if we allowed the delay
+ * during the catchup phase, then once we reached the limit of tablesync
+ * workers it would impose a delay for each subsequent worker. That would
+ * cause initial table synchronization completion to take a long time.
+ */
+ if (!AllTablesyncsReady())
+ return;
+
+ /* Apply the delay by the latch mechanism */
+ while (true)
+ {
+ TimestampTz delayUntil;
+ long diffms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* This might change wal_receiver_status_interval */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /*
+ * Before calculating the time duration, reload the catalog if needed.
+ */
+ if (!in_remote_transaction && !in_streamed_transaction)
+ {
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+ }
+
+ delayUntil = TimestampTzPlusMilliseconds(finish_ts, MySubscription->minapplydelay);
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delayUntil);
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "time-delayed replication for txid %u, min_apply_delay = %d ms, remaining wait time: %ld ms",
+ xid, MySubscription->minapplydelay, diffms);
+
+ /*
+ * Call send_feedback() to prevent the publisher from exiting by
+ * timeout during the delay, when wal_receiver_status_interval is
+ * available.
+ */
+ if (wal_receiver_status_interval > 0 &&
+ diffms > wal_receiver_status_interval * 1000L)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ wal_receiver_status_interval * 1000L,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(last_received, true, false, true);
+ }
+ else
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ }
+}
+
/*
* Handle BEGIN message.
*/
@@ -1013,6 +1128,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ maybe_apply_delay(begin_data.xid, begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -1070,6 +1188,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ maybe_apply_delay(begin_data.xid, begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1317,7 +1438,8 @@ apply_handle_stream_prepare(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset,
- prepare_data.xid, prepare_data.prepare_lsn);
+ prepare_data.xid, prepare_data.prepare_lsn,
+ prepare_data.prepare_time);
/* Mark the transaction as prepared. */
apply_handle_prepare_internal(&prepare_data);
@@ -2011,10 +2133,13 @@ ensure_last_message(FileSet *stream_fileset, TransactionId xid, int fileno,
/*
* Common spoolfile processing.
+ *
+ * The commit/prepare time (finish_ts) for streamed transactions is required
+ * for time-delayed logical replication.
*/
void
apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn)
+ XLogRecPtr lsn, TimestampTz finish_ts)
{
StringInfoData s2;
int nchanges;
@@ -2025,6 +2150,10 @@ apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
int fileno;
off_t offset;
+ /* Should we delay the current transaction? */
+ if (finish_ts)
+ maybe_apply_delay(xid, finish_ts);
+
if (!am_parallel_apply_worker())
maybe_start_skipping_changes(lsn);
@@ -2174,7 +2303,7 @@ apply_handle_stream_commit(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset, xid,
- commit_data.commit_lsn);
+ commit_data.commit_lsn, commit_data.committime);
apply_handle_commit_internal(&commit_data);
@@ -3447,7 +3576,7 @@ UpdateWorkerStats(XLogRecPtr last_lsn, TimestampTz send_time, bool reply)
* Apply main loop.
*/
static void
-LogicalRepApplyLoop(XLogRecPtr last_received)
+LogicalRepApplyLoop(void)
{
TimestampTz last_recv_timestamp = GetCurrentTimestamp();
bool ping_sent = false;
@@ -3568,7 +3697,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
if (last_received < end_lsn)
last_received = end_lsn;
- send_feedback(last_received, reply_requested, false);
+ send_feedback(last_received, reply_requested, false, false);
UpdateWorkerStats(last_received, timestamp, true);
}
/* other message types are purposefully ignored */
@@ -3581,7 +3710,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
/* confirm all writes so far */
- send_feedback(last_received, false, false);
+ send_feedback(last_received, false, false, false);
if (!in_remote_transaction && !in_streamed_transaction)
{
@@ -3678,7 +3807,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
}
- send_feedback(last_received, requestReply, requestReply);
+ send_feedback(last_received, requestReply, requestReply, false);
/*
* Force reporting to ensure long idle periods don't lead to
@@ -3708,7 +3837,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
* to send a response to avoid timeouts.
*/
static void
-send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
+send_feedback(XLogRecPtr recvpos, bool force, bool requestReply, bool has_unprocessed_change)
{
static StringInfo reply_message = NULL;
static TimestampTz send_time = 0;
@@ -3738,8 +3867,14 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * If the logical replication subscription has unprocessed changes then do
+ * not inform the publisher that the received latest LSN is already
+ * applied and flushed, otherwise, the publisher will make a wrong
+ * assumption about the logical replication progress. Instead, just send a
+ * feedback message to avoid a replication timeout during the delay.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && !has_unprocessed_change)
flushpos = writepos = recvpos;
if (writepos < last_writepos)
@@ -3776,8 +3911,9 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
pq_sendint64(reply_message, now); /* sendTime */
pq_sendbyte(reply_message, requestReply); /* replyRequested */
- elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X",
+ elog(DEBUG2, "sending feedback (force %d, has_unprocessed_change %s) to recv %X/%X, write %X/%X, flush %X/%X",
force,
+ has_unprocessed_change ? "yes" : "no",
LSN_FORMAT_ARGS(recvpos),
LSN_FORMAT_ARGS(writepos),
LSN_FORMAT_ARGS(flushpos));
@@ -4367,11 +4503,11 @@ start_table_sync(XLogRecPtr *origin_startpos, char **myslotname)
* of system resource error and are not repeatable.
*/
static void
-start_apply(XLogRecPtr origin_startpos)
+start_apply(void)
{
PG_TRY();
{
- LogicalRepApplyLoop(origin_startpos);
+ LogicalRepApplyLoop();
}
PG_CATCH();
{
@@ -4661,7 +4797,8 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- start_apply(origin_startpos);
+ last_received = origin_startpos;
+ start_apply();
proc_exit(0);
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..1e87f0124e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4546,9 +4547,13 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n"
+ " 0 AS subminapplydelay\n",
+ LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4581,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4612,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminapplydelay =
+ atoi(PQgetvalue(res, i, i_subminapplydelay));
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4687,6 +4695,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '%d ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e7cbd8d7ed..b8831c3ed3 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int subminapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..81d4607a1c 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min apply delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e1882eaea..e8b9a43a47 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..b8fe47ef6e 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -74,6 +74,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
+ int32 subminapplydelay; /* Replication apply delay (ms) */
+
bool subenabled; /* True if the subscription is enabled (the
* worker should be running) */
@@ -120,6 +122,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int32 minapplydelay; /* Replication apply delay (ms) */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
bool enabled; /* Indicates if the subscription is enabled */
diff --git a/src/include/replication/worker_internal.h b/src/include/replication/worker_internal.h
index dc87a4edd1..3dc09d1a4c 100644
--- a/src/include/replication/worker_internal.h
+++ b/src/include/replication/worker_internal.h
@@ -255,7 +255,7 @@ extern void stream_stop_internal(TransactionId xid);
/* Common streaming function to apply all the spooled messages */
extern void apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn);
+ XLogRecPtr lsn, TimestampTz finish_ts);
extern void apply_dispatch(StringInfo s);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 3f99b14394..5ccce39986 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -143,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -163,10 +163,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -175,10 +175,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -210,10 +210,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -247,19 +247,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -271,27 +271,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -306,10 +306,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -324,10 +324,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -363,10 +363,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -375,10 +375,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,10 +388,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -404,20 +404,57 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid value for parameter "min_apply_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_apply_delay" (0 .. 2147483647)
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+ERROR: min_apply_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot enable parallel streaming mode for subscription with min_apply_delay
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ERROR: cannot enable min_apply_delay for subscription in streaming = parallel mode
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7281f5fee2..7317b140f5 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -286,6 +286,30 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+\dRs+
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/meson.build b/src/test/subscription/meson.build
index 3db0fdfd96..a186876eb4 100644
--- a/src/test/subscription/meson.build
+++ b/src/test/subscription/meson.build
@@ -38,6 +38,7 @@ tests += {
't/029_on_error.pl',
't/030_origin.pl',
't/031_column_list.pl',
+ 't/032_apply_delay.pl',
't/100_bugs.pl',
],
},
diff --git a/src/test/subscription/t/032_apply_delay.pl b/src/test/subscription/t/032_apply_delay.pl
new file mode 100644
index 0000000000..858638daae
--- /dev/null
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,187 @@
+
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test replication apply delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Confirm from the server log that time-delayed replication took effect, using
+# the message the apply worker emits when it applies the delay. Moreover,
+# verify that the worker's remaining wait time is sufficiently bigger than the
+# expected value, in order to check any update of the min_apply_delay.
+sub check_apply_delay_log
+{
+ my ($node_subscriber, $offset, $expected) = @_;
+
+ my $log_location = $node_subscriber->wait_for_log(
+ qr/time-delayed replication for txid (\d+), min_apply_delay = (\d+) ms, remaining wait time: (\d+) ms/,
+ $offset);
+
+ cmp_ok($log_location, '>', $offset,
+ "logfile contains triggered logical replication apply delay");
+
+ # Get the remaining wait time from the server log
+ my $contents = slurp_file($node_subscriber->logfile, $offset);
+ $contents =~
+ qr/time-delayed replication for txid (\d+), min_apply_delay = (\d+) ms, remaining wait time: (\d+) ms/,
+ or die "could not get the apply worker wait time";
+ my $logged_delay = $3;
+
+ # Is it larger than expected?
+ cmp_ok($logged_delay, '>', $expected,
+ "The apply worker wait time has expected duration");
+}
+
+# Compare inserted time on the publisher with applied time on the subscriber to
+# confirm the latter is applied after expected time. The time is automatically
+# generated and stored in the table column 'c'.
+sub check_apply_delay_time
+{
+ my ($node_publisher, $node_subscriber, $primary_key, $expected_diffs) =
+ @_;
+
+ my $inserted_time_on_pub = $node_publisher->safe_psql(
+ 'postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ my $inserted_time_on_sub = $node_subscriber->safe_psql(
+ 'postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a = $primary_key;
+ ]);
+
+ cmp_ok(
+ $inserted_time_on_sub - $inserted_time_on_pub,
+ '>',
+ $expected_diffs,
+ "The tuple on the subscriber was modified later than the publisher");
+}
+
+# Create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf('postgresql.conf', "log_min_messages = debug2");
+$node_subscriber->start;
+
+# Setup structure on publisher
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b varchar, c timestamptz (6) DEFAULT now())"
+);
+
+# Setup structure on subscriber
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_tab (a int primary key, b text, c timestamptz (6) DEFAULT now())"
+);
+
+# Setup logical replication
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+
+# The column 'c' must not be published because we want to compare the time
+# difference.
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE test_tab (a, b)");
+
+my $appname = 'tap_sub';
+
+# Create a subscription that applies the transaction after 1 second delay
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (copy_data = off, min_apply_delay = '1s', streaming = 'on')"
+);
+
+# Check log starting now for logical replication apply delay
+my $offset = -s $node_subscriber->logfile;
+
+# New row to trigger apply delay
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (1, 'foo')");
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (2, 'bar')");
+
+$node_publisher->wait_for_catchup($appname);
+
+my $result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(2|1|2), 'check if the new rows were applied to subscriber');
+
+# Make sure the apply worker waits for more than 500ms
+check_apply_delay_log($node_subscriber, $offset, "500");
+
+# Verify that the subscriber lags the publisher by at least 1 second
+check_apply_delay_time($node_publisher, $node_subscriber, '2', '1');
+
+# Setup for streaming case
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_replication_mode = immediate');
+$node_publisher->reload;
+
+# Run a query to make sure that the reload has taken effect.
+$node_publisher->safe_psql('postgres', q{SELECT 1});
+
+# Check log starting now for logical replication apply delay
+$offset = -s $node_subscriber->logfile;
+
+# Test streamed transaction by insert
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(3, 5) s(i);"
+);
+
+$node_publisher->wait_for_catchup($appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres',
+ "SELECT count(*), min(a), max(a) FROM test_tab");
+is($result, qq(5|1|5), 'check if the new rows were applied to subscriber');
+
+# Make sure the apply worker waits for more than 500ms
+check_apply_delay_log($node_subscriber, $offset, "500");
+
+# Verify that the subscriber lags the publisher by at least 1 second
+check_apply_delay_time($node_publisher, $node_subscriber, '5', '1');
+
+# Test whether ALTER SUBSCRIPTION changes the delayed time of the apply worker
+# (1 day 5 minutes). Note that the extra 5 minute is to account for any
+# decoding/network overhead.
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86700000)");
+
+# Check log starting now for logical replication apply delay
+$offset = -s $node_subscriber->logfile;
+
+# New row to trigger apply delay
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (0, 'foobar')");
+
+# Make sure the apply worker waits for more than 1 day
+check_apply_delay_log($node_subscriber, $offset, "86400000");
+
+# Disable subscription and the worker should die immediately
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub DISABLE;");
+
+# Wait until worker dies
+my $sub_query =
+ "SELECT count(1) = 0 FROM pg_stat_subscription WHERE subname = 'tap_sub' AND pid IS NOT NULL;";
+$node_subscriber->poll_query_until('postgres', $sub_query)
+ or die "Timed out while waiting for subscriber to die";
+
+# Confirm disabling the subscription by ALTER SUBSCRIPTION DISABLE did not cause
+# the delayed transaction to be applied.
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM test_tab WHERE a = 0;");
+is($result, qq(0), "check the delayed transaction was not applied");
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
--
2.30.0
Hi,
On Wednesday, February 1, 2023 1:37 PM Peter Smith <smithpb2250@gmail.com> wrote:
Here are my review comments for the patch v25-0001.
Thank you for your review !
======
Commit Message
1.
The other possibility is to apply the delay at the end of the parallel apply
transaction but that would cause issues related to resource bloat and locks being
held for a long time.
~
SUGGESTION
We chose not to apply the delay at the end of the parallel apply transaction
because that would cause issues related to resource bloat and locks being held
for a long time.
I prefer the current description. So, I just changed one word,
from "The other possibility is..." to "The other possibility was",
to indicate that the two paragraphs (this paragraph and the previous one)
are related.
======
doc/src/sgml/config.sgml
2.
+ <para>
+ For time-delayed logical replication, the apply worker sends a feedback
+ message to the publisher every
+ <varname>wal_receiver_status_interval</varname> milliseconds. Make sure
+ to set <varname>wal_receiver_status_interval</varname> less than the
+ <varname>wal_sender_timeout</varname> on the publisher, otherwise, the
+ <literal>walsender</literal> will repeatedly terminate due to timeout
+ error. Note that if <varname>wal_receiver_status_interval</varname> is
+ set to zero, the apply worker sends no feedback messages during the
+ <literal>min_apply_delay</literal> period.
+ </para>
2a.
"due to timeout error." --> "due to timeout errors."
Fixed.
~
2b.
Shouldn't this also cross-ref to CREATE SUBSCRIPTION docs? Because the
above mentions 'min_apply_delay' but that is not defined on this page.
Makes sense. Added.
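As a rough illustration of the rule in the documentation text above (not code from the patch), the relationship between the two settings can be written as a small predicate. The function name is hypothetical, and both values are passed in milliseconds purely to keep the comparison simple.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: returns true when the
 * walsender could hit wal_sender_timeout while the apply worker is waiting
 * out a long min_apply_delay. Both arguments are in milliseconds here.
 */
static bool
feedback_too_slow_for_timeout(int64_t status_interval_ms,
                              int64_t sender_timeout_ms)
{
    assert(status_interval_ms >= 0 && sender_timeout_ms >= 0);

    /* wal_sender_timeout = 0 disables the timeout, so nothing can expire. */
    if (sender_timeout_ms == 0)
        return false;

    /* An interval of zero means no feedback is sent during the delay. */
    if (status_interval_ms == 0)
        return true;

    /* Feedback has to arrive more often than the timeout fires. */
    return status_interval_ms >= sender_timeout_ms;
}
```

With a 10 s interval and a 60 s timeout the predicate is false; it becomes true once the interval reaches the timeout, or when feedback is switched off entirely.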
======
doc/src/sgml/ref/create_subscription.sgml
3.
+ <para>
+ By default, the subscriber applies changes as soon as possible. This
+ parameter allows the user to delay the application of changes by a
+ given time period. If the value is specified without units, it is
+ taken as milliseconds. The default is zero (no delay). See
+ <xref linkend="config-setting-names-values"/> for details on the
+ available valid time unites.
+ </para>
Typo: "unites"
Fixed it to "units".
~~~
4.
+ <para>
+ Any delay becomes effective after all initial table synchronization
+ has finished and occurs before each transaction starts to get applied
+ on the subscriber. The delay is calculated as the difference between
+ the WAL timestamp as written on the publisher and the current time on
+ the subscriber. Any overhead of time spent in logical decoding and in
+ transferring the transaction may reduce the actual wait time. It is
+ also possible that the overhead already exceeds the requested
+ <literal>min_apply_delay</literal> value, in which case no delay is
+ applied. If the system clocks on publisher and subscriber are not
+ synchronized, this may lead to apply changes earlier than expected,
+ but this is not a major issue because this parameter is typically
+ much larger than the time deviations between servers. Note that if
+ this parameter is set to a long delay, the replication will stop if
+ the replication slot falls behind the current LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
"Any delay becomes effective after all initial table synchronization..." --> "Any
delay becomes effective only after all initial table synchronization..."
Agreed. Fixed.
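The calculation described in that paragraph can be sketched as follows. This is only an illustration under stated assumptions: the function name and the millisecond timestamps are hypothetical stand-ins, not the patch's actual code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sketch: how long the apply worker would still have to wait
 * before applying a transaction. commit_time_ms is the publisher's WAL
 * timestamp and now_ms is the subscriber's current time, both in
 * milliseconds since some common epoch.
 */
static int64_t
remaining_apply_delay_ms(int64_t commit_time_ms, int64_t now_ms,
                         int32_t min_apply_delay_ms)
{
    int64_t elapsed_ms;
    int64_t remaining_ms;

    assert(min_apply_delay_ms >= 0);

    /* Decoding and transfer overhead is already part of the elapsed time. */
    elapsed_ms = now_ms - commit_time_ms;
    remaining_ms = (int64_t) min_apply_delay_ms - elapsed_ms;

    /* If the overhead already exceeds the requested delay, do not wait. */
    return remaining_ms > 0 ? remaining_ms : 0;
}
```

For example, with a 1000 ms delay and 300 ms already spent in decoding and transfer, the worker waits another 700 ms. A subscriber clock that runs ahead of the publisher inflates the elapsed time in the same way, which is why changes may be applied earlier than expected.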
~~~
5.
+ <warning>
+ <para>
+ Delaying the replication means there is a much longer time between
+ making a change on the publisher, and that change being committed
+ on the subscriber. This can impact the performance of synchronous
+ replication. See <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
I'm not sure why this text was changed to say "means there is a much longer
time" instead of "can mean there is a much longer time".
IMO the previous wording was better because this current text makes an
assumption about what the user has configured -- e.g. if they configured only
1ms delay then the warning text is not really relevant.
Yes, I changed it here. The purpose of this feature is to recover from
unintentional wrong operations on the publisher, and after some comments from
hackers I did not expect that a very short time like the one you mentioned
would be set for this parameter. Either wording was fine,
but I chose the current description based on that purpose.
~~~
6.
Why was the example (it existed when I last looked at patch v19) removed?
Personally, I found that example to be a useful reminder that the
min_apply_delay can specify units other than just 'ms'.
Removed, after some comments from the hackers, because the example was just
one variation that used a different value in the WITH clause.
The reference for the available units is documented,
so the current description should be sufficient.
======
src/backend/commands/subscriptioncmds.c
7. parse_subscription_options
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed. This is because we start applying the transaction stream as
+ * soon as the first change arrives without knowing the transaction's
+ * prepare/commit time. This means we cannot calculate the underlying
+ * network/decoding lag between publisher and subscriber, and so always
+ * waiting for the full 'min_apply_delay' period might include unnecessary
+ * delay.
+ *
+ * The other possibility is to apply the delay at the end of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
I think the 2nd paragraph should be changed slightly as follows (like review
comment #1)
SUGGESTION
Note - we chose not to apply the delay at the end of the parallel apply
transaction because that would cause issues related to resource bloat and locks
being held for a long time.
As with the first comment, I changed only "is" to "was",
to indicate that the last paragraph describes an option for the parallel
streaming mode that was discussed in the past but not adopted.
~~~
8.
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0 && opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
Saying "> 0" (in the condition) is not strictly necessary here, since it is never < 0.
This check is necessary.
For example, imagine a case where we CREATE a subscription with streaming = on
and then try to ALTER the subscription with streaming = parallel
without any setting for min_apply_delay. Without the check, the ALTER command
would then throw the error "min_apply_delay > 0 and streaming = parallel are
mutually exclusive options."
This is because min_apply_delay is supported by the ALTER command
(so the first condition becomes true) and we set
streaming = parallel (which makes the 2nd condition true).
So, we need to check the actual min_apply_delay value in opts
to make the irrelevant case pass.
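The argument can be illustrated with a standalone predicate. This is a minimal sketch, not the patch's code: the flag value, the enum, and the function name are hypothetical stand-ins for SUBOPT_MIN_APPLY_DELAY, the LOGICALREP_STREAM_* values, and the check in parse_subscription_options.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real option bit and streaming modes. */
#define SUBOPT_MIN_APPLY_DELAY 0x0001

typedef enum
{
    STREAM_OFF,
    STREAM_ON,
    STREAM_PARALLEL
} StreamMode;

/*
 * Returns true when the option combination must be rejected. Dropping the
 * "min_apply_delay_ms > 0" test would reject every command that supports
 * min_apply_delay and asks for streaming = parallel, even when no delay
 * was requested at all.
 */
static bool
delay_conflicts_with_parallel(uint32_t supported_opts,
                              int32_t min_apply_delay_ms,
                              StreamMode streaming)
{
    assert(min_apply_delay_ms >= 0);

    return (supported_opts & SUBOPT_MIN_APPLY_DELAY) != 0 &&
        min_apply_delay_ms > 0 &&
        streaming == STREAM_PARALLEL;
}
```

An ALTER that only sets streaming = parallel arrives here with a delay of 0 and passes, while a delay of 123 together with parallel streaming is rejected.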
~~~
9. AlterSubscription
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options for details of the reason.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
+ if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay > 0) ||
+ (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0))
Saying "> 0" (in the condition) is not strictly necessary here, since it is never < 0.
This is also necessary.
For example, imagine there is a subscription whose min_apply_delay is 1 day.
Then you try to execute ALTER SUBSCRIPTION
with (min_apply_delay = 0, streaming = parallel).
If we remove the condition opts.min_apply_delay > 0,
then we error out in this case too.
First, we pass the condition
opts.streaming == LOGICALREP_STREAM_PARALLEL,
since we specify the streaming option.
Then, because we also set min_apply_delay in this example,
the second condition
(IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY)) becomes true
without the value of min_apply_delay being checked.
So, we need to make this case (min_apply_delay = 0) pass.
Meanwhile, checking the "sub" value is necessary to account for the existing subscription's value.
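A compact way to read that condition: when streaming = parallel is requested, the delay that matters is the newly specified one if present, otherwise the one stored in the catalog. The sketch below uses hypothetical names and plain parameters instead of the real opts and sub structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical sketch of the AlterSubscription check for the case where
 * the command requests streaming = parallel. Returns true when the ALTER
 * must fail because a non-zero delay would remain in effect.
 */
static bool
parallel_rejected_on_alter(bool delay_specified, int32_t new_delay_ms,
                           int32_t catalog_delay_ms)
{
    assert(new_delay_ms >= 0 && catalog_delay_ms >= 0);

    /* A delay given in the same command overrides the stored value. */
    if (delay_specified)
        return new_delay_ms > 0;

    /* Otherwise the subscription's existing delay decides. */
    return catalog_delay_ms > 0;
}
```

With a stored delay of one day, (min_apply_delay = 0, streaming = parallel) is accepted, whereas streaming = parallel on its own is refused.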
~~~
10.
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed.
+ */
+ if (opts.min_apply_delay > 0)
Saying "> 0" (in the condition) is not strictly necessary here, since it is never < 0.
This check is also required, to distinguish whether the value is 0 or not.
Kindly imagine a case where we want to ALTER a subscription whose
min_apply_delay is 1 day with the pair (min_apply_delay = 0,
streaming = parallel). If we remove this check, then this ALTER command fails
with an error: without the check, setting min_apply_delay together with
parallel streaming mode raises the error even when min_apply_delay
is set to 0.
The check for sub->stream is necessary to account for the existing definition of the target subscription.
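Both ALTER checks (items 9 and 10) can be viewed as one rule: compute the settings that would be in effect after the command and reject the command if they combine a non-zero delay with parallel streaming. The following is a sketch of that alternative formulation, with hypothetical names throughout; it is not how the patch structures the code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum
{
    STREAM_OFF,
    STREAM_ON,
    STREAM_PARALLEL
} StreamMode;                   /* stand-in for LOGICALREP_STREAM_* */

/*
 * Hypothetical combined check: each option takes its new value when the
 * command specifies it and its catalog value otherwise.
 */
static bool
alter_leaves_invalid_combination(bool delay_specified, int32_t new_delay_ms,
                                 bool streaming_specified,
                                 StreamMode new_streaming,
                                 int32_t cur_delay_ms,
                                 StreamMode cur_streaming)
{
    int32_t     delay_ms = delay_specified ? new_delay_ms : cur_delay_ms;
    StreamMode  streaming = streaming_specified ? new_streaming : cur_streaming;

    assert(delay_ms >= 0);

    return delay_ms > 0 && streaming == STREAM_PARALLEL;
}
```

This matches the regression test sequence in the patch: SET (min_apply_delay = 0, streaming = parallel) succeeds, and a following SET (min_apply_delay = 123) fails because parallel streaming is now the stored mode.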
~~~
11. defGetMinApplyDelay
+ /*
+ * Check lower bound. parse_int() has already been confirmed that result
+ * is less than or equal to INT_MAX.
+ */
The parse_int already checks < INT_MAX. But on return from that function,
don’t you need to check again that it is < PG_INT32_MAX (in case those are
different)
(I think Kuroda-san already suggested same as this)
Changed according to the discussion.
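To make the bounds question concrete, here is a simplified, hypothetical stand-in for the conversion step. It is not parse_int() or defGetMinApplyDelay(); it only handles a few time units and assumes the numeric value and unit have already been split. It shows a lower-bound check plus an explicit check against the int32 range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: convert "value unit" to milliseconds and store it
 * in an int32. Returns false for an unknown unit, a negative value, or a
 * result that does not fit in int32.
 */
static bool
delay_to_ms(int64_t value, const char *unit, int32_t *result_ms)
{
    int64_t multiplier;

    assert(unit != NULL && result_ms != NULL);

    if (strcmp(unit, "") == 0 || strcmp(unit, "ms") == 0)
        multiplier = 1;         /* no unit means milliseconds */
    else if (strcmp(unit, "s") == 0)
        multiplier = 1000;
    else if (strcmp(unit, "min") == 0)
        multiplier = 60 * 1000;
    else if (strcmp(unit, "h") == 0)
        multiplier = 60 * 60 * 1000;
    else if (strcmp(unit, "d") == 0)
        multiplier = 24 * 60 * 60 * 1000;
    else
        return false;

    /* Lower bound: the delay must be non-negative. */
    if (value < 0)
        return false;

    /* Upper bound: check against the int32 range before multiplying. */
    if (value > INT32_MAX / multiplier)
        return false;

    *result_ms = (int32_t) (value * multiplier);
    return true;
}
```

Under these assumptions '1 d' becomes 86400000, a bare 123 stays 123, and -1 or 25 days are rejected.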
======
src/backend/replication/logical/worker.c
12.
+/*
+ * In order to avoid walsender timeout for time-delayed logical replication the
+ * apply worker keeps sending feedback messages during the delay period.
+ * Meanwhile, the feature delays the apply before the start of the
+ * transaction and thus we don't write WAL records for the suspended changes
+ * during the wait. When the apply worker sends a feedback message during the
+ * delay, we should not make positions of the flushed and apply LSN overwritten
+ * by the last received latest LSN. See send_feedback() for details.
+ */
"we should not make positions of the flushed and apply LSN overwritten" -->
"we should overwrite positions of the flushed and apply LSN"
Fixed. I added "not" in your suggestion, too.
~~~
14. send_feedback
@@ -3738,8 +3867,15 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
 /*
 * No outstanding transactions to flush, we can report the latest received
 * position. This is important for synchronous replication.
+ *
+ * If the logical replication subscription has unprocessed changes then do
+ * not inform the publisher that the received latest LSN is already
+ * applied and flushed, otherwise, the publisher will make a wrong
+ * assumption about the logical replication progress. Instead, it just
+ * sends a feedback message to avoid a replication timeout during the
+ * delay.
 */
"Instead, it just sends" --> "Instead, just send"
Fixed.
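The behaviour that comment describes can be sketched like this. The types, the function name, and the delaying_apply flag are all hypothetical; the real send_feedback() works on XLogRecPtr values and worker state that are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t Lsn;           /* stand-in for XLogRecPtr */

typedef struct
{
    Lsn         write;
    Lsn         flush;
} FeedbackPositions;

/*
 * Hypothetical sketch: pick the positions to report to the publisher.
 * Only when nothing is pending and no delayed transaction is waiting may
 * the latest received position be reported as written and flushed.
 */
static FeedbackPositions
choose_feedback_positions(Lsn recvpos, Lsn local_write, Lsn local_flush,
                          bool have_pending_txes, bool delaying_apply)
{
    FeedbackPositions pos = {local_write, local_flush};

    /* Sketch assumption: we never flushed more than we received. */
    assert(local_flush <= recvpos);

    if (!have_pending_txes && !delaying_apply)
    {
        pos.write = recvpos;
        pos.flush = recvpos;
    }

    return pos;
}
```

During a delay the message is still sent, which keeps the walsender from timing out, but the reported flush position stays at what has really been applied locally.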
======
src/bin/pg_dump/pg_dump.h
15. SubscriptionInfo
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int subminapplydelay;
char *subpublications;
} SubscriptionInfo;
Should this also be "int32" to match the other member type changes?
This is intentional.
In the context of pg_dump, we are treating
this same as other int32 catalog members.
So, I'd like to keep the current code.
======
src/test/subscription/t/032_apply_delay.pl
16.
+# Make sure the apply worker knows to wait for more than 500ms
+check_apply_delay_log($node_subscriber, $offset, "0.5");
"knows to wait for more than" --> "waits for more than"
(this occurs in a couple of places)
Fixed.
Kindly have a look at v26 shared in [1].
[1]: /messages/by-id/TYCPR01MB83730A45925B9680C40D92AFEDD69@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
Hi,
On Wednesday, February 1, 2023 6:41 PM Shi, Yu/侍 雨 <shiy.fnst@fujitsu.com> wrote:
On Mon, Jan 30, 2023 6:05 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
On Saturday, January 28, 2023 1:28 PM I wrote:
Attached the updated patch v24.
Hi,
I've rebased the patch on top of commit 1e8b61735c, renaming the GUC
to logical_replication_mode accordingly, because it's
used in the TAP test of this time-delayed LR feature.
There is no other change in this version.
Kindly have a look at the attached v25.
Thanks for your patch. Here are some comments.
Thank you for your review !
2.
+# Make sure the apply worker knows to wait for more than 500ms
+check_apply_delay_log($node_subscriber, $offset, "0.5");
I think the last parameter should be 500.
Good catch ! Fixed.
Besides, I am not sure it's a stable test to check the log. Is it possible that there's
no such log on a slow machine? I modified the code to sleep 1s at the beginning
of apply_dispatch(), then the new added test failed because the server log
cannot match.
Checking the log itself is necessary to ensure
that the delay is actually performed by the apply worker, because we emit the diffms
only if it's bigger than 0 in maybe_apply_delay(). If we omit that step,
we cannot tell whether the delay was caused by the time-delayed feature or by something else.
As you mentioned, it's possible that no log is emitted on a slow machine. In that case,
the way to make the test safer for such machines would be to make the delay longer.
But we shortened the delay to 1 second to reduce the long execution time of this TAP test.
So, I'm not sure if it's a good idea to make it longer again.
Please have a look at the latest v26 in [1].
[1]: /messages/by-id/TYCPR01MB83730A45925B9680C40D92AFEDD69@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
On Thu, Feb 2, 2023 at 7:21 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
...
Besides, I am not sure it's a stable test to check the log. Is it possible that there's
no such log on a slow machine? I modified the code to sleep 1s at the beginning
of apply_dispatch(), then the new added test failed because the server log
cannot match.
To get the log by itself is necessary to ensure
that the delay is conducted by the apply worker, because we emit the diffms
only if it's bigger than 0 in maybe_apply_delay(). If we omit the step,
we are not sure the delay is caused by other reasons or the time-delayed feature.
As you mentioned, it's possible that no log is emitted on slow machine. Then,
the idea to make the test safer for such machines should be to make the delayed time longer.
But we shortened the delay time to 1 second to mitigate the long test execution time of this TAP test.
So, I'm not sure if it's a good idea to make it longer again.
I think there are a couple of things that can be done about this problem:
1. If you need the code/test to remain as-is then at least the test
message could include some comforting text like "(this can fail on
slow machines when the delay time is already exceeded)" so then a test
failure will not cause undue alarm.
2. Try moving the DEBUG2 elog (in function maybe_apply_delay) so that
it will *always* log the remaining wait time even if that wait time
becomes negative. Then I think the test cases can be made
deterministic instead of relying on good luck. This seems like the
better option.
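Option 2 can be sketched as follows: format the debug message unconditionally, then decide whether to wait. This is an illustration only. The function name and buffer-based logging are hypothetical, and the message text is copied from the pattern that the TAP test searches for.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical sketch: write the debug line first, even when the remaining
 * wait time is zero or negative, and return the remaining time so the
 * caller waits only when it is positive.
 */
static int64_t
log_remaining_delay(char *buf, size_t buflen, uint32_t txid,
                    int32_t min_apply_delay_ms, int64_t elapsed_ms)
{
    int64_t     remaining_ms = (int64_t) min_apply_delay_ms - elapsed_ms;

    assert(buf != NULL && buflen > 0);

    snprintf(buf, buflen,
             "time-delayed replication for txid %u, min_apply_delay = %d ms, "
             "remaining wait time: %lld ms",
             (unsigned int) txid, (int) min_apply_delay_ms,
             (long long) remaining_ms);

    return remaining_ms;
}
```

One consequence of this approach is that the test's pattern, which captures the remaining time with (\d+), would also have to accept a leading minus sign for the always-logged case.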
------
Kind Regards,
Peter Smith.
Fujitsu Australia
Here are my review comments for patch v26-0001.
On Thu, Feb 2, 2023 at 7:18 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
Hi,
On Wednesday, February 1, 2023 1:37 PM Peter Smith <smithpb2250@gmail.com> wrote:
Here are my review comments for the patch v25-0001.
Thank you for your review !
8.
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0 && opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
Saying "> 0" (in the condition) is not strictly necessary here, since it is never < 0.
This check is necessary.
For example, imagine a case when we CREATE a subscription with streaming = on
and then try to ALTER the subscription with streaming = parallel
without any settings for min_apply_delay. The ALTER command
throws an error of "min_apply_delay > 0 and streaming = parallel are
mutually exclusive options." then.
This is because min_apply_delay is supported by ALTER command
(so the first condition becomes true) and we set
streaming = parallel (which makes the 2nd condition true).
So, we need to check the opts's actual min_apply_delay value
to make the irrelevant case pass.
I think there is some misunderstanding. I was not suggesting removing
the condition -- only that I thought it could be written without the >
0 as:
if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
opts->min_apply_delay && opts->streaming == LOGICALREP_STREAM_PARALLEL)
ereport(ERROR,
~~~
9. AlterSubscription
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options for details of the reason.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
+ if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay > 0) ||
+ (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0))
Saying "> 0" (in the condition) is not strictly necessary here, since it is never < 0.
This is also necessary.
For example, imagine a case that
there is a subscription whose min_apply_delay is 1 day.
Then, you want to try to execute ALTER SUBSCRIPTION
with (min_apply_delay = 0, streaming = parallel).
If we remove the condition of opts.min_apply_delay > 0,
then we error out in this case too.
First we pass the first condition
of the opts.streaming == LOGICALREP_STREAM_PARALLEL,
since we use streaming option.
Then, we also set min_apply_delay in this example,
then without checking the value of min_apply_delay,
the second condition becomes true
(IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY)).
So, we need to make this case(min_apply_delay = 0) pass.
Meanwhile, checking the "sub" value is necessary for checking existing subscription value.
I think there is some misunderstanding. I was not suggesting removing
the condition -- only that I thought it could be written without the >
0 as:
if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) &&
opts.min_apply_delay) ||
(!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay))
ereport(ERROR,
~~~
10.
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed.
+ */
+ if (opts.min_apply_delay > 0)
Saying "> 0" (in the condition) is not strictly necessary here, since it is never < 0.
This is also required to check the value equals to 0 or not.
Kindly imagine a case when we want to execute ALTER min_apply_delay from 1day
with a pair of (min_apply_delay = 0 and
streaming = parallel). If we remove this check, then this ALTER command fails
with error. Without the check, when we set min_apply_delay
and parallel streaming mode, even when making the min_apply_delay 0,
the error is invoked.
The check for sub.stream is necessary for existing definition of target subscription.
I think there is some misunderstanding. I was not suggesting removing
the condition -- only that I thought it could be written without the >
0 as:
if (opts.min_apply_delay)
if ((IsSet(opts.specified_opts, SUBOPT_STREAMING) && opts.streaming ==
LOGICALREP_STREAM_PARALLEL) ||
(!IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream ==
LOGICALREP_STREAM_PARALLEL))
ereport(ERROR,
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Fri, Feb 3, 2023 at 6:41 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Thu, Feb 2, 2023 at 7:21 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
...
Besides, I am not sure it's a stable test to check the log. Is it possible that there's
no such log on a slow machine? I modified the code to sleep 1s at the beginning
of apply_dispatch(), then the new added test failed because the server log
cannot match.
To get the log by itself is necessary to ensure
that the delay is conducted by the apply worker, because we emit the diffms
only if it's bigger than 0 in maybe_apply_delay(). If we omit the step,
we are not sure the delay is caused by other reasons or the time-delayed feature.
As you mentioned, it's possible that no log is emitted on slow machine. Then,
the idea to make the test safer for such machines should be to make the delayed time longer.
But we shortened the delay time to 1 second to mitigate the long test execution time of this TAP test.
So, I'm not sure if it's a good idea to make it longer again.
I think there are a couple of things that can be done about this problem:
1. If you need the code/test to remain as-is then at least the test
message could include some comforting text like "(this can fail on
slow machines when the delay time is already exceeded)" so then a test
failure will not cause undue alarm.
2. Try moving the DEBUG2 elog (in function maybe_apply_delay) so that
it will *always* log the remaining wait time even if that wait time
becomes negative. Then I think the test cases can be made
deterministic instead of relying on good luck. This seems like the
better option.
I don't understand why we have to do any of this instead of using 3s
as min_apply_delay similar to what we are doing in
src/test/recovery/t/005_replay_delay. Also, I think we should use
exactly the same way to verify the test even though we want to keep
the log level as DEBUG2 to check logs in case of any failures.
Also, I don't see the need to add more tests like the ones below:
+# Test whether ALTER SUBSCRIPTION changes the delayed time of the apply worker
+# (1 day 5 minutes). Note that the extra 5 minute is to account for any
+# decoding/network overhead.
Let's try to add tests similar to what we have for
recovery_min_apply_delay unless there is some functionality in this
patch that is not there in the recovery_min_apply_delay feature.
--
With Regards,
Amit Kapila.
On Fri, Feb 3, 2023 at 8:02 AM Peter Smith <smithpb2250@gmail.com> wrote:
I think there is some misunderstanding. I was not suggesting removing
the condition -- only that I thought it could be written without the >
0 as:
if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
opts->min_apply_delay && opts->streaming == LOGICALREP_STREAM_PARALLEL)
ereport(ERROR,
Yeah, we can probably write that way but in the error message we are
already using > 0, so the current style used by patch seems good to
me. Also, I think using the way you are suggesting is more apt for
booleans.
--
With Regards,
Amit Kapila.
On Fri, Feb 3, 2023 at 4:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Feb 3, 2023 at 6:41 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Thu, Feb 2, 2023 at 7:21 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
...
Besides, I am not sure it's a stable test to check the log. Is it possible that there's
no such log on a slow machine? I modified the code to sleep 1s at the beginning
of apply_dispatch(), then the new added test failed because the server log
cannot match.
To get the log by itself is necessary to ensure
that the delay is conducted by the apply worker, because we emit the diffms
only if it's bigger than 0 in maybe_apply_delay(). If we omit the step,
we are not sure the delay is caused by other reasons or the time-delayed feature.
As you mentioned, it's possible that no log is emitted on slow machine. Then,
the idea to make the test safer for such machines should be to make the delayed time longer.
But we shortened the delay time to 1 second to mitigate the long test execution time of this TAP test.
So, I'm not sure if it's a good idea to make it longer again.
I think there are a couple of things that can be done about this problem:
1. If you need the code/test to remain as-is then at least the test
message could include some comforting text like "(this can fail on
slow machines when the delay time is already exceeded)" so then a test
failure will not cause undue alarm.
2. Try moving the DEBUG2 elog (in function maybe_apply_delay) so that
it will *always* log the remaining wait time even if that wait time
becomes negative. Then I think the test cases can be made
deterministic instead of relying on good luck. This seems like the
better option.
I don't understand why we have to do any of this instead of using 3s
as min_apply_delay similar to what we are doing in
src/test/recovery/t/005_replay_delay. Also, I think we should use
exactly the same way to verify the test even though we want to keep
the log level as DEBUG2 to check logs in case of any failures.
IIUC the reasons are due to conflicting requirements. e.g.
- A longer delay like 3s might work better for testing this feature, but OTOH
- A longer delay will also cause the whole BF execution to take longer
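To illustrate why the log check can miss on a slow machine, here is a minimal sketch of the remaining-wait arithmetic. It assumes, as described in the quoted text, that the apply worker only emits the DEBUG2 message when the remaining time is positive. The function names and the plain millisecond timestamps are hypothetical simplifications, not the patch's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: remaining wait, in milliseconds, before a transaction
 * that finished at finish_ts_ms on the publisher may be applied.
 * All arguments are plain millisecond counts, not PostgreSQL TimestampTz.
 */
int64_t
remaining_delay_ms(int64_t finish_ts_ms, int64_t now_ms,
                   int32_t min_apply_delay_ms)
{
    return (finish_ts_ms + (int64_t) min_apply_delay_ms) - now_ms;
}

/* Assumed rule from the discussion: the wait is logged only when positive. */
bool
delay_would_be_logged(int64_t finish_ts_ms, int64_t now_ms,
                      int32_t min_apply_delay_ms)
{
    return remaining_delay_ms(finish_ts_ms, now_ms, min_apply_delay_ms) > 0;
}
```

With a 1s delay and 1.2s of decoding/transfer overhead, remaining_delay_ms(0, 1200, 1000) is -200, so nothing would be logged and a log-matching test fails. With a 3s delay the same overhead leaves 1800 ms, so the message appears.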
------
Kind Regards,
Peter Smith.
Fujitsu Australia.
On Fri, Feb 3, 2023 at 11:12 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Fri, Feb 3, 2023 at 4:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Feb 3, 2023 at 6:41 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Thu, Feb 2, 2023 at 7:21 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
...
Besides, I am not sure it's a stable test to check the log. Is it possible that there's
no such log on a slow machine? I modified the code to sleep 1s at the beginning
of apply_dispatch(), then the new added test failed because the server log
cannot match.
To get the log by itself is necessary to ensure
that the delay is conducted by the apply worker, because we emit the diffms
only if it's bigger than 0 in maybe_apply_delay(). If we omit the step,
we are not sure the delay is caused by other reasons or the time-delayed feature.
As you mentioned, it's possible that no log is emitted on slow machine. Then,
the idea to make the test safer for such machines should be to make the delayed time longer.
But we shortened the delay time to 1 second to mitigate the long test execution time of this TAP test.
So, I'm not sure if it's a good idea to make it longer again.
I think there are a couple of things that can be done about this problem:
1. If you need the code/test to remain as-is then at least the test
message could include some comforting text like "(this can fail on
slow machines when the delay time is already exceeded)" so then a test
failure will not cause undue alarm.
2. Try moving the DEBUG2 elog (in function maybe_apply_delay) so that
it will *always* log the remaining wait time even if that wait time
becomes negative. Then I think the test cases can be made
deterministic instead of relying on good luck. This seems like the
better option.
I don't understand why we have to do any of this instead of using 3s
as min_apply_delay similar to what we are doing in
src/test/recovery/t/005_replay_delay. Also, I think we should use
exactly the same way to verify the test even though we want to keep
the log level as DEBUG2 to check logs in case of any failures.
IIUC the reasons are due to conflicting requirements. e.g.
- A longer delay like 3s might work better for testing this feature, but OTOH
- A longer delay will also cause the whole BF execution to take longer
Sure, but we already have the same test for a similar feature and it
seems to be a proven reliable way to test the feature. We do seem to
have seen buildfarm failures for tests related to
recovery_min_apply_delay and the current way is quite stable, so I
would prefer to go with that.
--
With Regards,
Amit Kapila.
Hi,
On Friday, February 3, 2023 2:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Feb 3, 2023 at 6:41 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Thu, Feb 2, 2023 at 7:21 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
...
Besides, I am not sure it's a stable test to check the log. Is it
possible that there's no such log on a slow machine? I modified
the code to sleep 1s at the beginning of apply_dispatch(), then
the new added test failed because the server log cannot match.
To get the log by itself is necessary to ensure that the delay is
conducted by the apply worker, because we emit the diffms only if
it's bigger than 0 in maybe_apply_delay(). If we omit the step, we
are not sure the delay is caused by other reasons or the time-delayed feature.
As you mentioned, it's possible that no log is emitted on slow
machine. Then, the idea to make the test safer for such machines should be to make the delayed time longer.
But we shortened the delay time to 1 second to mitigate the long test
execution time of this TAP test.
So, I'm not sure if it's a good idea to make it longer again.
I think there are a couple of things that can be done about this problem:
1. If you need the code/test to remain as-is then at least the test
message could include some comforting text like "(this can fail on
slow machines when the delay time is already exceeded)" so then a test
failure will not cause undue alarm.
2. Try moving the DEBUG2 elog (in function maybe_apply_delay) so that
it will *always* log the remaining wait time even if that wait time
becomes negative. Then I think the test cases can be made
deterministic instead of relying on good luck. This seems like the
better option.
I don't understand why we have to do any of this instead of using 3s as
min_apply_delay similar to what we are doing in
src/test/recovery/t/005_replay_delay. Also, I think we should use exactly the
same way to verify the test even though we want to keep the log level as
DEBUG2 to check logs in case of any failures.
OK, will try to make our tests similar to the tests in 005_replay_delay
as much as possible.
Also, I don't see the need to add more tests like the ones below:
+# Test whether ALTER SUBSCRIPTION changes the delayed time of the apply worker
+# (1 day 5 minutes). Note that the extra 5 minute is to account for any
+# decoding/network overhead.
Let's try to add tests similar to what we have for recovery_min_apply_delay
unless there is some functionality in this patch that is not there in the
recovery_min_apply_delay feature.
The above command is preparation for checking a behavior unique to time-delayed
logical replication: disabling a subscription causes the apply worker not to apply
the suspended (delayed) transaction. So, it should be OK to keep this test.
Best Regards,
Takamichi Osumi
On Thurs, Feb 2, 2023 16:04 PM Takamichi Osumi (Fujitsu) <osumi.takamichi@fujitsu.com> wrote:
Attached the updated patch v26 accordingly.
Thanks for your patch.
Here is a comment:
1. The checks in function AlterSubscription
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options for details of the reason.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
+ if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay > 0) ||
+ (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0))
and
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed.
+ */
+ if (opts.min_apply_delay > 0)
+ if ((IsSet(opts.specified_opts, SUBOPT_STREAMING) && opts.streaming == LOGICALREP_STREAM_PARALLEL) ||
+ (!IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL))
I think the case where the options "min_apply_delay>0" and "streaming=parallel"
are set at the same time seems to have been checked in the function
parse_subscription_options, how about simplifying these two if-statements here
to the following:
```
if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
    !IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) &&
    sub->minapplydelay > 0)
```
and
```
if (opts.min_apply_delay > 0 &&
    !IsSet(opts.specified_opts, SUBOPT_STREAMING) &&
    sub->stream == LOGICALREP_STREAM_PARALLEL)
```
Regards,
Wang Wei
On Fri, Feb 3, 2023 at 3:12 PM wangw.fnst@fujitsu.com
<wangw.fnst@fujitsu.com> wrote:
Here is a comment:
1. The checks in function AlterSubscription
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options for details of the reason.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
+ if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay > 0) ||
+ (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0))
and
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed.
+ */
+ if (opts.min_apply_delay > 0)
+ if ((IsSet(opts.specified_opts, SUBOPT_STREAMING) && opts.streaming == LOGICALREP_STREAM_PARALLEL) ||
+ (!IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL))
I think the case where the options "min_apply_delay>0" and "streaming=parallel"
are set at the same time seems to have been checked in the function
parse_subscription_options, how about simplifying these two if-statements here
to the following:
```
if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
    !IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) &&
    sub->minapplydelay > 0)
```
and
```
if (opts.min_apply_delay > 0 &&
    !IsSet(opts.specified_opts, SUBOPT_STREAMING) &&
    sub->stream == LOGICALREP_STREAM_PARALLEL)
```
Won't just checking if ((opts.streaming == LOGICALREP_STREAM_PARALLEL
&& sub->minapplydelay > 0) || (opts.min_apply_delay > 0 && sub->stream
== LOGICALREP_STREAM_PARALLEL)) be sufficient in that case?
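One way to compare the two proposals is to model them as pure predicates. The sketch below is only an illustration under stated assumptions: the struct and function names are hypothetical stand-ins rather than PostgreSQL definitions, and it assumes an option that is not specified in the ALTER command holds its default (streaming off, delay 0).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the patch's types, for illustration only. */
typedef enum { STREAM_OFF, STREAM_ON, STREAM_PARALLEL } StreamMode;

typedef struct
{
    bool        streaming_given;    /* models IsSet(..., SUBOPT_STREAMING) */
    StreamMode  streaming;          /* STREAM_OFF when not given (assumed) */
    bool        delay_given;        /* models IsSet(..., SUBOPT_MIN_APPLY_DELAY) */
    int         min_apply_delay;    /* 0 when not given */
} AlterOpts;

typedef struct
{
    StreamMode  stream;             /* value currently stored in the catalog */
    int         minapplydelay;
} StoredSub;

/* Ground truth: would the subscription end up parallel with a delay? */
bool
final_state_conflicts(const AlterOpts *o, const StoredSub *s)
{
    StreamMode  stream = o->streaming_given ? o->streaming : s->stream;
    int         delay = o->delay_given ? o->min_apply_delay : s->minapplydelay;

    return stream == STREAM_PARALLEL && delay > 0;
}

/* The two if-statements that keep the !IsSet() tests. */
bool
check_with_isset(const AlterOpts *o, const StoredSub *s)
{
    bool        a = o->streaming_given && o->streaming == STREAM_PARALLEL &&
                    !o->delay_given && s->minapplydelay > 0;
    bool        b = o->delay_given && o->min_apply_delay > 0 &&
                    !o->streaming_given && s->stream == STREAM_PARALLEL;

    return a || b;
}

/* The single condition without the !IsSet() tests. */
bool
check_without_isset(const AlterOpts *o, const StoredSub *s)
{
    return (o->streaming == STREAM_PARALLEL && s->minapplydelay > 0) ||
           (o->min_apply_delay > 0 && s->stream == STREAM_PARALLEL);
}
```

Under these assumptions the two predicates differ when one command changes both options: for a subscription stored with a 1000 ms delay, SET (streaming = parallel, min_apply_delay = 0) leaves a valid final state, yet check_without_isset() still reports a conflict while check_with_isset() does not.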
--
With Regards,
Amit Kapila.
Hi,
On Friday, February 3, 2023 3:35 PM I wrote:
On Friday, February 3, 2023 2:21 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:
On Fri, Feb 3, 2023 at 6:41 AM Peter Smith <smithpb2250@gmail.com>
wrote:
On Thu, Feb 2, 2023 at 7:21 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
...
Besides, I am not sure it's a stable test to check the log. Is
it possible that there's no such log on a slow machine? I
modified the code to sleep 1s at the beginning of
apply_dispatch(), then the new added test failed because the server log cannot match.
To get the log by itself is necessary to ensure that the delay is
conducted by the apply worker, because we emit the diffms only if
it's bigger than 0 in maybe_apply_delay(). If we omit the step, we
are not sure the delay is caused by other reasons or the
time-delayed feature.
As you mentioned, it's possible that no log is emitted on slow
machine. Then, the idea to make the test safer for such machines
should be to make the delayed time longer.
But we shortened the delay time to 1 second to mitigate the long
test execution time of this TAP test.
So, I'm not sure if it's a good idea to make it longer again.
I think there are a couple of things that can be done about this problem:
1. If you need the code/test to remain as-is then at least the test
message could include some comforting text like "(this can fail on
slow machines when the delay time is already exceeded)" so then a
test failure will not cause undue alarm.
2. Try moving the DEBUG2 elog (in function maybe_apply_delay) so
that it will *always* log the remaining wait time even if that wait
time becomes negative. Then I think the test cases can be made
deterministic instead of relying on good luck. This seems like the
better option.
I don't understand why we have to do any of this instead of using 3s
as min_apply_delay similar to what we are doing in
src/test/recovery/t/005_replay_delay. Also, I think we should use
exactly the same way to verify the test even though we want to keep
the log level as
DEBUG2 to check logs in case of any failures.
OK, will try to make our tests similar to the tests in 005_replay_delay as much
as possible.
I've updated the TAP test and aligned it with 005_replay_delay.pl.
For coverage, I kept the test case for streaming of in-progress transactions
and the ALTER SUBSCRIPTION DISABLE behavior, which is unique to logical replication.
I also ran pgindent and pgperltidy. Note that the latter half of
005_replay_delay.pl (e.g. promotion) doesn't apply to time-delayed logical replication,
so I did not include those parts.
Kindly have a look at the attached v27.
Best Regards,
Takamichi Osumi
Attachments:
v27-0001-Time-delayed-logical-replication-subscriber.patch (application/octet-stream)
From de905135f0c33de8800fb35ec4e957f26adb96ae Mon Sep 17 00:00:00 2001
From: Takamichi Osumi <osumi.takamichi@fujitsu.com>
Date: Sat, 4 Feb 2023 05:26:14 +0000
Subject: [PATCH v27] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_apply_delay'.
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction apply for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs before we start to apply the transaction on the
subscriber. The main reason is to avoid keeping a transaction open for
a long time. Regular and prepared transactions are covered. Streamed
transactions are also covered.
The combination of parallel streaming mode and min_apply_delay is not
allowed. This is because we start applying the transaction stream as
soon as the first change arrives without knowing the transaction's
prepare/commit time. This means we cannot calculate the underlying
network/decoding lag between publisher and subscriber, and so always
waiting for the full 'min_apply_delay' period might include
unnecessary delay.
The other possibility was to apply the delay at the end of the parallel
apply transaction but that would cause issues related to resource
bloat and locks being held for a long time.
Note that this feature doesn't interact with skip transaction feature.
The skip transaction feature applies to one transaction with a specific LSN.
So, even if the skipped transaction and non-skipped transaction come
consecutively in a very short time, regardless of the order of which comes
first, the time-delayed feature gets balanced by delayed application
for other transactions before and after the skipped transaction.
Author: Euler Taveira, Takamichi Osumi, Kuroda Hayato
Reviewed-by: Amit Kapila, Peter Smith, Vignesh C, Shveta Malik,
Kyotaro Horiguchi, Shi Yu, Wang Wei, Dilip Kumar, Melih Mutlu
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/config.sgml | 12 ++
doc/src/sgml/glossary.sgml | 14 ++
doc/src/sgml/logical-replication.sgml | 6 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 49 ++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 118 +++++++++++-
.../replication/logical/applyparallelworker.c | 3 +-
src/backend/replication/logical/worker.c | 165 ++++++++++++++--
src/bin/pg_dump/pg_dump.c | 15 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/replication/worker_internal.h | 2 +-
src/test/regress/expected/subscription.out | 181 +++++++++++-------
src/test/regress/sql/subscription.sql | 24 +++
src/test/subscription/meson.build | 1 +
src/test/subscription/t/032_apply_delay.pl | 134 +++++++++++++
21 files changed, 659 insertions(+), 104 deletions(-)
create mode 100644 src/test/subscription/t/032_apply_delay.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..925a6ebb12 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7873,6 +7873,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum amount of time to delay applying changes, in milliseconds.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d190be1925..626a8b5bd0 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4787,6 +4787,18 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
the <filename>postgresql.conf</filename> file or on the server
command line.
</para>
+ <para>
+ For time-delayed logical replication, the apply worker sends a feedback
+ message to the publisher every
+ <varname>wal_receiver_status_interval</varname> milliseconds. Make sure
+ to set <varname>wal_receiver_status_interval</varname> less than the
+ <varname>wal_sender_timeout</varname> on the publisher, otherwise, the
+ <literal>walsender</literal> will repeatedly terminate due to timeout
+ errors. Note that if <varname>wal_receiver_status_interval</varname> is
+ set to zero, the apply worker sends no feedback messages during the
+ <literal>min_apply_delay</literal> period. Refer to
+ <xref linkend="sql-createsubscription"/> for more information.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 7c01a541fe..6ed6fa5853 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1729,6 +1729,20 @@
</glossdef>
</glossentry>
+ <glossentry id="glossary-time-delayed-replication">
+ <glossterm>Time-delayed replication</glossterm>
+ <glossdef>
+ <para>
+ Replication setup that applies time-delayed copy of the data.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="guc-recovery-min-apply-delay"/> for physical replication
+ and <xref linkend="sql-createsubscription"/> for logical replication.
+ </para>
+ </glossdef>
+ </glossentry>
+
<glossentry id="glossary-toast">
<glossterm>TOAST</glossterm>
<glossdef>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 1bd5660c87..6bd5f61e2b 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,12 @@
target table.
</para>
+ <para>
+ A subscription can delay the application of changes by specifying the
+ <literal>min_apply_delay</literal> subscription parameter. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 964fcbb8ff..8b7eb28e54 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 51c45f17c7..1b4b8390af 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,49 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. This
+ parameter allows the user to delay the application of changes by a
+ given time period. If the value is specified without units, it is
+ taken as milliseconds. The default is zero (no delay). See
+ <xref linkend="config-setting-names-values"/> for details on the
+ available valid time units.
+ </para>
+ <para>
+ Any delay becomes effective only after all initial table
+ synchronization has finished and occurs before each transaction starts
+ to get applied on the subscriber. The delay is calculated as the
+ difference between the WAL timestamp as written on the publisher and
+ the current time on the subscriber. Any overhead of time spent in
+ logical decoding and in transferring the transaction may reduce the
+ actual wait time. It is also possible that the overhead already
+ exceeds the requested <literal>min_apply_delay</literal> value, in
+ which case no delay is applied. If the system clocks on publisher and
+ subscriber are not synchronized, this may lead to apply changes
+ earlier than expected, but this is not a major issue because this
+ parameter is typically much larger than the time deviations between
+ servers. Note that if this parameter is set to a long delay, the
+ replication will stop if the replication slot falls behind the current
+ LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication means there is a much longer time between
+ making a change on the publisher, and that change being committed
+ on the subscriber. This can impact the performance of synchronous
+ replication. See <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -420,6 +462,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_apply_delay</literal> parameter is not allowed when
+ streaming in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..c767cc1c3a 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -64,6 +64,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->oid = subid;
sub->dbid = subform->subdbid;
sub->skiplsn = subform->subskiplsn;
+ sub->minapplydelay = subform->subminapplydelay;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
sub->enabled = subform->subenabled;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 8608e3fa5b..317c2010cb 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1299,9 +1299,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 464db6d247..629366f91a 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int32 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -100,7 +102,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int32 defGetMinApplyDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ opts->min_apply_delay = defGetMinApplyDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +417,31 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed. This is because we start applying the transaction stream as
+ * soon as the first change arrives without knowing the transaction's
+ * prepare/commit time. This means we cannot calculate the underlying
+ * network/decoding lag between publisher and subscriber, and so always
+ * waiting for the full 'min_apply_delay' period might include unnecessary
+ * delay.
+ *
+ * The other possibility was to apply the delay at the end of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0 && opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+
+ /*
+ * - translator: the first %s is a string of the form "parameter > 0"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
}
/*
@@ -560,7 +598,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -625,6 +664,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int32GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1054,7 +1094,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1138,18 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options for details of the reason.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
+ !IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable parallel streaming mode for subscription with %s",
+ "min_apply_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1163,24 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed.
+ */
+ if (opts.min_apply_delay > 0 &&
+ !IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s for subscription in %s mode",
+ "min_apply_delay", "streaming = parallel"));
+
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int32GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2195,3 +2265,45 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+/*
+ * Extract the min_apply_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_apply_delay
+ * accepts the same parameter format as recovery_min_apply_delay.
+ */
+static int32
+defGetMinApplyDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse given string as parameter which has millisecond unit
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_apply_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check both the lower boundary for the valid min_apply_delay range and
+ * the upper boundary as the safeguard for some platforms where INT_MAX is
+ * wider than int32 respectively. Although parse_int() has confirmed that
+ * the result is less than or equal to INT_MAX, the value will be stored
+ * in a catalog column of int32.
+ */
+ if (result < 0 || result > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_apply_delay",
+ 0, PG_INT32_MAX)));
+
+ return result;
+}
diff --git a/src/backend/replication/logical/applyparallelworker.c b/src/backend/replication/logical/applyparallelworker.c
index e670ec617a..d29e2dd7b9 100644
--- a/src/backend/replication/logical/applyparallelworker.c
+++ b/src/backend/replication/logical/applyparallelworker.c
@@ -704,7 +704,8 @@ pa_process_spooled_messages_if_required(void)
{
apply_spooled_messages(&MyParallelShared->fileset,
MyParallelShared->xid,
- InvalidXLogRecPtr);
+ InvalidXLogRecPtr,
+ 0);
pa_set_fileset_state(MyParallelShared, FS_EMPTY);
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index cfb2ab6248..9b3de65de7 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -319,6 +319,17 @@ static List *on_commit_wakeup_workers_subids = NIL;
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
+/*
+ * In order to avoid walsender timeout for time-delayed logical replication the
+ * apply worker keeps sending feedback messages during the delay period.
+ * Meanwhile, the feature delays the apply before the start of the
+ * transaction and thus we don't write WAL records for the suspended changes
+ * during the wait. When the apply worker sends a feedback message during the
+ * delay, we should not overwrite positions of the flushed and apply LSN by the
+ * last received latest LSN. See send_feedback() for details.
+ */
+static XLogRecPtr last_received = InvalidXLogRecPtr;
+
/* fields valid only when processing streamed transaction */
static bool in_streamed_transaction = false;
@@ -389,7 +400,8 @@ static void stream_write_change(char action, StringInfo s);
static void stream_open_and_write_change(TransactionId xid, char action, StringInfo s);
static void stream_close_file(void);
-static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply);
+static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply,
+ bool has_unprocessed_change);
static void DisableSubscriptionAndExit(void);
@@ -999,6 +1011,109 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When min_apply_delay parameter is set on the subscriber, we wait long enough
+ * to make sure a transaction is applied at least that period behind the
+ * publisher.
+ *
+ * While physical replication applies the delay at commit time, this feature
+ * applies the delay to the next transaction before that transaction starts.
+ * This is mainly because keeping a transaction that has performed write
+ * operations open for a long time causes problems such as bloat and
+ * long-held locks.
+ *
+ * The min_apply_delay parameter will take effect only after all tables are in
+ * READY state.
+ *
+ * xid is the transaction id where we apply the delay.
+ *
+ * finish_ts is the commit/prepare time of both regular (non-streamed) and
+ * streamed transactions. Unlike the regular (non-streamed) cases, the delay
+ * is applied in a STREAM COMMIT/STREAM PREPARE message for streamed
+ * transactions. The STREAM START message does not contain a commit/prepare
+ * time (it will be available when the in-progress transaction finishes).
+ * Hence, it's not appropriate to apply a delay at the STREAM START time.
+ */
+static void
+maybe_apply_delay(TransactionId xid, TimestampTz finish_ts)
+{
+ Assert(finish_ts > 0);
+
+ /* Nothing to do if no delay set */
+ if (!MySubscription->minapplydelay)
+ return;
+
+ /*
+ * The min_apply_delay parameter is ignored until all tablesync workers
+ * have reached READY state. This is because if we allowed the delay
+ * during the catchup phase, then once we reached the limit of tablesync
+ * workers it would impose a delay for each subsequent worker. That would
+ * cause initial table synchronization completion to take a long time.
+ */
+ if (!AllTablesyncsReady())
+ return;
+
+ /* Apply the delay by the latch mechanism */
+ while (true)
+ {
+ TimestampTz delayUntil;
+ long diffms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* This might change wal_receiver_status_interval */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /*
+ * Before calculating the time duration, reload the catalog if needed.
+ */
+ if (!in_remote_transaction && !in_streamed_transaction)
+ {
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+ }
+
+ delayUntil = TimestampTzPlusMilliseconds(finish_ts, MySubscription->minapplydelay);
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delayUntil);
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "time-delayed replication for txid %u, min_apply_delay = %d ms, remaining wait time: %ld ms",
+ xid, MySubscription->minapplydelay, diffms);
+
+ /*
+ * Call send_feedback() to prevent the publisher from timing out
+ * during the delay, provided wal_receiver_status_interval is enabled
+ * (greater than zero).
+ */
+ if (wal_receiver_status_interval > 0 &&
+ diffms > wal_receiver_status_interval * 1000L)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ wal_receiver_status_interval * 1000L,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(last_received, true, false, true);
+ }
+ else
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ }
+}
+
/*
* Handle BEGIN message.
*/
@@ -1013,6 +1128,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ maybe_apply_delay(begin_data.xid, begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -1070,6 +1188,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ maybe_apply_delay(begin_data.xid, begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1317,7 +1438,8 @@ apply_handle_stream_prepare(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset,
- prepare_data.xid, prepare_data.prepare_lsn);
+ prepare_data.xid, prepare_data.prepare_lsn,
+ prepare_data.prepare_time);
/* Mark the transaction as prepared. */
apply_handle_prepare_internal(&prepare_data);
@@ -2011,10 +2133,13 @@ ensure_last_message(FileSet *stream_fileset, TransactionId xid, int fileno,
/*
* Common spoolfile processing.
+ *
+ * The commit/prepare time (finish_ts) for streamed transactions is required
+ * for time-delayed logical replication.
*/
void
apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn)
+ XLogRecPtr lsn, TimestampTz finish_ts)
{
StringInfoData s2;
int nchanges;
@@ -2025,6 +2150,10 @@ apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
int fileno;
off_t offset;
+ /* Should we delay the current transaction? */
+ if (finish_ts)
+ maybe_apply_delay(xid, finish_ts);
+
if (!am_parallel_apply_worker())
maybe_start_skipping_changes(lsn);
@@ -2174,7 +2303,7 @@ apply_handle_stream_commit(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset, xid,
- commit_data.commit_lsn);
+ commit_data.commit_lsn, commit_data.committime);
apply_handle_commit_internal(&commit_data);
@@ -3447,7 +3576,7 @@ UpdateWorkerStats(XLogRecPtr last_lsn, TimestampTz send_time, bool reply)
* Apply main loop.
*/
static void
-LogicalRepApplyLoop(XLogRecPtr last_received)
+LogicalRepApplyLoop(void)
{
TimestampTz last_recv_timestamp = GetCurrentTimestamp();
bool ping_sent = false;
@@ -3568,7 +3697,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
if (last_received < end_lsn)
last_received = end_lsn;
- send_feedback(last_received, reply_requested, false);
+ send_feedback(last_received, reply_requested, false, false);
UpdateWorkerStats(last_received, timestamp, true);
}
/* other message types are purposefully ignored */
@@ -3581,7 +3710,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
/* confirm all writes so far */
- send_feedback(last_received, false, false);
+ send_feedback(last_received, false, false, false);
if (!in_remote_transaction && !in_streamed_transaction)
{
@@ -3678,7 +3807,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
}
- send_feedback(last_received, requestReply, requestReply);
+ send_feedback(last_received, requestReply, requestReply, false);
/*
* Force reporting to ensure long idle periods don't lead to
@@ -3708,7 +3837,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
* to send a response to avoid timeouts.
*/
static void
-send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
+send_feedback(XLogRecPtr recvpos, bool force, bool requestReply, bool has_unprocessed_change)
{
static StringInfo reply_message = NULL;
static TimestampTz send_time = 0;
@@ -3738,8 +3867,14 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * If the logical replication subscription has unprocessed changes, do not
+ * tell the publisher that the latest received LSN has already been applied
+ * and flushed; otherwise the publisher would make a wrong assumption about
+ * logical replication progress. Instead, just send a feedback message to
+ * avoid a replication timeout during the delay.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && !has_unprocessed_change)
flushpos = writepos = recvpos;
if (writepos < last_writepos)
@@ -3776,8 +3911,9 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
pq_sendint64(reply_message, now); /* sendTime */
pq_sendbyte(reply_message, requestReply); /* replyRequested */
- elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X",
+ elog(DEBUG2, "sending feedback (force %d, has_unprocessed_change %s) to recv %X/%X, write %X/%X, flush %X/%X",
force,
+ has_unprocessed_change ? "yes" : "no",
LSN_FORMAT_ARGS(recvpos),
LSN_FORMAT_ARGS(writepos),
LSN_FORMAT_ARGS(flushpos));
@@ -4367,11 +4503,11 @@ start_table_sync(XLogRecPtr *origin_startpos, char **myslotname)
* of system resource error and are not repeatable.
*/
static void
-start_apply(XLogRecPtr origin_startpos)
+start_apply(void)
{
PG_TRY();
{
- LogicalRepApplyLoop(origin_startpos);
+ LogicalRepApplyLoop();
}
PG_CATCH();
{
@@ -4661,7 +4797,8 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- start_apply(origin_startpos);
+ last_received = origin_startpos;
+ start_apply();
proc_exit(0);
}
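
For readers following the worker.c changes above, here is a minimal standalone sketch of the two decisions the patch makes: how long each pass of the wait loop in maybe_apply_delay() sleeps, and which flush position send_feedback() reports while a change is still being held back. It is not code from the patch. The names WaitStep, next_wait_step and reported_flushpos are hypothetical, timestamps are simplified to plain millisecond counters instead of TimestampTz, and LSNs are plain uint64_t values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: what one pass of the wait loop decides. */
typedef struct WaitStep
{
	int64_t		wait_ms;		/* time to sleep; 0 means "apply now" */
	bool		send_feedback;	/* send a feedback message after sleeping? */
} WaitStep;

/*
 * Hypothetical helper mirroring one iteration of maybe_apply_delay().
 * All times are plain milliseconds; status_interval_s is in seconds,
 * like wal_receiver_status_interval.
 */
static WaitStep
next_wait_step(int64_t finish_ms, int32_t min_apply_delay_ms,
			   int64_t now_ms, int status_interval_s)
{
	WaitStep	step = {0, false};
	int64_t		diffms;

	assert(finish_ms > 0);

	/* Nothing to do if no delay is set. */
	if (min_apply_delay_ms <= 0)
		return step;

	/* Remaining time until finish time + delay. */
	diffms = (finish_ms + min_apply_delay_ms) - now_ms;
	if (diffms <= 0)
		return step;

	/*
	 * If more than one status interval remains, sleep for one interval
	 * and then send feedback; otherwise sleep for the remainder.
	 */
	if (status_interval_s > 0 &&
		diffms > (int64_t) status_interval_s * 1000)
	{
		step.wait_ms = (int64_t) status_interval_s * 1000;
		step.send_feedback = true;
	}
	else
		step.wait_ms = diffms;

	return step;
}

/*
 * Hypothetical helper mirroring the guarded assignment in send_feedback():
 * the flush position may jump to recvpos only when nothing is pending
 * and no change is being held back by the delay.
 */
static uint64_t
reported_flushpos(uint64_t recvpos, uint64_t flushpos,
				  bool have_pending_txes, bool has_unprocessed_change)
{
	if (!have_pending_txes && !has_unprocessed_change)
		return recvpos;
	return flushpos;
}
```

With a 30 s delay and a 10 s status interval, a transaction that finished 1 s ago would be handled as a series of 10 s sleeps, each followed by a feedback message, and then one final shorter sleep without feedback. During those feedback messages the reported flush position stays where it was, so the publisher does not treat the delayed transaction as already applied.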
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..1e87f0124e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4546,9 +4547,13 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n"
+ " 0 AS subminapplydelay\n",
+ LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4581,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4612,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminapplydelay =
+ atoi(PQgetvalue(res, i, i_subminapplydelay));
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4687,6 +4695,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '%d ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e7cbd8d7ed..b8831c3ed3 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int subminapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..81d4607a1c 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min apply delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e1882eaea..e8b9a43a47 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..b8fe47ef6e 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -74,6 +74,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
+ int32 subminapplydelay; /* Replication apply delay (ms) */
+
bool subenabled; /* True if the subscription is enabled (the
* worker should be running) */
@@ -120,6 +122,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int32 minapplydelay; /* Replication apply delay (ms) */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
bool enabled; /* Indicates if the subscription is enabled */
diff --git a/src/include/replication/worker_internal.h b/src/include/replication/worker_internal.h
index dc87a4edd1..3dc09d1a4c 100644
--- a/src/include/replication/worker_internal.h
+++ b/src/include/replication/worker_internal.h
@@ -255,7 +255,7 @@ extern void stream_stop_internal(TransactionId xid);
/* Common streaming function to apply all the spooled messages */
extern void apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn);
+ XLogRecPtr lsn, TimestampTz finish_ts);
extern void apply_dispatch(StringInfo s);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 3f99b14394..5ccce39986 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -143,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -163,10 +163,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -175,10 +175,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -210,10 +210,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -247,19 +247,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -271,27 +271,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -306,10 +306,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -324,10 +324,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -363,10 +363,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -375,10 +375,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,10 +388,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -404,20 +404,57 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid value for parameter "min_apply_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_apply_delay" (0 .. 2147483647)
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+ERROR: min_apply_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot enable parallel streaming mode for subscription with min_apply_delay
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ERROR: cannot enable min_apply_delay for subscription in streaming = parallel mode
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7281f5fee2..7317b140f5 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -286,6 +286,30 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+\dRs+
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/meson.build b/src/test/subscription/meson.build
index 3db0fdfd96..a186876eb4 100644
--- a/src/test/subscription/meson.build
+++ b/src/test/subscription/meson.build
@@ -38,6 +38,7 @@ tests += {
't/029_on_error.pl',
't/030_origin.pl',
't/031_column_list.pl',
+ 't/032_apply_delay.pl',
't/100_bugs.pl',
],
},
diff --git a/src/test/subscription/t/032_apply_delay.pl b/src/test/subscription/t/032_apply_delay.pl
new file mode 100644
index 0000000000..178000830c
--- /dev/null
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,134 @@
+
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Checks for min_apply_delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Confirm the time-delayed replication has been effective from the server log
+# message where the apply worker emits for applying delay. Moreover, verify
+# that the current worker's remaining wait time is sufficiently bigger than the
+# expected value, in order to check any update of the min_apply_delay.
+sub check_apply_delay_log
+{
+ my ($node_subscriber, $offset, $expected) = @_;
+
+ # Get the remaining wait time from the server log
+ my $contents = slurp_file($node_subscriber->logfile, $offset);
+ $contents =~
+ qr/time-delayed replication for txid (\d+), min_apply_delay = (\d+) ms, remaining wait time: (\d+) ms/,
+ or die "could not get the apply worker wait time";
+ my $logged_delay = $3;
+
+ # Is it larger than expected?
+ cmp_ok($logged_delay, '>', $expected,
+ "The apply worker wait time has expected duration");
+}
+
+# Initialize publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# And some content
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_int AS SELECT generate_series(1, 10) AS a");
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE tab_int;");
+
+# Create subscriber
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+my $delay = 3;
+$node_subscriber->init;
+$node_subscriber->append_conf(
+ 'postgresql.conf', qq(
+log_min_messages = debug2
+));
+$node_subscriber->start;
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_int AS SELECT generate_series(1, 10) AS a");
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (copy_data = off, min_apply_delay = '${delay}s', streaming = 'on')"
+);
+
+# Make new content on publisher and check its presence in subscriber depending
+# on the delay applied above. Before doing the insertion, get the
+# current timestamp that will be used as a comparison base. Even on slow
+# machines, this allows to have a predictable behavior when comparing the
+# delay between data insertion moment on publisher and replay time on subscriber.
+my $publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_int VALUES (generate_series(11, 20))");
+
+# Now wait for replay to complete on publisher. We're done waiting when the
+# subscriber has applyed up to the publisher LSN.
+$node_publisher->wait_for_catchup($appname);
+
+# This test is successful if and only if the LSN has been applied with at least
+# the configured apply delay.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for non-streaming transaction"
+);
+
+# For better coverage, setup for streaming case
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_replication_mode = immediate');
+$node_publisher->reload;
+
+# Run a query to make sure that the reload has taken effect.
+$node_publisher->safe_psql('postgres', q{SELECT 1});
+
+# Check log starting now for logical replication apply delay
+$publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_int VALUES (generate_series(21, 30))");
+$node_publisher->wait_for_catchup($appname);
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for streaming transaction"
+);
+
+# From here, confirm disabling the subscription by ALTER SUBSCRIPTION DISABLE
+# did not cause the delayed transaction to be applied. This is unique to
+# time-delayed logical replication.
+
+# Execute ALTER SUBSCRIPTION to change the delayed time of the apply worker
+# (1 day 5 minutes). Note that the extra 5 minute is to account for any
+# decoding/network overhead.
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86700000)");
+
+# Check log starting now for logical replication apply delay
+my $offset = -s $node_subscriber->logfile;
+
+# New row to trigger apply delay
+$node_publisher->safe_psql('postgres', "INSERT INTO tab_int VALUES (0)");
+
+# Make sure the apply worker waits for more than 1 day
+check_apply_delay_log($node_subscriber, $offset, "86400000");
+
+# Disable subscription and the worker should die immediately
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub DISABLE;");
+
+# Wait until worker dies
+my $sub_query =
+ "SELECT count(1) = 0 FROM pg_stat_subscription WHERE subname = 'tap_sub' AND pid IS NOT NULL;";
+$node_subscriber->poll_query_until('postgres', $sub_query)
+ or die "Timed out while waiting for subscriber to die";
+
+# Confirm the record is not applied expectedly
+my $result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_int WHERE a = 0;");
+is($result, qq(0), "check the delayed transaction was not applied");
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
--
2.30.0
Hi,
On wangw.fnst@fujitsu.com Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Feb 3, 2023 at 3:12 PM wangw.fnst@fujitsu.com
<wangw.fnst@fujitsu.com> wrote:
Here is a comment:
1. The checks in function AlterSubscription
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options for details of the reason.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
+ if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay > 0) ||
+ (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0))
and
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed.
+ */
+ if (opts.min_apply_delay > 0)
+ if ((IsSet(opts.specified_opts, SUBOPT_STREAMING) && opts.streaming == LOGICALREP_STREAM_PARALLEL) ||
+ (!IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL))
I think the case where the options "min_apply_delay>0" and
"streaming=parallel"
are set at the same time seems to have been checked in the function
parse_subscription_options, how about simplifying these two
if-statements here to the following:
```
if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
    !IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) &&
    sub->minapplydelay > 0)
and
if (opts.min_apply_delay > 0 &&
    !IsSet(opts.specified_opts, SUBOPT_STREAMING) &&
    sub->stream == LOGICALREP_STREAM_PARALLEL)
```
Won't just checking if ((opts.streaming ==
LOGICALREP_STREAM_PARALLEL && sub->minapplydelay > 0) ||
(opts.min_apply_delay > 0 && sub->stream ==
LOGICALREP_STREAM_PARALLEL)) be sufficient in that case?
We need the !IsSet() checks. Without them, we would wrongly error out
when ALTER SUBSCRIPTION sets min_apply_delay = 0 and streaming = parallel
in the same command for a subscription whose current min_apply_delay
is greater than 0, for instance. In that case, the command passes (does not error out in)
parse_subscription_options()'s test for the combination of mutually exclusive options,
but then errors out because it matches the first condition above,
opts.streaming == parallel and sub->minapplydelay > 0.
Also, Wang-san's refactoring proposal makes sense. Adopted.
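To illustrate the !IsSet() point, here is a minimal standalone sketch in C. It is not the code in subscriptioncmds.c: the types and names (AlterOpts, SubState, StreamMode, OPT_*, is_set, and both conflict functions) are simplified stand-ins invented for this example, and it assumes that options not named in the command are left zeroed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real PostgreSQL definitions. */
typedef enum { STREAM_OFF, STREAM_ON, STREAM_PARALLEL } StreamMode;

#define OPT_STREAMING       (1u << 0)
#define OPT_MIN_APPLY_DELAY (1u << 1)

/* Options parsed from one ALTER SUBSCRIPTION ... SET (...) command. */
typedef struct
{
    uint32_t   specified;       /* bitmask of options named in the command */
    StreamMode streaming;       /* meaningful only if OPT_STREAMING is set */
    int32_t    min_apply_delay; /* in ms; meaningful only if OPT_MIN_APPLY_DELAY is set */
} AlterOpts;

/* Current (stored) state of the subscription. */
typedef struct
{
    StreamMode stream;
    int32_t    min_apply_delay; /* in ms */
} SubState;

static inline bool
is_set(uint32_t specified, uint32_t opt)
{
    return (specified & opt) != 0;
}

/* Simplified check: compares new options against stored state only. */
bool
conflicts_without_isset(const AlterOpts *opts, const SubState *sub)
{
    return (opts->streaming == STREAM_PARALLEL && sub->min_apply_delay > 0) ||
           (opts->min_apply_delay > 0 && sub->stream == STREAM_PARALLEL);
}

/*
 * Check with the IsSet guards: the stored value is consulted only when the
 * command does not override it. The case where both options are given in
 * the same command is assumed to be rejected earlier, during option parsing.
 */
bool
conflicts_with_isset(const AlterOpts *opts, const SubState *sub)
{
    if (opts->streaming == STREAM_PARALLEL &&
        !is_set(opts->specified, OPT_MIN_APPLY_DELAY) &&
        sub->min_apply_delay > 0)
        return true;

    if (opts->min_apply_delay > 0 &&
        !is_set(opts->specified, OPT_STREAMING) &&
        sub->stream == STREAM_PARALLEL)
        return true;

    return false;
}
```

For a subscription currently stored with min_apply_delay = 86400000 and a command that sets min_apply_delay = 0 and streaming = parallel together, conflicts_without_isset() reports a conflict while conflicts_with_isset() does not, which is the behaviour the regression test expects.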
Regarding the style of writing min_apply_delay > 0
(rather than just putting min_apply_delay in 'if' conditions) when checking parameters,
I agree with Amit-san, so I have kept them as they are in the latest patch v27.
Kindly have a look at v27 posted in [1].
[1]: /messages/by-id/TYCPR01MB83738F2BEF83DE525410E3ACEDD49@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
On Sat, Feb 4, 2023 at 5:04 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
...
Kindly have a look at the attached v27.
Here are some review comments for patch v27-0001.
======
src/test/subscription/t/032_apply_delay.pl
1.
+# Confirm the time-delayed replication has been effective from the server log
+# message where the apply worker emits for applying delay. Moreover, verify
+# that the current worker's remaining wait time is sufficiently bigger than the
+# expected value, in order to check any update of the min_apply_delay.
+sub check_apply_delay_log
~
"has been effective from the server log" --> "worked, by inspecting
the server log"
~~~
2.
+my $delay = 3;
Might be better to name this variable as 'min_apply_delay'.
~~~
3.
+# Now wait for replay to complete on publisher. We're done waiting when the
+# subscriber has applyed up to the publisher LSN.
+$node_publisher->wait_for_catchup($appname);
3a.
Something seemed wrong with the comment.
Was it meant to say more like? "The publisher waits for the
replication to complete".
Typo: "applyed"
~
3b.
Instead of doing this wait_for_catchup stuff why don't you just use a
synchronous pub/sub and then the publication will just block
internally like you require but without you having to block using test
code?
~~~
4.
+# Run a query to make sure that the reload has taken effect.
+$node_publisher->safe_psql('postgres', q{SELECT 1});
SUGGESTION (for the comment)
# Running a dummy query causes the config to be reloaded.
~~~
5.
+# Confirm the record is not applied expectedly
+my $result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_int WHERE a = 0;");
+is($result, qq(0), "check the delayed transaction was not applied");
"expectedly" ??
SUGGESTION (for comment)
# Confirm the record was not applied
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Monday, February 6, 2023 12:03 PM Peter Smith <smithpb2250@gmail.com> wrote:
On Sat, Feb 4, 2023 at 5:04 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
...
Kindly have a look at the attached v27.
Here are some review comments for patch v27-0001.
Thanks for checking !
======
src/test/subscription/t/032_apply_delay.pl
1.
+# Confirm the time-delayed replication has been effective from the server log
+# message where the apply worker emits for applying delay. Moreover, verify
+# that the current worker's remaining wait time is sufficiently bigger than the
+# expected value, in order to check any update of the min_apply_delay.
+sub check_apply_delay_log
~
"has been effective from the server log" --> "worked, by inspecting the server
log"
Sounds good to me. Also,
this part is unique to time-delayed logical replication,
so we can update it as we like. Fixed.
~~~
2.
+my $delay = 3;
Might be better to name this variable as 'min_apply_delay'.
I named this variable by following the test of recovery_min_apply_delay
(src/test/recovery/005_replay_delay.pl). So, this is aligned
with the test and I'd like to keep it as it is.
~~~
3.
+# Now wait for replay to complete on publisher. We're done waiting when the
+# subscriber has applyed up to the publisher LSN.
+$node_publisher->wait_for_catchup($appname);
3a.
Something seemed wrong with the comment.
Was it meant to say more like? "The publisher waits for the replication to
complete".
Typo: "applyed"
Your wording looks better than mine. Fixed.
~
3b.
Instead of doing this wait_for_catchup stuff why don't you just use a
synchronous pub/sub and then the publication will just block internally like
you require but without you having to block using test code?
This follows the style of 005_replay_delay.pl, so it is aligned with that test as well.
So, I'd like to keep the current way of comparing times as it is.
Even if we could omit wait_for_catchup(), we would need new code
for synchronous replication, and that would make the min_apply_delay tests
diverge further from the corresponding physical replication test. Note also that if we used
synchronous mode, we would need to turn it off for the last
ALTER SUBSCRIPTION DISABLE test case, which sets min_apply_delay to 1 day 5 min
and inserts one record after that. This would make the tests confusing.
~~~
4.
+# Run a query to make sure that the reload has taken effect.
+$node_publisher->safe_psql('postgres', q{SELECT 1});
SUGGESTION (for the comment)
# Running a dummy query causes the config to be reloaded.
Fixed.
~~~
5.
+# Confirm the record is not applied expectedly
+my $result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_int WHERE a = 0;");
+is($result, qq(0), "check the delayed transaction was not applied");
"expectedly" ??
SUGGESTION (for comment)
# Confirm the record was not applied
Fixed.
Best Regards,
Takamichi Osumi
Attachments:
v28-0001-Time-delayed-logical-replication-subscriber.patch (application/octet-stream)
From 83d9c74f1c35e49d31a25b21ca69be6fb146cee0 Mon Sep 17 00:00:00 2001
From: Takamichi Osumi <osumi.takamichi@fujitsu.com>
Date: Mon, 6 Feb 2023 06:33:55 +0000
Subject: [PATCH v28] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_apply_delay'.
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction apply for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs before we start to apply the transaction on the
subscriber. The main reason is to avoid keeping a transaction open for
a long time. Regular and prepared transactions are covered. Streamed
transactions are also covered.
The combination of parallel streaming mode and min_apply_delay is not
allowed. This is because we start applying the transaction stream as
soon as the first change arrives without knowing the transaction's
prepare/commit time. This means we cannot calculate the underlying
network/decoding lag between publisher and subscriber, and so always
waiting for the full 'min_apply_delay' period might include
unnecessary delay.
The other possibility was to apply the delay at the end of the parallel
apply transaction but that would cause issues related to resource
bloat and locks being held for a long time.
Note that this feature doesn't interact with skip transaction feature.
The skip transaction feature applies to one transaction with a specific LSN.
So, even if the skipped transaction and non-skipped transaction come
consecutively in a very short time, regardless of the order of which comes
first, the time-delayed feature gets balanced by delayed application
for other transactions before and after the skipped transaction.
Author: Euler Taveira, Takamichi Osumi, Kuroda Hayato
Reviewed-by: Amit Kapila, Peter Smith, Vignesh C, Shveta Malik,
Kyotaro Horiguchi, Shi Yu, Wang Wei, Dilip Kumar, Melih Mutlu
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/config.sgml | 12 ++
doc/src/sgml/glossary.sgml | 14 ++
doc/src/sgml/logical-replication.sgml | 6 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 49 ++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 118 +++++++++++-
.../replication/logical/applyparallelworker.c | 3 +-
src/backend/replication/logical/worker.c | 165 ++++++++++++++--
src/bin/pg_dump/pg_dump.c | 15 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/replication/worker_internal.h | 2 +-
src/test/regress/expected/subscription.out | 181 +++++++++++-------
src/test/regress/sql/subscription.sql | 24 +++
src/test/subscription/meson.build | 1 +
src/test/subscription/t/032_apply_delay.pl | 133 +++++++++++++
21 files changed, 658 insertions(+), 104 deletions(-)
create mode 100644 src/test/subscription/t/032_apply_delay.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..925a6ebb12 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7873,6 +7873,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum amount of time to delay applying changes, in milliseconds.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d190be1925..626a8b5bd0 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4787,6 +4787,18 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
the <filename>postgresql.conf</filename> file or on the server
command line.
</para>
+ <para>
+ For time-delayed logical replication, the apply worker sends a feedback
+ message to the publisher every
+ <varname>wal_receiver_status_interval</varname> milliseconds. Make sure
+ to set <varname>wal_receiver_status_interval</varname> less than the
+ <varname>wal_sender_timeout</varname> on the publisher, otherwise, the
+ <literal>walsender</literal> will repeatedly terminate due to timeout
+ errors. Note that if <varname>wal_receiver_status_interval</varname> is
+ set to zero, the apply worker sends no feedback messages during the
+ <literal>min_apply_delay</literal> period. Refer to
+ <xref linkend="sql-createsubscription"/> for more information.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 7c01a541fe..6ed6fa5853 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1729,6 +1729,20 @@
</glossdef>
</glossentry>
+ <glossentry id="glossary-time-delayed-replication">
+ <glossterm>Time-delayed replication</glossterm>
+ <glossdef>
+ <para>
+ Replication setup that applies time-delayed copy of the data.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="guc-recovery-min-apply-delay"/> for physical replication
+ and <xref linkend="sql-createsubscription"/> for logical replication.
+ </para>
+ </glossdef>
+ </glossentry>
+
<glossentry id="glossary-toast">
<glossterm>TOAST</glossterm>
<glossdef>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 1bd5660c87..6bd5f61e2b 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,12 @@
target table.
</para>
+ <para>
+ A subscription can delay the application of changes by specifying the
+ <literal>min_apply_delay</literal> subscription parameter. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 964fcbb8ff..8b7eb28e54 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 51c45f17c7..1b4b8390af 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,49 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. This
+ parameter allows the user to delay the application of changes by a
+ given time period. If the value is specified without units, it is
+ taken as milliseconds. The default is zero (no delay). See
+ <xref linkend="config-setting-names-values"/> for details on the
+ available time units.
+ </para>
+ <para>
+ Any delay becomes effective only after all initial table
+ synchronization has finished and occurs before each transaction starts
+ to get applied on the subscriber. The delay is calculated as the
+ difference between the WAL timestamp as written on the publisher and
+ the current time on the subscriber. Any overhead of time spent in
+ logical decoding and in transferring the transaction may reduce the
+ actual wait time. It is also possible that the overhead already
+ exceeds the requested <literal>min_apply_delay</literal> value, in
+ which case no delay is applied. If the system clocks on the publisher and
+ subscriber are not synchronized, changes may be applied
+ earlier than expected, but this is not a major issue because this
+ parameter is typically much larger than the time deviations between
+ servers. Note that if this parameter is set to a long delay, the
+ replication will stop if the replication slot falls behind the current
+ LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
+ <warning>
+ <para>
+ Delaying replication means there is a much longer time between
+ making a change on the publisher and that change being committed
+ on the subscriber. This can impact the performance of synchronous
+ replication. See the <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -420,6 +462,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_apply_delay</literal> parameter is not allowed when
+ streaming in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
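For reviewers, the delay arithmetic described in the create_subscription.sgml text above can be sketched in isolation. This is only an illustration, not code from the patch: remaining_delay_ms() is a hypothetical helper, and it assumes all times are plain millisecond counts rather than TimestampTz values. It shows that decoding and transfer overhead reduces the actual wait, and that no wait happens once the overhead exceeds min_apply_delay.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch. All arguments are in
 * milliseconds: commit_ts_ms is the publisher-side commit time, now_ms is
 * the subscriber's current time.
 */
static int64_t
remaining_delay_ms(int64_t commit_ts_ms, int64_t now_ms,
                   int32_t min_apply_delay_ms)
{
    int64_t delay_until;
    int64_t diff;

    assert(min_apply_delay_ms >= 0);

    /* The transaction may be applied no earlier than this point in time. */
    delay_until = commit_ts_ms + min_apply_delay_ms;
    diff = delay_until - now_ms;

    /* If the overhead already exceeds the requested delay, do not wait. */
    return diff > 0 ? diff : 0;
}
```

With a 500 ms delay and 200 ms already spent on decoding and transfer, the helper returns the 300 ms still to wait. A subscriber clock that runs ahead of the publisher makes now_ms larger, which is why unsynchronized clocks can shorten the effective delay.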
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..c767cc1c3a 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -64,6 +64,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->oid = subid;
sub->dbid = subform->subdbid;
sub->skiplsn = subform->subskiplsn;
+ sub->minapplydelay = subform->subminapplydelay;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
sub->enabled = subform->subenabled;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 8608e3fa5b..317c2010cb 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1299,9 +1299,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 464db6d247..629366f91a 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int32 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -100,7 +102,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int32 defGetMinApplyDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ opts->min_apply_delay = defGetMinApplyDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +417,31 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed. This is because we start applying the transaction stream as
+ * soon as the first change arrives without knowing the transaction's
+ * prepare/commit time. This means we cannot calculate the underlying
+ * network/decoding lag between publisher and subscriber, and so always
+ * waiting for the full 'min_apply_delay' period might include unnecessary
+ * delay.
+ *
+ * The other possibility was to apply the delay at the end of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0 && opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+
+ /*
+ * - translator: the first %s is a string of the form "parameter > 0"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
}
/*
@@ -560,7 +598,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -625,6 +664,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int32GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1054,7 +1094,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1138,18 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options for details of the reason.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
+ !IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable parallel streaming mode for subscription with %s",
+ "min_apply_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1163,24 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed.
+ */
+ if (opts.min_apply_delay > 0 &&
+ !IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s for subscription in %s mode",
+ "min_apply_delay", "streaming = parallel"));
+
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int32GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2195,3 +2265,45 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+/*
+ * Extract the min_apply_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_apply_delay
+ * accepts the same parameter format as recovery_min_apply_delay.
+ */
+static int32
+defGetMinApplyDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse the given string as a parameter whose unit is milliseconds.
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_apply_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check the lower boundary of the valid min_apply_delay range, and also
+ * the upper boundary as a safeguard for platforms where int is wider
+ * than int32. Although parse_int() has confirmed that the result is
+ * less than or equal to INT_MAX, the value will be stored in a catalog
+ * column of type int32.
+ */
+ if (result < 0 || result > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_apply_delay",
+ 0, PG_INT32_MAX)));
+
+ return result;
+}
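The two validation rules in the subscriptioncmds.c changes above (the range check in defGetMinApplyDelay() and the exclusion of parallel streaming) can be restated as a small standalone sketch. The names StreamMode, delay_in_range() and options_compatible() are made up for this illustration and do not exist in the tree; the real code reports the failures through ereport() instead of returning a boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the LOGICALREP_STREAM_* values. */
typedef enum
{
    STREAM_OFF,
    STREAM_ON,
    STREAM_PARALLEL
} StreamMode;

/* The delay is stored in an int32 catalog column, in milliseconds. */
static bool
delay_in_range(long long delay_ms)
{
    return delay_ms >= 0 && delay_ms <= INT32_MAX;
}

/* A non-zero delay cannot be combined with streaming = parallel. */
static bool
options_compatible(int32_t min_apply_delay_ms, StreamMode streaming)
{
    assert(min_apply_delay_ms >= 0);

    return !(min_apply_delay_ms > 0 && streaming == STREAM_PARALLEL);
}
```

A delay of zero stays compatible with parallel streaming, which matches the "min_apply_delay > 0" wording in the error message.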
diff --git a/src/backend/replication/logical/applyparallelworker.c b/src/backend/replication/logical/applyparallelworker.c
index e670ec617a..d29e2dd7b9 100644
--- a/src/backend/replication/logical/applyparallelworker.c
+++ b/src/backend/replication/logical/applyparallelworker.c
@@ -704,7 +704,8 @@ pa_process_spooled_messages_if_required(void)
{
apply_spooled_messages(&MyParallelShared->fileset,
MyParallelShared->xid,
- InvalidXLogRecPtr);
+ InvalidXLogRecPtr,
+ 0);
pa_set_fileset_state(MyParallelShared, FS_EMPTY);
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index cfb2ab6248..9b3de65de7 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -319,6 +319,17 @@ static List *on_commit_wakeup_workers_subids = NIL;
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
+/*
+ * In order to avoid walsender timeout for time-delayed logical replication the
+ * apply worker keeps sending feedback messages during the delay period.
+ * Meanwhile, the feature delays the apply before the start of the
+ * transaction and thus we don't write WAL records for the suspended changes
+ * during the wait. When the apply worker sends a feedback message during the
+ * delay, we should not overwrite the flushed and applied LSN positions with
+ * the last received LSN. See send_feedback() for details.
+ */
+static XLogRecPtr last_received = InvalidXLogRecPtr;
+
/* fields valid only when processing streamed transaction */
static bool in_streamed_transaction = false;
@@ -389,7 +400,8 @@ static void stream_write_change(char action, StringInfo s);
static void stream_open_and_write_change(TransactionId xid, char action, StringInfo s);
static void stream_close_file(void);
-static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply);
+static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply,
+ bool has_unprocessed_change);
static void DisableSubscriptionAndExit(void);
@@ -999,6 +1011,109 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When min_apply_delay parameter is set on the subscriber, we wait long enough
+ * to make sure a transaction is applied at least that period behind the
+ * publisher.
+ *
+ * While physical replication applies the delay at commit time, this
+ * feature applies the delay before the subscriber starts applying the
+ * transaction. This is mainly because keeping a transaction that conducted
+ * write operations open for a long time results in some issues such as bloat
+ * and locks.
+ *
+ * The min_apply_delay parameter will take effect only after all tables are in
+ * READY state.
+ *
+ * xid is the transaction id where we apply the delay.
+ *
+ * finish_ts is the commit/prepare time of both regular (non-streamed) and
+ * streamed transactions. Unlike the regular (non-streamed) cases, the delay
+ * is applied in a STREAM COMMIT/STREAM PREPARE message for streamed
+ * transactions. The STREAM START message does not contain a commit/prepare
+ * time (it will be available when the in-progress transaction finishes).
+ * Hence, it's not appropriate to apply a delay at the STREAM START time.
+ */
+static void
+maybe_apply_delay(TransactionId xid, TimestampTz finish_ts)
+{
+ Assert(finish_ts > 0);
+
+ /* Nothing to do if no delay set */
+ if (!MySubscription->minapplydelay)
+ return;
+
+ /*
+ * The min_apply_delay parameter is ignored until all tablesync workers
+ * have reached READY state. This is because if we allowed the delay
+ * during the catchup phase, then once we reached the limit of tablesync
+ * workers it would impose a delay for each subsequent worker. That would
+ * cause initial table synchronization completion to take a long time.
+ */
+ if (!AllTablesyncsReady())
+ return;
+
+ /* Apply the delay by the latch mechanism */
+ while (true)
+ {
+ TimestampTz delayUntil;
+ long diffms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* This might change wal_receiver_status_interval */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /*
+ * Before calculating the time duration, reload the catalog if needed.
+ */
+ if (!in_remote_transaction && !in_streamed_transaction)
+ {
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+ }
+
+ delayUntil = TimestampTzPlusMilliseconds(finish_ts, MySubscription->minapplydelay);
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delayUntil);
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "time-delayed replication for txid %u, min_apply_delay = %d ms, remaining wait time: %ld ms",
+ xid, MySubscription->minapplydelay, diffms);
+
+ /*
+ * Call send_feedback() to prevent the publisher from exiting by
+ * timeout during the delay, when wal_receiver_status_interval is
+ * non-zero.
+ */
+ if (wal_receiver_status_interval > 0 &&
+ diffms > wal_receiver_status_interval * 1000L)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ wal_receiver_status_interval * 1000L,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(last_received, true, false, true);
+ }
+ else
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ }
+}
+
/*
* Handle BEGIN message.
*/
@@ -1013,6 +1128,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ maybe_apply_delay(begin_data.xid, begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -1070,6 +1188,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ maybe_apply_delay(begin_data.xid, begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1317,7 +1438,8 @@ apply_handle_stream_prepare(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset,
- prepare_data.xid, prepare_data.prepare_lsn);
+ prepare_data.xid, prepare_data.prepare_lsn,
+ prepare_data.prepare_time);
/* Mark the transaction as prepared. */
apply_handle_prepare_internal(&prepare_data);
@@ -2011,10 +2133,13 @@ ensure_last_message(FileSet *stream_fileset, TransactionId xid, int fileno,
/*
* Common spoolfile processing.
+ *
+ * The commit/prepare time (finish_ts) for streamed transactions is required
+ * for time-delayed logical replication.
*/
void
apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn)
+ XLogRecPtr lsn, TimestampTz finish_ts)
{
StringInfoData s2;
int nchanges;
@@ -2025,6 +2150,10 @@ apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
int fileno;
off_t offset;
+ /* Should we delay the current transaction? */
+ if (finish_ts)
+ maybe_apply_delay(xid, finish_ts);
+
if (!am_parallel_apply_worker())
maybe_start_skipping_changes(lsn);
@@ -2174,7 +2303,7 @@ apply_handle_stream_commit(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset, xid,
- commit_data.commit_lsn);
+ commit_data.commit_lsn, commit_data.committime);
apply_handle_commit_internal(&commit_data);
@@ -3447,7 +3576,7 @@ UpdateWorkerStats(XLogRecPtr last_lsn, TimestampTz send_time, bool reply)
* Apply main loop.
*/
static void
-LogicalRepApplyLoop(XLogRecPtr last_received)
+LogicalRepApplyLoop(void)
{
TimestampTz last_recv_timestamp = GetCurrentTimestamp();
bool ping_sent = false;
@@ -3568,7 +3697,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
if (last_received < end_lsn)
last_received = end_lsn;
- send_feedback(last_received, reply_requested, false);
+ send_feedback(last_received, reply_requested, false, false);
UpdateWorkerStats(last_received, timestamp, true);
}
/* other message types are purposefully ignored */
@@ -3581,7 +3710,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
/* confirm all writes so far */
- send_feedback(last_received, false, false);
+ send_feedback(last_received, false, false, false);
if (!in_remote_transaction && !in_streamed_transaction)
{
@@ -3678,7 +3807,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
}
- send_feedback(last_received, requestReply, requestReply);
+ send_feedback(last_received, requestReply, requestReply, false);
/*
* Force reporting to ensure long idle periods don't lead to
@@ -3708,7 +3837,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
* to send a response to avoid timeouts.
*/
static void
-send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
+send_feedback(XLogRecPtr recvpos, bool force, bool requestReply, bool has_unprocessed_change)
{
static StringInfo reply_message = NULL;
static TimestampTz send_time = 0;
@@ -3738,8 +3867,14 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * If the logical replication subscription has unprocessed changes then do
+ * not inform the publisher that the latest received LSN is already
+ * applied and flushed; otherwise, the publisher will make a wrong
+ * assumption about the logical replication progress. Instead, just send a
+ * feedback message to avoid a replication timeout during the delay.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && !has_unprocessed_change)
flushpos = writepos = recvpos;
if (writepos < last_writepos)
@@ -3776,8 +3911,9 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
pq_sendint64(reply_message, now); /* sendTime */
pq_sendbyte(reply_message, requestReply); /* replyRequested */
- elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X",
+ elog(DEBUG2, "sending feedback (force %d, has_unprocessed_change %s) to recv %X/%X, write %X/%X, flush %X/%X",
force,
+ has_unprocessed_change ? "yes" : "no",
LSN_FORMAT_ARGS(recvpos),
LSN_FORMAT_ARGS(writepos),
LSN_FORMAT_ARGS(flushpos));
@@ -4367,11 +4503,11 @@ start_table_sync(XLogRecPtr *origin_startpos, char **myslotname)
* of system resource error and are not repeatable.
*/
static void
-start_apply(XLogRecPtr origin_startpos)
+start_apply(void)
{
PG_TRY();
{
- LogicalRepApplyLoop(origin_startpos);
+ LogicalRepApplyLoop();
}
PG_CATCH();
{
@@ -4661,7 +4797,8 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- start_apply(origin_startpos);
+ last_received = origin_startpos;
+ start_apply();
proc_exit(0);
}
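To make the wait loop in maybe_apply_delay() easier to review, here is a minimal sketch of how each sleep is sized and when a feedback message follows it. next_wait_ms() is a hypothetical function written for this note; it assumes the remaining delay is given in milliseconds and wal_receiver_status_interval in seconds, as the "* 1000L" conversion in the patch implies.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not part of the patch. Returns the length of the next
 * sleep in milliseconds and reports through *send_feedback whether a
 * feedback message should be sent once that sleep ends.
 */
static long
next_wait_ms(long remaining_ms, int status_interval_s, bool *send_feedback)
{
    assert(remaining_ms > 0);

    if (status_interval_s > 0 &&
        remaining_ms > status_interval_s * 1000L)
    {
        /* Sleep one status interval, then keep the walsender alive. */
        *send_feedback = true;
        return status_interval_s * 1000L;
    }

    /* Final slice (or feedback disabled): sleep for the rest of the delay. */
    *send_feedback = false;
    return remaining_ms;
}
```

With 25 s left and a 10 s interval, the loop would sleep 10 s and send feedback, repeat once more, and then sleep the last 5 s without feedback. With the interval set to zero it sleeps the whole 25 s in one go, which is the case the documentation warns about with respect to wal_sender_timeout.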
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..1e87f0124e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4546,9 +4547,13 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n"
+ " 0 AS subminapplydelay\n",
+ LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4581,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4612,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminapplydelay =
+ atoi(PQgetvalue(res, i, i_subminapplydelay));
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4687,6 +4695,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '%d ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e7cbd8d7ed..b8831c3ed3 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int subminapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..81d4607a1c 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min apply delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e1882eaea..e8b9a43a47 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..b8fe47ef6e 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -74,6 +74,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
+ int32 subminapplydelay; /* Replication apply delay (ms) */
+
bool subenabled; /* True if the subscription is enabled (the
* worker should be running) */
@@ -120,6 +122,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int32 minapplydelay; /* Replication apply delay (ms) */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
bool enabled; /* Indicates if the subscription is enabled */
diff --git a/src/include/replication/worker_internal.h b/src/include/replication/worker_internal.h
index dc87a4edd1..3dc09d1a4c 100644
--- a/src/include/replication/worker_internal.h
+++ b/src/include/replication/worker_internal.h
@@ -255,7 +255,7 @@ extern void stream_stop_internal(TransactionId xid);
/* Common streaming function to apply all the spooled messages */
extern void apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn);
+ XLogRecPtr lsn, TimestampTz finish_ts);
extern void apply_dispatch(StringInfo s);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 3f99b14394..5ccce39986 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -143,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -163,10 +163,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -175,10 +175,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -210,10 +210,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -247,19 +247,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -271,27 +271,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -306,10 +306,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -324,10 +324,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -363,10 +363,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -375,10 +375,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,10 +388,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -404,20 +404,57 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid value for parameter "min_apply_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_apply_delay" (0 .. 2147483647)
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+ERROR: min_apply_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot enable parallel streaming mode for subscription with min_apply_delay
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ERROR: cannot enable min_apply_delay for subscription in streaming = parallel mode
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7281f5fee2..7317b140f5 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -286,6 +286,30 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+\dRs+
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/meson.build b/src/test/subscription/meson.build
index 3db0fdfd96..a186876eb4 100644
--- a/src/test/subscription/meson.build
+++ b/src/test/subscription/meson.build
@@ -38,6 +38,7 @@ tests += {
't/029_on_error.pl',
't/030_origin.pl',
't/031_column_list.pl',
+ 't/032_apply_delay.pl',
't/100_bugs.pl',
],
},
diff --git a/src/test/subscription/t/032_apply_delay.pl b/src/test/subscription/t/032_apply_delay.pl
new file mode 100644
index 0000000000..8accc1fefd
--- /dev/null
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,133 @@
+
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Checks for min_apply_delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Confirm that time-delayed replication worked by inspecting the server log
+# message that the apply worker emits when it delays applying a transaction.
+# Also verify that the worker's remaining wait time is larger than the
+# expected value, in order to detect any update of min_apply_delay.
+sub check_apply_delay_log
+{
+ my ($node_subscriber, $offset, $expected) = @_;
+
+ # Get the remaining wait time from the server log
+ my $contents = slurp_file($node_subscriber->logfile, $offset);
+ $contents =~
+ qr/time-delayed replication for txid (\d+), min_apply_delay = (\d+) ms, remaining wait time: (\d+) ms/,
+ or die "could not get the apply worker wait time";
+ my $logged_delay = $3;
+
+ # Is it larger than expected?
+ cmp_ok($logged_delay, '>', $expected,
+ "The apply worker wait time has expected duration");
+}
+
+# Initialize publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# Add some content
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_int AS SELECT generate_series(1, 10) AS a");
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE tab_int;");
+
+# Create subscriber
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+my $delay = 3;
+$node_subscriber->init;
+$node_subscriber->append_conf(
+ 'postgresql.conf', qq(
+log_min_messages = debug2
+));
+$node_subscriber->start;
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_int AS SELECT generate_series(1, 10) AS a");
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (copy_data = off, min_apply_delay = '${delay}s', streaming = 'on')"
+);
+
+# Insert new content on the publisher and check that it appears on the
+# subscriber only after the delay configured above. Before the insertion, get
+# the current timestamp to use as a comparison base. Even on slow machines,
+# this gives predictable behavior when comparing the delay between the
+# insertion on the publisher and the replay on the subscriber.
+my $publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_int VALUES (generate_series(11, 20))");
+
+# The publisher waits for the replication to complete
+$node_publisher->wait_for_catchup($appname);
+
+# This test is successful if and only if the LSN has been applied with at least
+# the configured apply delay.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for non-streaming transaction"
+);
+
+# For better coverage, set up the streaming case
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_replication_mode = immediate');
+$node_publisher->reload;
+
+# Running a dummy query causes the config to be reloaded.
+$node_publisher->safe_psql('postgres', q{SELECT 1});
+
+# Check log starting now for logical replication apply delay
+$publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_int VALUES (generate_series(21, 30))");
+$node_publisher->wait_for_catchup($appname);
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for streaming transaction"
+);
+
+# From here, confirm that disabling the subscription with ALTER SUBSCRIPTION
+# DISABLE does not cause the delayed transaction to be applied. This is unique
+# to time-delayed logical replication.
+
+# Execute ALTER SUBSCRIPTION to change the apply worker's delay to
+# 1 day 5 minutes. The extra 5 minutes account for any decoding/network
+# overhead.
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86700000)");
+
+# Check log starting now for logical replication apply delay
+my $offset = -s $node_subscriber->logfile;
+
+# New row to trigger apply delay
+$node_publisher->safe_psql('postgres', "INSERT INTO tab_int VALUES (0)");
+
+# Make sure the apply worker waits for more than 1 day
+check_apply_delay_log($node_subscriber, $offset, "86400000");
+
+# Disable subscription and the worker should die immediately
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub DISABLE;");
+
+# Wait until worker dies
+my $sub_query =
+ "SELECT count(1) = 0 FROM pg_stat_subscription WHERE subname = 'tap_sub' AND pid IS NOT NULL;";
+$node_subscriber->poll_query_until('postgres', $sub_query)
+ or die "Timed out while waiting for subscriber to die";
+
+# Confirm the record was not applied
+my $result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_int WHERE a = 0;");
+is($result, qq(0), "check the delayed transaction was not applied");
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
--
2.30.0
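
For readers following the regression tests in the patch above: they exercise three rules. A bare number is read as milliseconds, a value with a unit is converted to milliseconds and stored as an integer ('1 d' becomes 86400000), and the result must lie in 0 .. 2147483647. The TAP test also checks the remaining wait time that the apply worker logs. Below is a minimal C sketch of those rules. It is an illustration only, not the patch's code: delay_to_ms() and remaining_wait_ms() are hypothetical names, the unit list (ms, s, min, h, d) is an assumption, and the formula "remaining wait = delay minus time elapsed since commit" is inferred from the test comments rather than taken from the apply worker.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not PostgreSQL code): convert a delay such as
 * "123" or "1 d" to milliseconds. Returns false for unparsable input,
 * negative values, unknown units, or results above INT_MAX.
 */
bool
delay_to_ms(const char *text, int *result_ms)
{
    /* Assumed unit table; an empty unit means milliseconds. */
    static const struct { const char *unit; long long factor; } units[] = {
        {"", 1}, {"ms", 1}, {"s", 1000}, {"min", 60000},
        {"h", 3600000}, {"d", 86400000}
    };
    char       *end;
    long long   value;
    size_t      i;

    assert(text != NULL && result_ms != NULL);

    errno = 0;
    value = strtoll(text, &end, 10);
    if (end == text || errno == ERANGE)
        return false;           /* no digits at all, e.g. "foo" */

    while (*end == ' ')
        end++;                  /* allow "1 d" as well as "1d" */

    for (i = 0; i < sizeof(units) / sizeof(units[0]); i++)
    {
        if (strcmp(end, units[i].unit) == 0)
        {
            /* Check the range before multiplying to avoid overflow. */
            if (value < 0 || value > INT_MAX / units[i].factor)
                return false;
            *result_ms = (int) (value * units[i].factor);
            return true;
        }
    }
    return false;               /* unknown unit */
}

/*
 * Hypothetical helper: how long an apply worker would still have to wait,
 * assuming the delay is measured from the transaction's commit time.
 * All arguments are in milliseconds; never returns a negative value.
 */
long long
remaining_wait_ms(int min_apply_delay_ms, long long commit_time_ms,
                  long long now_ms)
{
    long long   remaining = commit_time_ms + min_apply_delay_ms - now_ms;

    return remaining > 0 ? remaining : 0;
}
```

With these definitions, delay_to_ms("1 d", &ms) stores 86400000, while "-1" and "foo" are rejected, which matches the outcomes the regression test expects. Likewise, a delay of 86700000 ms with one second already elapsed leaves 86699000 ms, still above the 86400000 ms threshold that the TAP test compares against.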
On Mon, Feb 6, 2023 at 12:36 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
I have made a couple of changes in the attached: (a) changed a few
error and LOG messages; (b) added/changed comments. See if these look
good to you, and if so, please include them in the next version.
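For anyone following along without the tree at hand, the ALTER SUBSCRIPTION checks that carry the reworded messages can be sketched roughly as below. This is a standalone approximation, not the code in subscriptioncmds.c: the types StreamMode, SubState and AlterOpts and the function alter_conflict() are hypothetical stand-ins for the catalog row and the parsed options, and the first message is shown with its %s placeholders already filled in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the catalog state and the parsed options. */
typedef enum { STREAM_OFF, STREAM_ON, STREAM_PARALLEL } StreamMode;

typedef struct
{
    StreamMode  stream;             /* current streaming mode */
    int         min_apply_delay;    /* current delay in ms */
} SubState;

typedef struct
{
    bool        streaming_specified;
    StreamMode  streaming;
    bool        delay_specified;
    int         min_apply_delay;    /* requested delay in ms */
} AlterOpts;

/* Returns the error text for a rejected combination, or NULL if allowed. */
const char *
alter_conflict(const SubState *sub, const AlterOpts *opts)
{
    bool        wants_parallel;
    bool        wants_delay;

    assert(sub != NULL && opts != NULL);

    wants_parallel = opts->streaming_specified &&
        opts->streaming == STREAM_PARALLEL;
    wants_delay = opts->delay_specified && opts->min_apply_delay > 0;

    /* Both options given in the same command. */
    if (wants_parallel && wants_delay)
        return "min_apply_delay > 0 and streaming = parallel are mutually exclusive options";

    /* Only streaming given, but the subscription already has a delay. */
    if (wants_parallel && !opts->delay_specified && sub->min_apply_delay > 0)
        return "cannot set parallel streaming mode for subscription with min_apply_delay";

    /* Only the delay given, but the subscription already streams in parallel. */
    if (wants_delay && !opts->streaming_specified &&
        sub->stream == STREAM_PARALLEL)
        return "cannot set min_apply_delay for subscription in parallel streaming mode";

    return NULL;
}
```

Note that, as in the patch, specifying streaming = parallel together with min_apply_delay = 0 in one command is accepted even if the subscription currently has a delay, since the delay is reset by the same command.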
--
With Regards,
Amit Kapila.
Attachments:
v28_amit_changes.1.patch (application/octet-stream)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 925a6ebb12..4f1876b20b 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7878,7 +7878,7 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<structfield>subminapplydelay</structfield> <type>int4</type>
</para>
<para>
- The minimum amount of time to delay applying changes, in milliseconds.
+ The minimum delay (ms) for applying changes.
</para></entry>
</row>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 629366f91a..8a713e99f6 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -420,12 +420,12 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
/*
* The combination of parallel streaming mode and min_apply_delay is not
- * allowed. This is because we start applying the transaction stream as
- * soon as the first change arrives without knowing the transaction's
- * prepare/commit time. This means we cannot calculate the underlying
- * network/decoding lag between publisher and subscriber, and so always
- * waiting for the full 'min_apply_delay' period might include unnecessary
- * delay.
+ * allowed. This is because in parallel streaming mode, we start applying
+ * the transaction stream as soon as the first change arrives without
+ * knowing the transaction's prepare/commit time. This means we cannot
+ * calculate the underlying network/decoding lag between publisher and
+ * subscriber, and so always waiting for the full 'min_apply_delay' period
+ * might include unnecessary delay.
*
* The other possibility was to apply the delay at the end of the parallel
* apply transaction but that would cause issues related to resource bloat
@@ -437,7 +437,7 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errcode(ERRCODE_SYNTAX_ERROR),
/*
- * - translator: the first %s is a string of the form "parameter > 0"
+ * translator: the first %s is a string of the form "parameter > 0"
* and the second one is "option = value".
*/
errmsg("%s and %s are mutually exclusive options",
@@ -1141,13 +1141,13 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
/*
* The combination of parallel streaming mode and
* min_apply_delay is not allowed. See
- * parse_subscription_options for details of the reason.
+ * parse_subscription_options.
*/
if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0)
ereport(ERROR,
errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
- errmsg("cannot enable parallel streaming mode for subscription with %s",
+ errmsg("cannot set parallel streaming mode for subscription with %s",
"min_apply_delay"));
values[Anum_pg_subscription_substream - 1] =
@@ -1167,14 +1167,15 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
{
/*
* The combination of parallel streaming mode and
- * min_apply_delay is not allowed.
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options.
*/
if (opts.min_apply_delay > 0 &&
!IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL)
ereport(ERROR,
errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
- errmsg("cannot enable %s for subscription in %s mode",
- "min_apply_delay", "streaming = parallel"));
+ errmsg("cannot set %s for subscription in parallel streaming mode",
+ "min_apply_delay"));
values[Anum_pg_subscription_subminapplydelay - 1] =
Int32GetDatum(opts.min_apply_delay);
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 9b3de65de7..fde6978950 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3911,9 +3911,9 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply, bool has_unproc
pq_sendint64(reply_message, now); /* sendTime */
pq_sendbyte(reply_message, requestReply); /* replyRequested */
- elog(DEBUG2, "sending feedback (force %d, has_unprocessed_change %s) to recv %X/%X, write %X/%X, flush %X/%X",
+ elog(DEBUG2, "sending feedback (force %d, has_unprocessed_change %d) to recv %X/%X, write %X/%X, flush %X/%X",
force,
- has_unprocessed_change ? "yes" : "no",
+ has_unprocessed_change,
LSN_FORMAT_ARGS(recvpos),
LSN_FORMAT_ARGS(writepos),
LSN_FORMAT_ARGS(flushpos));
On Tue, Jan 24, 2023 at 5:02 AM Euler Taveira <euler@eulerto.com> wrote:
- elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X in-delayed: %d",
+ elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X, apply delay: %s",
 force,
 LSN_FORMAT_ARGS(recvpos),
 LSN_FORMAT_ARGS(writepos),
 LSN_FORMAT_ARGS(flushpos),
- in_delayed_apply);
+ in_delayed_apply? "yes" : "no");

It is better to use a string to represent the yes/no option.
I think it is better to be consistent with the existing force
parameter which is also boolean, otherwise, it will look odd.
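To make the comparison concrete, here is a small standalone sketch of the two styles. It uses snprintf() instead of elog(), and the helper names format_feedback_int() and format_feedback_mixed() are made up for illustration; only the message text follows the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: both booleans are printed with %d, so they read the
 * same way as the existing "force" field.
 */
int
format_feedback_int(char *buf, size_t len, bool force,
                    bool has_unprocessed_change)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len,
                    "sending feedback (force %d, has_unprocessed_change %d)",
                    force, has_unprocessed_change);
}

/*
 * Hypothetical helper: the second boolean is printed as "yes"/"no", which
 * mixes two conventions within one message.
 */
int
format_feedback_mixed(char *buf, size_t len, bool force,
                      bool has_unprocessed_change)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len,
                    "sending feedback (force %d, has_unprocessed_change %s)",
                    force, has_unprocessed_change ? "yes" : "no");
}
```

With force = true and has_unprocessed_change = false, the first helper produces "force 1, has_unprocessed_change 0" while the second produces "force 1, has_unprocessed_change no", which is the inconsistency I would like to avoid.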
--
With Regards,
Amit Kapila.
On Monday, February 6, 2023 8:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Feb 6, 2023 at 12:36 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
I have made a couple of changes in the attached: (a) changed a few error and
LOG messages; (b) added/changed comments. See if these look good to you,
and if so, please include them in the next version.
Hi, thanks for sharing the patch!
The proposed changes make the comments easier to understand
and more consistent with other existing comments. So, LGTM.
The attached v29 patch includes your changes.
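As a reading aid for the wait logic in maybe_apply_delay(), the arithmetic can be sketched in isolation as below. This is only an approximation under stated assumptions: timestamps are plain millisecond counters rather than TimestampTz, the status interval is passed in seconds as the patch's "* 1000L" implies, and the function names remaining_delay_ms() and next_wait_ms() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Remaining wait in milliseconds before a transaction may be applied.
 * finish_ts_ms is the publisher's commit/prepare time and now_ms is the
 * subscriber's clock. A result <= 0 means the transaction can be applied
 * immediately, e.g. because decoding and transfer already took longer than
 * the requested delay.
 */
int64_t
remaining_delay_ms(int64_t finish_ts_ms, int32_t min_apply_delay_ms,
                   int64_t now_ms)
{
    assert(finish_ts_ms > 0);
    return finish_ts_ms + (int64_t) min_apply_delay_ms - now_ms;
}

/*
 * Length of the next sleep. When the status interval is enabled and shorter
 * than the remaining wait, sleep for one interval only and report through
 * *send_feedback that a feedback message should follow, so the walsender
 * does not time out. Otherwise sleep for the whole remaining time.
 */
int64_t
next_wait_ms(int64_t remaining_ms, int status_interval_s, bool *send_feedback)
{
    int64_t     interval_ms = (int64_t) status_interval_s * 1000;

    *send_feedback = false;

    if (remaining_ms <= 0)
        return 0;

    if (status_interval_s > 0 && remaining_ms > interval_ms)
    {
        *send_feedback = true;
        return interval_ms;
    }

    return remaining_ms;
}
```

For example, with a commit time of 1000 ms, a delay of 5000 ms and a current time of 2000 ms, 4000 ms remain; with a 10 second status interval that is a single 4000 ms sleep with no intermediate feedback, whereas a remaining wait of 25000 ms would be cut into a 10000 ms sleep followed by a feedback message.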
Best Regards,
Takamichi Osumi
Attachments:
v29-0001-Time-delayed-logical-replication-subscriber.patch (application/octet-stream)
From 7555ec9c90921160a172b8ca5e2bde141b864b5a Mon Sep 17 00:00:00 2001
From: Takamichi Osumi <osumi.takamichi@fujitsu.com>
Date: Mon, 6 Feb 2023 06:33:55 +0000
Subject: [PATCH v29] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_apply_delay'.
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction apply for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs before we start to apply the transaction on the
subscriber. The main reason is to avoid keeping a transaction open for
a long time. Regular and prepared transactions are covered. Streamed
transactions are also covered.
The combination of parallel streaming mode and min_apply_delay is not
allowed. This is because in parallel streaming mode, we start applying
the transaction stream as soon as the first change arrives without
knowing the transaction's prepare/commit time. This means we cannot
calculate the underlying network/decoding lag between publisher and
subscriber, and so always waiting for the full 'min_apply_delay' period
might include unnecessary delay.
The other possibility was to apply the delay at the end of the parallel
apply transaction but that would cause issues related to resource
bloat and locks being held for a long time.
Note that this feature doesn't interact with skip transaction feature.
The skip transaction feature applies to one transaction with a specific LSN.
So, even if the skipped transaction and non-skipped transaction come
consecutively in a very short time, regardless of the order of which comes
first, the time-delayed feature gets balanced by delayed application
for other transactions before and after the skipped transaction.
Author: Euler Taveira, Takamichi Osumi, Kuroda Hayato
Reviewed-by: Amit Kapila, Peter Smith, Vignesh C, Shveta Malik,
Kyotaro Horiguchi, Shi Yu, Wang Wei, Dilip Kumar, Melih Mutlu
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
tmp
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/config.sgml | 12 ++
doc/src/sgml/glossary.sgml | 14 ++
doc/src/sgml/logical-replication.sgml | 6 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 49 ++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 119 +++++++++++-
.../replication/logical/applyparallelworker.c | 3 +-
src/backend/replication/logical/worker.c | 165 ++++++++++++++--
src/bin/pg_dump/pg_dump.c | 15 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/replication/worker_internal.h | 2 +-
src/test/regress/expected/subscription.out | 181 +++++++++++-------
src/test/regress/sql/subscription.sql | 24 +++
src/test/subscription/meson.build | 1 +
src/test/subscription/t/032_apply_delay.pl | 133 +++++++++++++
21 files changed, 659 insertions(+), 104 deletions(-)
create mode 100644 src/test/subscription/t/032_apply_delay.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..4f1876b20b 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7873,6 +7873,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum delay (ms) for applying changes.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d190be1925..626a8b5bd0 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4787,6 +4787,18 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
the <filename>postgresql.conf</filename> file or on the server
command line.
</para>
+ <para>
+ For time-delayed logical replication, the apply worker sends a feedback
+ message to the publisher every
+ <varname>wal_receiver_status_interval</varname> seconds. Make sure
+ to set <varname>wal_receiver_status_interval</varname> less than the
+ <varname>wal_sender_timeout</varname> on the publisher, otherwise, the
+ <literal>walsender</literal> will repeatedly terminate due to timeout
+ errors. Note that if <varname>wal_receiver_status_interval</varname> is
+ set to zero, the apply worker sends no feedback messages during the
+ <literal>min_apply_delay</literal> period. Refer to
+ <xref linkend="sql-createsubscription"/> for more information.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 7c01a541fe..6ed6fa5853 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1729,6 +1729,20 @@
</glossdef>
</glossentry>
+ <glossentry id="glossary-time-delayed-replication">
+ <glossterm>Time-delayed replication</glossterm>
+ <glossdef>
+ <para>
+ Replication setup that applies time-delayed copy of the data.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="guc-recovery-min-apply-delay"/> for physical replication
+ and <xref linkend="sql-createsubscription"/> for logical replication.
+ </para>
+ </glossdef>
+ </glossentry>
+
<glossentry id="glossary-toast">
<glossterm>TOAST</glossterm>
<glossdef>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 1bd5660c87..6bd5f61e2b 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,12 @@
target table.
</para>
+ <para>
+ A subscription can delay the application of changes by specifying the
+ <literal>min_apply_delay</literal> subscription parameter. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 964fcbb8ff..8b7eb28e54 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 51c45f17c7..1b4b8390af 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,49 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. This
+ parameter allows the user to delay the application of changes by a
+ given time period. If the value is specified without units, it is
+ taken as milliseconds. The default is zero (no delay). See
+ <xref linkend="config-setting-names-values"/> for details on the
+ available valid time units.
+ </para>
+ <para>
+ Any delay becomes effective only after all initial table
+ synchronization has finished and occurs before each transaction starts
+ to get applied on the subscriber. The delay is calculated as the
+ difference between the WAL timestamp as written on the publisher and
+ the current time on the subscriber. Any overhead of time spent in
+ logical decoding and in transferring the transaction may reduce the
+ actual wait time. It is also possible that the overhead already
+ exceeds the requested <literal>min_apply_delay</literal> value, in
+ which case no delay is applied. If the system clocks on publisher and
+ subscriber are not synchronized, this may lead to apply changes
+ earlier than expected, but this is not a major issue because this
+ parameter is typically much larger than the time deviations between
+ servers. Note that if this parameter is set to a long delay, the
+ replication will stop if the replication slot falls behind the current
+ LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication means there is a much longer time between
+ making a change on the publisher, and that change being committed
+ on the subscriber. This can impact the performance of synchronous
+ replication. See <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -420,6 +462,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_apply_delay</literal> parameter is not allowed when
+ streaming in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..c767cc1c3a 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -64,6 +64,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->oid = subid;
sub->dbid = subform->subdbid;
sub->skiplsn = subform->subskiplsn;
+ sub->minapplydelay = subform->subminapplydelay;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
sub->enabled = subform->subenabled;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 8608e3fa5b..317c2010cb 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1299,9 +1299,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 464db6d247..8a713e99f6 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int32 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -100,7 +102,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int32 defGetMinApplyDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ opts->min_apply_delay = defGetMinApplyDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +417,31 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed. This is because in parallel streaming mode, we start applying
+ * the transaction stream as soon as the first change arrives without
+ * knowing the transaction's prepare/commit time. This means we cannot
+ * calculate the underlying network/decoding lag between publisher and
+ * subscriber, and so always waiting for the full 'min_apply_delay' period
+ * might include unnecessary delay.
+ *
+ * The other possibility was to apply the delay at the end of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0 && opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+
+ /*
+ * translator: the first %s is a string of the form "parameter > 0"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
}
/*
@@ -560,7 +598,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -625,6 +664,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int32GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1054,7 +1094,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1138,18 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
+ !IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set parallel streaming mode for subscription with %s",
+ "min_apply_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1163,25 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.min_apply_delay > 0 &&
+ !IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set %s for subscription in parallel streaming mode",
+ "min_apply_delay"));
+
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int32GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2195,3 +2266,45 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+/*
+ * Extract the min_apply_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_apply_delay
+ * accepts the same parameter format as recovery_min_apply_delay.
+ */
+static int32
+defGetMinApplyDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse given string as parameter which has millisecond unit
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_apply_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check both the lower boundary for the valid min_apply_delay range and
+ * the upper boundary as the safeguard for some platforms where INT_MAX is
+ * wider than int32 respectively. Although parse_int() has confirmed that
+ * the result is less than or equal to INT_MAX, the value will be stored
+ * in a catalog column of int32.
+ */
+ if (result < 0 || result > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_apply_delay",
+ 0, PG_INT32_MAX)));
+
+ return result;
+}
diff --git a/src/backend/replication/logical/applyparallelworker.c b/src/backend/replication/logical/applyparallelworker.c
index e670ec617a..d29e2dd7b9 100644
--- a/src/backend/replication/logical/applyparallelworker.c
+++ b/src/backend/replication/logical/applyparallelworker.c
@@ -704,7 +704,8 @@ pa_process_spooled_messages_if_required(void)
{
apply_spooled_messages(&MyParallelShared->fileset,
MyParallelShared->xid,
- InvalidXLogRecPtr);
+ InvalidXLogRecPtr,
+ 0);
pa_set_fileset_state(MyParallelShared, FS_EMPTY);
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index cfb2ab6248..fde6978950 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -319,6 +319,17 @@ static List *on_commit_wakeup_workers_subids = NIL;
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
+/*
+ * In order to avoid walsender timeout for time-delayed logical replication the
+ * apply worker keeps sending feedback messages during the delay period.
+ * Meanwhile, the feature delays the apply before the start of the
+ * transaction and thus we don't write WAL records for the suspended changes
+ * during the wait. When the apply worker sends a feedback message during the
+ * delay, we should not overwrite positions of the flushed and apply LSN by the
+ * last received latest LSN. See send_feedback() for details.
+ */
+static XLogRecPtr last_received = InvalidXLogRecPtr;
+
/* fields valid only when processing streamed transaction */
static bool in_streamed_transaction = false;
@@ -389,7 +400,8 @@ static void stream_write_change(char action, StringInfo s);
static void stream_open_and_write_change(TransactionId xid, char action, StringInfo s);
static void stream_close_file(void);
-static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply);
+static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply,
+ bool has_unprocessed_change);
static void DisableSubscriptionAndExit(void);
@@ -999,6 +1011,109 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When min_apply_delay parameter is set on the subscriber, we wait long enough
+ * to make sure a transaction is applied at least that period behind the
+ * publisher.
+ *
+ * While the physical replication applies the delay at commit time, this
+ * feature applies the delay for the next transaction but before starting the
+ * transaction. This is mainly because keeping a transaction that conducted
+ * write operations open for a long time results in some issues such as bloat
+ * and locks.
+ *
+ * The min_apply_delay parameter will take effect only after all tables are in
+ * READY state.
+ *
+ * xid is the transaction id where we apply the delay.
+ *
+ * finish_ts is the commit/prepare time of both regular (non-streamed) and
+ * streamed transactions. Unlike the regular (non-streamed) cases, the delay
+ * is applied in a STREAM COMMIT/STREAM PREPARE message for streamed
+ * transactions. The STREAM START message does not contain a commit/prepare
+ * time (it will be available when the in-progress transaction finishes).
+ * Hence, it's not appropriate to apply a delay at the STREAM START time.
+ */
+static void
+maybe_apply_delay(TransactionId xid, TimestampTz finish_ts)
+{
+ Assert(finish_ts > 0);
+
+ /* Nothing to do if no delay set */
+ if (!MySubscription->minapplydelay)
+ return;
+
+ /*
+ * The min_apply_delay parameter is ignored until all tablesync workers
+ * have reached READY state. This is because if we allowed the delay
+ * during the catchup phase, then once we reached the limit of tablesync
+ * workers it would impose a delay for each subsequent worker. That would
+ * cause initial table synchronization completion to take a long time.
+ */
+ if (!AllTablesyncsReady())
+ return;
+
+ /* Apply the delay by the latch mechanism */
+ while (true)
+ {
+ TimestampTz delayUntil;
+ long diffms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* This might change wal_receiver_status_interval */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /*
+ * Before calculating the time duration, reload the catalog if needed.
+ */
+ if (!in_remote_transaction && !in_streamed_transaction)
+ {
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+ }
+
+ delayUntil = TimestampTzPlusMilliseconds(finish_ts, MySubscription->minapplydelay);
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delayUntil);
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "time-delayed replication for txid %u, min_apply_delay = %d ms, remaining wait time: %ld ms",
+ xid, MySubscription->minapplydelay, diffms);
+
+ /*
+ * Call send_feedback() to prevent the publisher from exiting by
+ * timeout during the delay, when wal_receiver_status_interval is
+ * available.
+ */
+ if (wal_receiver_status_interval > 0 &&
+ diffms > wal_receiver_status_interval * 1000L)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ wal_receiver_status_interval * 1000L,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(last_received, true, false, true);
+ }
+ else
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ }
+}
+
/*
* Handle BEGIN message.
*/
@@ -1013,6 +1128,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ maybe_apply_delay(begin_data.xid, begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -1070,6 +1188,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ maybe_apply_delay(begin_data.xid, begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1317,7 +1438,8 @@ apply_handle_stream_prepare(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset,
- prepare_data.xid, prepare_data.prepare_lsn);
+ prepare_data.xid, prepare_data.prepare_lsn,
+ prepare_data.prepare_time);
/* Mark the transaction as prepared. */
apply_handle_prepare_internal(&prepare_data);
@@ -2011,10 +2133,13 @@ ensure_last_message(FileSet *stream_fileset, TransactionId xid, int fileno,
/*
* Common spoolfile processing.
+ *
+ * The commit/prepare time (finish_ts) for streamed transactions is required
+ * for time-delayed logical replication.
*/
void
apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn)
+ XLogRecPtr lsn, TimestampTz finish_ts)
{
StringInfoData s2;
int nchanges;
@@ -2025,6 +2150,10 @@ apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
int fileno;
off_t offset;
+ /* Should we delay the current transaction? */
+ if (finish_ts)
+ maybe_apply_delay(xid, finish_ts);
+
if (!am_parallel_apply_worker())
maybe_start_skipping_changes(lsn);
@@ -2174,7 +2303,7 @@ apply_handle_stream_commit(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset, xid,
- commit_data.commit_lsn);
+ commit_data.commit_lsn, commit_data.committime);
apply_handle_commit_internal(&commit_data);
@@ -3447,7 +3576,7 @@ UpdateWorkerStats(XLogRecPtr last_lsn, TimestampTz send_time, bool reply)
* Apply main loop.
*/
static void
-LogicalRepApplyLoop(XLogRecPtr last_received)
+LogicalRepApplyLoop(void)
{
TimestampTz last_recv_timestamp = GetCurrentTimestamp();
bool ping_sent = false;
@@ -3568,7 +3697,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
if (last_received < end_lsn)
last_received = end_lsn;
- send_feedback(last_received, reply_requested, false);
+ send_feedback(last_received, reply_requested, false, false);
UpdateWorkerStats(last_received, timestamp, true);
}
/* other message types are purposefully ignored */
@@ -3581,7 +3710,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
/* confirm all writes so far */
- send_feedback(last_received, false, false);
+ send_feedback(last_received, false, false, false);
if (!in_remote_transaction && !in_streamed_transaction)
{
@@ -3678,7 +3807,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
}
- send_feedback(last_received, requestReply, requestReply);
+ send_feedback(last_received, requestReply, requestReply, false);
/*
* Force reporting to ensure long idle periods don't lead to
@@ -3708,7 +3837,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
* to send a response to avoid timeouts.
*/
static void
-send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
+send_feedback(XLogRecPtr recvpos, bool force, bool requestReply, bool has_unprocessed_change)
{
static StringInfo reply_message = NULL;
static TimestampTz send_time = 0;
@@ -3738,8 +3867,14 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * If the subscription still has unprocessed changes, do not tell the
+ * publisher that the latest received LSN has already been applied and
+ * flushed; otherwise the publisher would make a wrong assumption about
+ * logical replication progress. Instead, just send a feedback message
+ * to avoid a replication timeout during the delay.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && !has_unprocessed_change)
flushpos = writepos = recvpos;
if (writepos < last_writepos)
@@ -3776,8 +3911,9 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
pq_sendint64(reply_message, now); /* sendTime */
pq_sendbyte(reply_message, requestReply); /* replyRequested */
- elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X",
+ elog(DEBUG2, "sending feedback (force %d, has_unprocessed_change %d) to recv %X/%X, write %X/%X, flush %X/%X",
force,
+ has_unprocessed_change,
LSN_FORMAT_ARGS(recvpos),
LSN_FORMAT_ARGS(writepos),
LSN_FORMAT_ARGS(flushpos));
@@ -4367,11 +4503,11 @@ start_table_sync(XLogRecPtr *origin_startpos, char **myslotname)
* of system resource error and are not repeatable.
*/
static void
-start_apply(XLogRecPtr origin_startpos)
+start_apply(void)
{
PG_TRY();
{
- LogicalRepApplyLoop(origin_startpos);
+ LogicalRepApplyLoop();
}
PG_CATCH();
{
@@ -4661,7 +4797,8 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- start_apply(origin_startpos);
+ last_received = origin_startpos;
+ start_apply();
proc_exit(0);
}
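Reviewer note, not part of the patch: the wait logic at the end of maybe_apply_delay() and the new condition in send_feedback() can be tried outside the tree with a small standalone sketch. The helper names next_wait_ms() and may_report_flushed() are hypothetical and exist only for illustration. The sketch also assumes the caller has already checked that some delay remains (diffms > 0), which is not visible in this part of the diff.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not in the patch): pick the timeout for one
 * WaitLatch() call.  diffms is the remaining delay in milliseconds and
 * status_interval_s stands in for wal_receiver_status_interval (seconds).
 * When the remaining delay is longer than the status interval, sleep for
 * one interval only, so that a feedback message can be sent before
 * waiting again.  Otherwise sleep for the whole remaining delay.
 */
long
next_wait_ms(long diffms, int status_interval_s)
{
	/* Assumption of this sketch: the caller only waits if delay remains. */
	assert(diffms > 0);

	if (status_interval_s > 0 && diffms > status_interval_s * 1000L)
		return status_interval_s * 1000L;

	return diffms;
}

/*
 * Hypothetical helper (not in the patch): mirrors the condition under
 * which send_feedback() sets flushpos = writepos = recvpos.  While a
 * delayed transaction is still unprocessed, the received position must
 * not be reported as written and flushed.
 */
bool
may_report_flushed(bool have_pending_txes, bool has_unprocessed_change)
{
	return !have_pending_txes && !has_unprocessed_change;
}
```

With a 10 second status interval and 25 seconds of delay left, next_wait_ms(25000, 10) returns 10000, so the worker wakes up to send feedback and then waits again. With 4 seconds left it returns 4000 and the delay ends after that single wait.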
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..1e87f0124e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4546,9 +4547,13 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n"
+ " 0 AS subminapplydelay\n",
+ LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4581,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4612,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminapplydelay =
+ atoi(PQgetvalue(res, i, i_subminapplydelay));
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4687,6 +4695,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '%d ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e7cbd8d7ed..b8831c3ed3 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int subminapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..81d4607a1c 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min apply delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e1882eaea..e8b9a43a47 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..b8fe47ef6e 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -74,6 +74,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
+ int32 subminapplydelay; /* Replication apply delay (ms) */
+
bool subenabled; /* True if the subscription is enabled (the
* worker should be running) */
@@ -120,6 +122,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int32 minapplydelay; /* Replication apply delay (ms) */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
bool enabled; /* Indicates if the subscription is enabled */
diff --git a/src/include/replication/worker_internal.h b/src/include/replication/worker_internal.h
index dc87a4edd1..3dc09d1a4c 100644
--- a/src/include/replication/worker_internal.h
+++ b/src/include/replication/worker_internal.h
@@ -255,7 +255,7 @@ extern void stream_stop_internal(TransactionId xid);
/* Common streaming function to apply all the spooled messages */
extern void apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn);
+ XLogRecPtr lsn, TimestampTz finish_ts);
extern void apply_dispatch(StringInfo s);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 3f99b14394..cf8e727ee9 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -143,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -163,10 +163,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -175,10 +175,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -210,10 +210,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -247,19 +247,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -271,27 +271,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -306,10 +306,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -324,10 +324,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -363,10 +363,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -375,10 +375,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,10 +388,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -404,20 +404,57 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid value for parameter "min_apply_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_apply_delay" (0 .. 2147483647)
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+ERROR: min_apply_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot set parallel streaming mode for subscription with min_apply_delay
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ERROR: cannot set min_apply_delay for subscription in parallel streaming mode
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7281f5fee2..7317b140f5 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -286,6 +286,30 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+\dRs+
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/meson.build b/src/test/subscription/meson.build
index 3db0fdfd96..a186876eb4 100644
--- a/src/test/subscription/meson.build
+++ b/src/test/subscription/meson.build
@@ -38,6 +38,7 @@ tests += {
't/029_on_error.pl',
't/030_origin.pl',
't/031_column_list.pl',
+ 't/032_apply_delay.pl',
't/100_bugs.pl',
],
},
diff --git a/src/test/subscription/t/032_apply_delay.pl b/src/test/subscription/t/032_apply_delay.pl
new file mode 100644
index 0000000000..8accc1fefd
--- /dev/null
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,133 @@
+
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Checks for min_apply_delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Confirm the time-delayed replication worked, by inspecting the server log
+# message where the apply worker emits for applying delay. Moreover, verify
+# that the current worker's remaining wait time is sufficiently bigger than the
+# expected value, in order to check any update of the min_apply_delay.
+sub check_apply_delay_log
+{
+ my ($node_subscriber, $offset, $expected) = @_;
+
+ # Get the remaining wait time from the server log
+ my $contents = slurp_file($node_subscriber->logfile, $offset);
+ $contents =~
+ qr/time-delayed replication for txid (\d+), min_apply_delay = (\d+) ms, remaining wait time: (\d+) ms/,
+ or die "could not get the apply worker wait time";
+ my $logged_delay = $3;
+
+ # Is it larger than expected?
+ cmp_ok($logged_delay, '>', $expected,
+ "The apply worker wait time has expected duration");
+}
+
+# Initialize publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# And some content
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_int AS SELECT generate_series(1, 10) AS a");
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE tab_int;");
+
+# Create subscriber
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+my $delay = 3;
+$node_subscriber->init;
+$node_subscriber->append_conf(
+ 'postgresql.conf', qq(
+log_min_messages = debug2
+));
+$node_subscriber->start;
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_int AS SELECT generate_series(1, 10) AS a");
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (copy_data = off, min_apply_delay = '${delay}s', streaming = 'on')"
+);
+
+# Make new content on publisher and check its presence in subscriber depending
+# on the delay applied above. Before doing the insertion, get the
+# current timestamp that will be used as a comparison base. Even on slow
+# machines, this allows to have a predictable behavior when comparing the
+# delay between data insertion moment on publisher and replay time on subscriber.
+my $publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_int VALUES (generate_series(11, 20))");
+
+# The publisher waits for the replication to complete
+$node_publisher->wait_for_catchup($appname);
+
+# This test is successful if and only if the LSN has been applied with at least
+# the configured apply delay.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for non-streaming transaction"
+);
+
+# For better coverage, setup for streaming case
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_replication_mode = immediate');
+$node_publisher->reload;
+
+# Running a dummy query causes the config to be reloaded.
+$node_publisher->safe_psql('postgres', q{SELECT 1});
+
+# Check log starting now for logical replication apply delay
+$publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_int VALUES (generate_series(21, 30))");
+$node_publisher->wait_for_catchup($appname);
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for streaming transaction"
+);
+
+# From here, confirm disabling the subscription by ALTER SUBSCRIPTION DISABLE
+# did not cause the delayed transaction to be applied. This is unique to
+# time-delayed logical replication.
+
+# Execute ALTER SUBSCRIPTION to change the delayed time of the apply worker
+# (1 day 5 minutes). Note that the extra 5 minute is to account for any
+# decoding/network overhead.
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86700000)");
+
+# Check log starting now for logical replication apply delay
+my $offset = -s $node_subscriber->logfile;
+
+# New row to trigger apply delay
+$node_publisher->safe_psql('postgres', "INSERT INTO tab_int VALUES (0)");
+
+# Make sure the apply worker waits for more than 1 day
+check_apply_delay_log($node_subscriber, $offset, "86400000");
+
+# Disable subscription and the worker should die immediately
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub DISABLE;");
+
+# Wait until worker dies
+my $sub_query =
+ "SELECT count(1) = 0 FROM pg_stat_subscription WHERE subname = 'tap_sub' AND pid IS NOT NULL;";
+$node_subscriber->poll_query_until('postgres', $sub_query)
+ or die "Timed out while waiting for subscriber to die";
+
+# Confirm the record was not applied
+my $result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_int WHERE a = 0;");
+is($result, qq(0), "check the delayed transaction was not applied");
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
--
2.27.0
Hi,
On Monday, February 6, 2023 8:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jan 24, 2023 at 5:02 AM Euler Taveira <euler@eulerto.com> wrote:
- elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X in-delayed: %d",
+ elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X, apply delay: %s",
  force,
  LSN_FORMAT_ARGS(recvpos),
  LSN_FORMAT_ARGS(writepos),
  LSN_FORMAT_ARGS(flushpos),
- in_delayed_apply);
+ in_delayed_apply? "yes" : "no");
It is better to use a string to represent the yes/no option.
I think it is better to be consistent with the existing force parameter which is
also boolean, otherwise, it will look odd.
Agreed. The latest patch v29 posted in [1] followed this suggestion.
Kindly have a look at it.
[1]: /messages/by-id/TYCPR01MB8373A59E7B74AA4F96B62BEAEDDA9@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
Here are my review comments for v29-0001.
======
Commit Message
1.
Discussion: /messages/by-id/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
tmp
~
What's that "tmp" doing there? A typo?
======
doc/src/sgml/catalogs.sgml
2.
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum delay (ms) for applying changes.
+ </para></entry>
+ </row>
For consistency remove the period (.) because the other
single-sentence descriptions on this page do not have one.
======
src/backend/commands/subscriptioncmds.c
3. AlterSubscription
+ errmsg("cannot set parallel streaming mode for subscription with %s",
+ "min_apply_delay"));
Since there are no translator considerations here why not write it like this:
errmsg("cannot set parallel streaming mode for subscription with
min_apply_delay")
~~~
4. AlterSubscription
+ errmsg("cannot set %s for subscription in parallel streaming mode",
+ "min_apply_delay"));
Since there are no translator considerations here why not write it like this:
errmsg("cannot set min_apply_delay for subscription in parallel streaming mode")
~~~
5.
+defGetMinApplyDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse given string as parameter which has millisecond unit
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_apply_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check both the lower boundary for the valid min_apply_delay range and
+ * the upper boundary as the safeguard for some platforms where INT_MAX is
+ * wider than int32 respectively. Although parse_int() has confirmed that
+ * the result is less than or equal to INT_MAX, the value will be stored
+ * in a catalog column of int32.
+ */
+ if (result < 0 || result > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_apply_delay",
+ 0, PG_INT32_MAX)));
+
+ return result;
+}
5a.
Since there are no translator considerations here why not write the
first error like:
errmsg("invalid value for parameter \"min_apply_delay\": \"%s\"",
input_string)
~
5b.
Since there are no translator considerations here why not write the
second error like:
errmsg("%d ms is outside the valid range for parameter
\"min_apply_delay\" (%d .. %d)",
result, 0, PG_INT32_MAX))
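
For readers following along, the behaviour under review (parse with an optional time unit, then range-check against int32) can be sketched outside the server roughly as below. This is only an illustration under stated assumptions: parse_min_apply_delay() and the DelayStatus codes are hypothetical names, the unit table is a simplified stand-in for what parse_int() with GUC_UNIT_MS accepts, and none of it is the patch's actual code.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result codes; these are not PostgreSQL identifiers. */
typedef enum
{
	DELAY_OK,
	DELAY_INVALID,			/* not a number, or unknown unit */
	DELAY_OUT_OF_RANGE		/* negative, or does not fit in int32 */
} DelayStatus;

/*
 * Parse "123", "3s", "1 d", ... into milliseconds.
 * A value without a unit is taken as milliseconds.
 * (Simplified stand-in for parse_int() with GUC_UNIT_MS.)
 */
static DelayStatus
parse_min_apply_delay(const char *input, int32_t *result_ms)
{
	char	   *end;
	long long	value;
	long long	mult;

	errno = 0;
	value = strtoll(input, &end, 10);
	if (end == input)
		return DELAY_INVALID;	/* no digits at all, e.g. "foo" */
	if (errno == ERANGE)
		return DELAY_OUT_OF_RANGE;

	/* Allow whitespace between the number and the unit ("1 d"). */
	while (isspace((unsigned char) *end))
		end++;

	if (*end == '\0' || strcmp(end, "ms") == 0)
		mult = 1;
	else if (strcmp(end, "s") == 0)
		mult = 1000;
	else if (strcmp(end, "min") == 0)
		mult = 60 * 1000;
	else if (strcmp(end, "h") == 0)
		mult = 60 * 60 * 1000;
	else if (strcmp(end, "d") == 0)
		mult = 24 * 60 * 60 * 1000;
	else
		return DELAY_INVALID;	/* unknown unit */

	/* The catalog column is int4, so reject anything outside 0 .. INT32_MAX. */
	if (value < 0 || value > INT32_MAX / mult)
		return DELAY_OUT_OF_RANGE;

	*result_ms = (int32_t) (value * mult);
	return DELAY_OK;
}
```

With this sketch, "123" yields 123 ms and "1 d" yields 86400000 ms, while "foo" is rejected as invalid and "-1" as out of range, which lines up with the two error messages discussed in 5a and 5b.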
------
Kind Regards,
Peter Smith.
Fujitsu Australia
Dear Peter,
Thank you for reviewing! PSA new version.
======
Commit Message
1.
Discussion: /messages/by-id/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
tmp
~
What's that "tmp" doing there? A typo?
Removed. It was a typo.
I used `git rebase` to combine the local commits,
but a leftover commit message seems to have remained.
======
doc/src/sgml/catalogs.sgml
2.
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum delay (ms) for applying changes.
+ </para></entry>
+ </row>
For consistency remove the period (.) because the other
single-sentence descriptions on this page do not have one.
I have confirmed this and agree. Fixed.
======
src/backend/commands/subscriptioncmds.c
3. AlterSubscription
+ errmsg("cannot set parallel streaming mode for subscription with %s",
+ "min_apply_delay"));
Since there are no translator considerations here why not write it like this:
errmsg("cannot set parallel streaming mode for subscription with
min_apply_delay")
Fixed.
~~~
4. AlterSubscription
+ errmsg("cannot set %s for subscription in parallel streaming mode",
+ "min_apply_delay"));
Since there are no translator considerations here why not write it like this:
errmsg("cannot set min_apply_delay for subscription in parallel streaming mode")
Fixed.
~~~
5.
+defGetMinApplyDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse given string as parameter which has millisecond unit
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_apply_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check both the lower boundary for the valid min_apply_delay range and
+ * the upper boundary as the safeguard for some platforms where INT_MAX is
+ * wider than int32 respectively. Although parse_int() has confirmed that
+ * the result is less than or equal to INT_MAX, the value will be stored
+ * in a catalog column of int32.
+ */
+ if (result < 0 || result > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_apply_delay",
+ 0, PG_INT32_MAX)));
+
+ return result;
+}
5a.
Since there are no translator considerations here why not write the
first error like:
errmsg("invalid value for parameter \"min_apply_delay\": \"%s\"",
input_string)
~
5b.
Since there are no translator considerations here why not write the
second error like:
errmsg("%d ms is outside the valid range for parameter
\"min_apply_delay\" (%d .. %d)",
result, 0, PG_INT32_MAX))
Both of the points you raised are fixed.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Attachments:
v30-0001-Time-delayed-logical-replication-subscriber.patch (application/octet-stream)
From 89d9494bbf2d88ccc4fd253645a4807dca55266e Mon Sep 17 00:00:00 2001
From: Takamichi Osumi <osumi.takamichi@fujitsu.com>
Date: Mon, 6 Feb 2023 06:33:55 +0000
Subject: [PATCH v30] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_apply_delay'.
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction apply for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs before we start to apply the transaction on the
subscriber. The main reason is to avoid keeping a transaction open for
a long time. Regular and prepared transactions are covered. Streamed
transactions are also covered.
The combination of parallel streaming mode and min_apply_delay is not
allowed. This is because in parallel streaming mode, we start applying
the transaction stream as soon as the first change arrives without
knowing the transaction's prepare/commit time. This means we cannot
calculate the underlying network/decoding lag between publisher and
subscriber, and so always waiting for the full 'min_apply_delay' period
might include unnecessary delay.
The other possibility was to apply the delay at the end of the parallel
apply transaction but that would cause issues related to resource
bloat and locks being held for a long time.
Note that this feature doesn't interact with skip transaction feature.
The skip transaction feature applies to one transaction with a specific LSN.
So, even if the skipped transaction and non-skipped transaction come
consecutively in a very short time, regardless of the order of which comes
first, the time-delayed feature gets balanced by delayed application
for other transactions before and after the skipped transaction.
Author: Euler Taveira, Takamichi Osumi, Kuroda Hayato
Reviewed-by: Amit Kapila, Peter Smith, Vignesh C, Shveta Malik,
Kyotaro Horiguchi, Shi Yu, Wang Wei, Dilip Kumar, Melih Mutlu
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/config.sgml | 12 ++
doc/src/sgml/glossary.sgml | 14 ++
doc/src/sgml/logical-replication.sgml | 6 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 49 ++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 115 ++++++++++-
.../replication/logical/applyparallelworker.c | 3 +-
src/backend/replication/logical/worker.c | 165 ++++++++++++++--
src/bin/pg_dump/pg_dump.c | 15 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/replication/worker_internal.h | 2 +-
src/test/regress/expected/subscription.out | 181 +++++++++++-------
src/test/regress/sql/subscription.sql | 24 +++
src/test/subscription/meson.build | 1 +
src/test/subscription/t/032_apply_delay.pl | 133 +++++++++++++
21 files changed, 655 insertions(+), 104 deletions(-)
create mode 100644 src/test/subscription/t/032_apply_delay.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..7eb92ec51a 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7873,6 +7873,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum delay (ms) for applying changes
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d190be1925..626a8b5bd0 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4787,6 +4787,18 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
the <filename>postgresql.conf</filename> file or on the server
command line.
</para>
+ <para>
+ For time-delayed logical replication, the apply worker sends a feedback
+ message to the publisher every
+ <varname>wal_receiver_status_interval</varname> milliseconds. Make sure
+ to set <varname>wal_receiver_status_interval</varname> less than the
+ <varname>wal_sender_timeout</varname> on the publisher, otherwise, the
+ <literal>walsender</literal> will repeatedly terminate due to timeout
+ errors. Note that if <varname>wal_receiver_status_interval</varname> is
+ set to zero, the apply worker sends no feedback messages during the
+ <literal>min_apply_delay</literal> period. Refer to
+ <xref linkend="sql-createsubscription"/> for more information.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 7c01a541fe..6ed6fa5853 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1729,6 +1729,20 @@
</glossdef>
</glossentry>
+ <glossentry id="glossary-time-delayed-replication">
+ <glossterm>Time-delayed replication</glossterm>
+ <glossdef>
+ <para>
+ Replication setup that applies time-delayed copy of the data.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="guc-recovery-min-apply-delay"/> for physical replication
+ and <xref linkend="sql-createsubscription"/> for logical replication.
+ </para>
+ </glossdef>
+ </glossentry>
+
<glossentry id="glossary-toast">
<glossterm>TOAST</glossterm>
<glossdef>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 1bd5660c87..6bd5f61e2b 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,12 @@
target table.
</para>
+ <para>
+ A subscription can delay the application of changes by specifying the
+ <literal>min_apply_delay</literal> subscription parameter. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 964fcbb8ff..8b7eb28e54 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 51c45f17c7..1b4b8390af 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,49 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. This
+ parameter allows the user to delay the application of changes by a
+ given time period. If the value is specified without units, it is
+ taken as milliseconds. The default is zero (no delay). See
+ <xref linkend="config-setting-names-values"/> for details on the
+ available valid time units.
+ </para>
+ <para>
+ Any delay becomes effective only after all initial table
+ synchronization has finished and occurs before each transaction starts
+ to get applied on the subscriber. The delay is calculated as the
+ difference between the WAL timestamp as written on the publisher and
+ the current time on the subscriber. Any overhead of time spent in
+ logical decoding and in transferring the transaction may reduce the
+ actual wait time. It is also possible that the overhead already
+ exceeds the requested <literal>min_apply_delay</literal> value, in
+ which case no delay is applied. If the system clocks on publisher and
+ subscriber are not synchronized, this may lead to apply changes
+ earlier than expected, but this is not a major issue because this
+ parameter is typically much larger than the time deviations between
+ servers. Note that if this parameter is set to a long delay, the
+ replication will stop if the replication slot falls behind the current
+ LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication means there is a much longer time between
+ making a change on the publisher, and that change being committed
+ on the subscriber. This can impact the performance of synchronous
+ replication. See <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -420,6 +462,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_apply_delay</literal> parameter is not allowed when
+ streaming in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..c767cc1c3a 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -64,6 +64,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->oid = subid;
sub->dbid = subform->subdbid;
sub->skiplsn = subform->subskiplsn;
+ sub->minapplydelay = subform->subminapplydelay;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
sub->enabled = subform->subenabled;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 8608e3fa5b..317c2010cb 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1299,9 +1299,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 464db6d247..09e8b3c0fb 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int32 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -100,7 +102,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int32 defGetMinApplyDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ opts->min_apply_delay = defGetMinApplyDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +417,31 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed. This is because in parallel streaming mode, we start applying
+ * the transaction stream as soon as the first change arrives without
+ * knowing the transaction's prepare/commit time. This means we cannot
+ * calculate the underlying network/decoding lag between publisher and
+ * subscriber, and so always waiting for the full 'min_apply_delay' period
+ * might include unnecessary delay.
+ *
+ * The other possibility was to apply the delay at the end of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0 && opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+
+ /*
+ * translator: the first %s is a string of the form "parameter > 0"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
}
/*
@@ -560,7 +598,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -625,6 +664,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int32GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1054,7 +1094,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1138,17 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
+ !IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set parallel streaming mode for subscription with min_apply_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1162,24 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.min_apply_delay > 0 &&
+ !IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set min_apply_delay for subscription in parallel streaming mode"));
+
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int32GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2195,3 +2264,43 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+/*
+ * Extract the min_apply_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_apply_delay
+ * accepts the same parameter format as recovery_min_apply_delay.
+ */
+static int32
+defGetMinApplyDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse the given string as a parameter whose unit is milliseconds.
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"min_apply_delay\": \"%s\"",
+ input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check the lower boundary of the valid min_apply_delay range, and also
+ * the upper boundary as a safeguard for platforms where int is wider
+ * than int32. Although parse_int() has confirmed that the result is
+ * less than or equal to INT_MAX, the value will be stored in a catalog
+ * column of type int32.
+ */
+ if (result < 0 || result > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"min_apply_delay\" (%d .. %d)",
+ result, 0, PG_INT32_MAX)));
+
+ return result;
+}
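For reviewers, the three "mutually exclusive" checks above (one in parse_subscription_options and two in AlterSubscription) seem to reduce to one rule on the effective values after the command. The following is a standalone sketch of that reading, not code from the patch: SubChange, delay_conflicts_with_parallel and the STREAM_* macros are hypothetical stand-ins, and it assumes an option that is not specified keeps its current catalog value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the streaming modes; not PostgreSQL definitions. */
#define STREAM_OFF      'f'
#define STREAM_ON       't'
#define STREAM_PARALLEL 'p'

/* What one ALTER SUBSCRIPTION ... SET (...) command asks for. */
typedef struct SubChange
{
	bool		delay_specified;	/* min_apply_delay given in the command? */
	int			new_delay_ms;
	bool		stream_specified;	/* streaming given in the command? */
	char		new_stream;
} SubChange;

/*
 * Return true if applying 'chg' to a subscription whose current settings are
 * (cur_delay_ms, cur_stream) would combine a non-zero delay with parallel
 * streaming, i.e. the command should be rejected.
 */
static bool
delay_conflicts_with_parallel(const SubChange *chg, int cur_delay_ms,
							  char cur_stream)
{
	int			delay;
	char		stream;

	assert(chg != NULL);

	/* Neither option is touched, so there is nothing to reject. */
	if (!chg->delay_specified && !chg->stream_specified)
		return false;

	/* Effective values: the new value if given, else the current one. */
	delay = chg->delay_specified ? chg->new_delay_ms : cur_delay_ms;
	stream = chg->stream_specified ? chg->new_stream : cur_stream;

	return delay > 0 && stream == STREAM_PARALLEL;
}
```

Under this reading, setting min_apply_delay = 0 and streaming = parallel in the same command is accepted even if the subscription previously had a delay, which matches the !IsSet(...) guards in the ALTER path.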
diff --git a/src/backend/replication/logical/applyparallelworker.c b/src/backend/replication/logical/applyparallelworker.c
index e670ec617a..d29e2dd7b9 100644
--- a/src/backend/replication/logical/applyparallelworker.c
+++ b/src/backend/replication/logical/applyparallelworker.c
@@ -704,7 +704,8 @@ pa_process_spooled_messages_if_required(void)
{
apply_spooled_messages(&MyParallelShared->fileset,
MyParallelShared->xid,
- InvalidXLogRecPtr);
+ InvalidXLogRecPtr,
+ 0);
pa_set_fileset_state(MyParallelShared, FS_EMPTY);
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index cfb2ab6248..fde6978950 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -319,6 +319,17 @@ static List *on_commit_wakeup_workers_subids = NIL;
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
+/*
+ * To avoid a walsender timeout during time-delayed logical replication, the
+ * apply worker keeps sending feedback messages during the delay period.
+ * Meanwhile, the feature delays the apply before the start of the
+ * transaction, and thus we don't write WAL records for the suspended changes
+ * during the wait. When the apply worker sends a feedback message during the
+ * delay, we must not overwrite the flushed and applied LSN positions with the
+ * last received LSN. See send_feedback() for details.
+ */
+static XLogRecPtr last_received = InvalidXLogRecPtr;
+
/* fields valid only when processing streamed transaction */
static bool in_streamed_transaction = false;
@@ -389,7 +400,8 @@ static void stream_write_change(char action, StringInfo s);
static void stream_open_and_write_change(TransactionId xid, char action, StringInfo s);
static void stream_close_file(void);
-static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply);
+static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply,
+ bool has_unprocessed_change);
static void DisableSubscriptionAndExit(void);
@@ -999,6 +1011,109 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When min_apply_delay parameter is set on the subscriber, we wait long enough
+ * to make sure a transaction is applied at least that period behind the
+ * publisher.
+ *
+ * While physical replication applies the delay at commit time, this
+ * feature applies the delay before starting to apply the next
+ * transaction. This is mainly because keeping a transaction that conducted
+ * write operations open for a long time results in issues such as bloat
+ * and locks being held.
+ *
+ * The min_apply_delay parameter will take effect only after all tables are in
+ * READY state.
+ *
+ * xid is the transaction id where we apply the delay.
+ *
+ * finish_ts is the commit/prepare time of both regular (non-streamed) and
+ * streamed transactions. Unlike the regular (non-streamed) cases, the delay
+ * is applied in a STREAM COMMIT/STREAM PREPARE message for streamed
+ * transactions. The STREAM START message does not contain a commit/prepare
+ * time (it will be available when the in-progress transaction finishes).
+ * Hence, it's not appropriate to apply a delay at the STREAM START time.
+ */
+static void
+maybe_apply_delay(TransactionId xid, TimestampTz finish_ts)
+{
+ Assert(finish_ts > 0);
+
+ /* Nothing to do if no delay set */
+ if (!MySubscription->minapplydelay)
+ return;
+
+ /*
+ * The min_apply_delay parameter is ignored until all tablesync workers
+ * have reached READY state. This is because if we allowed the delay
+ * during the catchup phase, then once we reached the limit of tablesync
+ * workers it would impose a delay for each subsequent worker. That would
+ * cause initial table synchronization completion to take a long time.
+ */
+ if (!AllTablesyncsReady())
+ return;
+
+ /* Apply the delay by the latch mechanism */
+ while (true)
+ {
+ TimestampTz delayUntil;
+ long diffms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* This might change wal_receiver_status_interval */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /*
+ * Before calculating the time duration, reload the catalog if needed.
+ */
+ if (!in_remote_transaction && !in_streamed_transaction)
+ {
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+ }
+
+ delayUntil = TimestampTzPlusMilliseconds(finish_ts, MySubscription->minapplydelay);
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delayUntil);
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "time-delayed replication for txid %u, min_apply_delay = %d ms, remaining wait time: %ld ms",
+ xid, MySubscription->minapplydelay, diffms);
+
+ /*
+ * Call send_feedback() to prevent the publisher from exiting due to a
+ * timeout during the delay, when wal_receiver_status_interval is
+ * enabled (greater than zero).
+ */
+ if (wal_receiver_status_interval > 0 &&
+ diffms > wal_receiver_status_interval * 1000L)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ wal_receiver_status_interval * 1000L,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(last_received, true, false, true);
+ }
+ else
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ }
+}
+
/*
* Handle BEGIN message.
*/
@@ -1013,6 +1128,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ maybe_apply_delay(begin_data.xid, begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -1070,6 +1188,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ maybe_apply_delay(begin_data.xid, begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1317,7 +1438,8 @@ apply_handle_stream_prepare(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset,
- prepare_data.xid, prepare_data.prepare_lsn);
+ prepare_data.xid, prepare_data.prepare_lsn,
+ prepare_data.prepare_time);
/* Mark the transaction as prepared. */
apply_handle_prepare_internal(&prepare_data);
@@ -2011,10 +2133,13 @@ ensure_last_message(FileSet *stream_fileset, TransactionId xid, int fileno,
/*
* Common spoolfile processing.
+ *
+ * The commit/prepare time (finish_ts) for streamed transactions is required
+ * for time-delayed logical replication.
*/
void
apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn)
+ XLogRecPtr lsn, TimestampTz finish_ts)
{
StringInfoData s2;
int nchanges;
@@ -2025,6 +2150,10 @@ apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
int fileno;
off_t offset;
+ /* Should we delay the current transaction? */
+ if (finish_ts)
+ maybe_apply_delay(xid, finish_ts);
+
if (!am_parallel_apply_worker())
maybe_start_skipping_changes(lsn);
@@ -2174,7 +2303,7 @@ apply_handle_stream_commit(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset, xid,
- commit_data.commit_lsn);
+ commit_data.commit_lsn, commit_data.committime);
apply_handle_commit_internal(&commit_data);
@@ -3447,7 +3576,7 @@ UpdateWorkerStats(XLogRecPtr last_lsn, TimestampTz send_time, bool reply)
* Apply main loop.
*/
static void
-LogicalRepApplyLoop(XLogRecPtr last_received)
+LogicalRepApplyLoop(void)
{
TimestampTz last_recv_timestamp = GetCurrentTimestamp();
bool ping_sent = false;
@@ -3568,7 +3697,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
if (last_received < end_lsn)
last_received = end_lsn;
- send_feedback(last_received, reply_requested, false);
+ send_feedback(last_received, reply_requested, false, false);
UpdateWorkerStats(last_received, timestamp, true);
}
/* other message types are purposefully ignored */
@@ -3581,7 +3710,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
/* confirm all writes so far */
- send_feedback(last_received, false, false);
+ send_feedback(last_received, false, false, false);
if (!in_remote_transaction && !in_streamed_transaction)
{
@@ -3678,7 +3807,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
}
- send_feedback(last_received, requestReply, requestReply);
+ send_feedback(last_received, requestReply, requestReply, false);
/*
* Force reporting to ensure long idle periods don't lead to
@@ -3708,7 +3837,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
* to send a response to avoid timeouts.
*/
static void
-send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
+send_feedback(XLogRecPtr recvpos, bool force, bool requestReply, bool has_unprocessed_change)
{
static StringInfo reply_message = NULL;
static TimestampTz send_time = 0;
@@ -3738,8 +3867,14 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * If the logical replication subscription has unprocessed changes, do not
+ * inform the publisher that the latest received LSN is already applied
+ * and flushed; otherwise, the publisher would make a wrong assumption
+ * about the logical replication progress. Instead, just send a
+ * feedback message to avoid a replication timeout during the delay.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && !has_unprocessed_change)
flushpos = writepos = recvpos;
if (writepos < last_writepos)
@@ -3776,8 +3911,9 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
pq_sendint64(reply_message, now); /* sendTime */
pq_sendbyte(reply_message, requestReply); /* replyRequested */
- elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X",
+ elog(DEBUG2, "sending feedback (force %d, has_unprocessed_change %d) to recv %X/%X, write %X/%X, flush %X/%X",
force,
+ has_unprocessed_change,
LSN_FORMAT_ARGS(recvpos),
LSN_FORMAT_ARGS(writepos),
LSN_FORMAT_ARGS(flushpos));
@@ -4367,11 +4503,11 @@ start_table_sync(XLogRecPtr *origin_startpos, char **myslotname)
* of system resource error and are not repeatable.
*/
static void
-start_apply(XLogRecPtr origin_startpos)
+start_apply(void)
{
PG_TRY();
{
- LogicalRepApplyLoop(origin_startpos);
+ LogicalRepApplyLoop();
}
PG_CATCH();
{
@@ -4661,7 +4797,8 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- start_apply(origin_startpos);
+ last_received = origin_startpos;
+ start_apply();
proc_exit(0);
}
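The arithmetic inside the maybe_apply_delay() loop can be checked in isolation. The sketch below is not part of the patch: DelayStep and next_delay_step are hypothetical names, all times are plain milliseconds on a single assumed clock rather than TimestampTz, and latch handling, config reload and catalog re-reads are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Result of one iteration of the (simplified) delay loop. */
typedef struct DelayStep
{
	int64_t		wait_ms;		/* how long to sleep; 0 means apply now */
	bool		send_feedback;	/* send a feedback message after this sleep? */
} DelayStep;

/*
 * finish_ms is the publisher's commit/prepare time, now_ms the subscriber's
 * current time, status_interval_s mirrors wal_receiver_status_interval.
 */
static DelayStep
next_delay_step(int64_t finish_ms, int32_t min_apply_delay_ms,
				int64_t now_ms, int status_interval_s)
{
	DelayStep	step = {0, false};
	int64_t		interval_ms = (int64_t) status_interval_s * 1000;
	int64_t		remaining;

	assert(min_apply_delay_ms >= 0);

	remaining = finish_ms + min_apply_delay_ms - now_ms;

	/* Decoding/transfer overhead already used up the delay: apply now. */
	if (remaining <= 0)
		return step;

	/* Sleep at most one status interval so that feedback keeps flowing. */
	if (status_interval_s > 0 && remaining > interval_ms)
	{
		step.wait_ms = interval_ms;
		step.send_feedback = true;
	}
	else
		step.wait_ms = remaining;

	return step;
}
```

A caller would repeat this until wait_ms is 0, recomputing from the current time on each iteration, which is why a latch wakeup or a changed min_apply_delay is picked up without extra bookkeeping.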
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..1e87f0124e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4546,9 +4547,13 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n"
+ " 0 AS subminapplydelay\n",
+ LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4581,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4612,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminapplydelay =
+ atoi(PQgetvalue(res, i, i_subminapplydelay));
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4687,6 +4695,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '%d ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e7cbd8d7ed..b8831c3ed3 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int subminapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..81d4607a1c 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min apply delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e1882eaea..e8b9a43a47 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..b8fe47ef6e 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -74,6 +74,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
+ int32 subminapplydelay; /* Replication apply delay (ms) */
+
bool subenabled; /* True if the subscription is enabled (the
* worker should be running) */
@@ -120,6 +122,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int32 minapplydelay; /* Replication apply delay (ms) */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
bool enabled; /* Indicates if the subscription is enabled */
diff --git a/src/include/replication/worker_internal.h b/src/include/replication/worker_internal.h
index dc87a4edd1..3dc09d1a4c 100644
--- a/src/include/replication/worker_internal.h
+++ b/src/include/replication/worker_internal.h
@@ -255,7 +255,7 @@ extern void stream_stop_internal(TransactionId xid);
/* Common streaming function to apply all the spooled messages */
extern void apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn);
+ XLogRecPtr lsn, TimestampTz finish_ts);
extern void apply_dispatch(StringInfo s);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 3f99b14394..cf8e727ee9 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -143,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -163,10 +163,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -175,10 +175,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -210,10 +210,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -247,19 +247,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -271,27 +271,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -306,10 +306,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -324,10 +324,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -363,10 +363,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -375,10 +375,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,10 +388,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -404,20 +404,57 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid value for parameter "min_apply_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_apply_delay" (0 .. 2147483647)
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+ERROR: min_apply_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot set parallel streaming mode for subscription with min_apply_delay
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ERROR: cannot set min_apply_delay for subscription in parallel streaming mode
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7281f5fee2..7317b140f5 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -286,6 +286,30 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+\dRs+
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/meson.build b/src/test/subscription/meson.build
index 3db0fdfd96..a186876eb4 100644
--- a/src/test/subscription/meson.build
+++ b/src/test/subscription/meson.build
@@ -38,6 +38,7 @@ tests += {
't/029_on_error.pl',
't/030_origin.pl',
't/031_column_list.pl',
+ 't/032_apply_delay.pl',
't/100_bugs.pl',
],
},
diff --git a/src/test/subscription/t/032_apply_delay.pl b/src/test/subscription/t/032_apply_delay.pl
new file mode 100644
index 0000000000..8accc1fefd
--- /dev/null
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,133 @@
+
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Checks for min_apply_delay
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Confirm the time-delayed replication worked, by inspecting the server log
+# message where the apply worker emits for applying delay. Moreover, verify
+# that the current worker's remaining wait time is sufficiently bigger than the
+# expected value, in order to check any update of the min_apply_delay.
+sub check_apply_delay_log
+{
+ my ($node_subscriber, $offset, $expected) = @_;
+
+ # Get the remaining wait time from the server log
+ my $contents = slurp_file($node_subscriber->logfile, $offset);
+ $contents =~
+ qr/time-delayed replication for txid (\d+), min_apply_delay = (\d+) ms, remaining wait time: (\d+) ms/
+ or die "could not get the apply worker wait time";
+ my $logged_delay = $3;
+
+ # Is it larger than expected?
+ cmp_ok($logged_delay, '>', $expected,
+ "The apply worker wait time has expected duration");
+}
+
+# Initialize publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
+$node_publisher->start;
+
+# And some content
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE tab_int AS SELECT generate_series(1, 10) AS a");
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub FOR TABLE tab_int;");
+
+# Create subscriber
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+my $delay = 3;
+$node_subscriber->init;
+$node_subscriber->append_conf(
+ 'postgresql.conf', qq(
+log_min_messages = debug2
+));
+$node_subscriber->start;
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE tab_int AS SELECT generate_series(1, 10) AS a");
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (copy_data = off, min_apply_delay = '${delay}s', streaming = 'on')"
+);
+
+# Make new content on publisher and check its presence in subscriber depending
+# on the delay applied above. Before doing the insertion, get the
+# current timestamp that will be used as a comparison base. Even on slow
+# machines, this allows to have a predictable behavior when comparing the
+# delay between data insertion moment on publisher and replay time on subscriber.
+my $publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_int VALUES (generate_series(11, 20))");
+
+# The publisher waits for the replication to complete
+$node_publisher->wait_for_catchup($appname);
+
+# This test is successful if and only if the LSN has been applied with at least
+# the configured apply delay.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for non-streaming transaction"
+);
+
+# For better coverage, setup for streaming case
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_replication_mode = immediate');
+$node_publisher->reload;
+
+# Running a dummy query causes the config to be reloaded.
+$node_publisher->safe_psql('postgres', q{SELECT 1});
+
+# Check log starting now for logical replication apply delay
+$publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_int VALUES (generate_series(21, 30))");
+$node_publisher->wait_for_catchup($appname);
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for streaming transaction"
+);
+
+# From here, confirm disabling the subscription by ALTER SUBSCRIPTION DISABLE
+# did not cause the delayed transaction to be applied. This is unique to
+# time-delayed logical replication.
+
+# Execute ALTER SUBSCRIPTION to change the delayed time of the apply worker
+# (1 day 5 minutes). Note that the extra 5 minute is to account for any
+# decoding/network overhead.
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86700000)");
+
+# Check log starting now for logical replication apply delay
+my $offset = -s $node_subscriber->logfile;
+
+# New row to trigger apply delay
+$node_publisher->safe_psql('postgres', "INSERT INTO tab_int VALUES (0)");
+
+# Make sure the apply worker waits for more than 1 day
+check_apply_delay_log($node_subscriber, $offset, "86400000");
+
+# Disable subscription and the worker should die immediately
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub DISABLE;");
+
+# Wait until worker dies
+my $sub_query =
+ "SELECT count(1) = 0 FROM pg_stat_subscription WHERE subname = 'tap_sub' AND pid IS NOT NULL;";
+$node_subscriber->poll_query_until('postgres', $sub_query)
+ or die "Timed out while waiting for subscriber to die";
+
+# Confirm the record was not applied
+my $result = $node_subscriber->safe_psql('postgres',
+ "SELECT count(a) FROM tab_int WHERE a = 0;");
+is($result, qq(0), "check the delayed transaction was not applied");
+
+$node_subscriber->stop;
+$node_publisher->stop;
+
+done_testing();
--
2.27.0
On Tue, Feb 7, 2023 at 6:03 AM Peter Smith <smithpb2250@gmail.com> wrote:
5.
+defGetMinApplyDelay(DefElem *def)
+{
+    char *input_string;
+    int result;
+    const char *hintmsg;
+
+    input_string = defGetString(def);
+
+    /*
+     * Parse given string as parameter which has millisecond unit
+     */
+    if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+        ereport(ERROR,
+                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                 errmsg("invalid value for parameter \"%s\": \"%s\"",
+                        "min_apply_delay", input_string),
+                 hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+    /*
+     * Check both the lower boundary for the valid min_apply_delay range and
+     * the upper boundary as the safeguard for some platforms where INT_MAX is
+     * wider than int32 respectively. Although parse_int() has confirmed that
+     * the result is less than or equal to INT_MAX, the value will be stored
+     * in a catalog column of int32.
+     */
+    if (result < 0 || result > PG_INT32_MAX)
+        ereport(ERROR,
+                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                 errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+                        result,
+                        "min_apply_delay",
+                        0, PG_INT32_MAX)));
+
+    return result;
+}

5a.
Since there are no translator considerations here why not write the
first error like:

errmsg("invalid value for parameter \"min_apply_delay\": \"%s\"",
input_string)

~

5b.
Since there are no translator considerations here why not write the
second error like:

errmsg("%d ms is outside the valid range for parameter
\"min_apply_delay\" (%d .. %d)",
result, 0, PG_INT32_MAX))

I see that existing usage in the code matches what the patch had
before this comment. See below and similar usages in the code.
if (start <= 0)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("invalid value for parameter \"%s\": %d",
"start", start)));
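
For readers following item 5, the validation being discussed can be sketched outside the server. This is only a minimal sketch under stated assumptions: parse_min_apply_delay() and DelayStatus are hypothetical names, the unit handling is a simplified stand-in for what parse_int() does with GUC_UNIT_MS, and the real patch reports failures with ereport(ERROR) rather than return codes.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result codes; the patch raises ereport(ERROR) instead. */
typedef enum
{
    DELAY_OK,
    DELAY_INVALID,          /* not a number, or an unknown unit */
    DELAY_OUT_OF_RANGE      /* negative, or does not fit in int32 */
} DelayStatus;

/*
 * Simplified stand-in for parsing a millisecond-based setting: an integer
 * with an optional unit (ms, s, min, h, d). A bare number is taken as
 * milliseconds.
 */
DelayStatus
parse_min_apply_delay(const char *input, int32_t *result_ms)
{
    char       *end;
    long long   value;
    long long   multiplier = 1;

    assert(input != NULL && result_ms != NULL);

    errno = 0;
    value = strtoll(input, &end, 10);
    if (end == input || errno == ERANGE)
        return DELAY_INVALID;

    /* Allow whitespace between the number and the unit, as in '1 d'. */
    while (isspace((unsigned char) *end))
        end++;

    if (*end != '\0')
    {
        if (strcmp(end, "ms") == 0)
            multiplier = 1;
        else if (strcmp(end, "s") == 0)
            multiplier = 1000LL;
        else if (strcmp(end, "min") == 0)
            multiplier = 60LL * 1000;
        else if (strcmp(end, "h") == 0)
            multiplier = 60LL * 60 * 1000;
        else if (strcmp(end, "d") == 0)
            multiplier = 24LL * 60 * 60 * 1000;
        else
            return DELAY_INVALID;
    }

    /* The catalog column is int32, so check the range before converting. */
    if (value < 0 || value > INT32_MAX / multiplier)
        return DELAY_OUT_OF_RANGE;

    *result_ms = (int32_t) (value * multiplier);
    return DELAY_OK;
}
```

With this sketch, "123" yields 123 ms and "1 d" yields 86400000 ms, matching the values shown in the regression output of the patch, while "foo" is rejected as invalid and "-1" as out of range.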
--
With Regards,
Amit Kapila.
At Tue, 7 Feb 2023 09:10:01 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
On Tue, Feb 7, 2023 at 6:03 AM Peter Smith <smithpb2250@gmail.com> wrote:
5b.
Since there are no translator considerations here why not write the
second error like:

errmsg("%d ms is outside the valid range for parameter
\"min_apply_delay\" (%d .. %d)",
result, 0, PG_INT32_MAX))

I see that existing usage in the code matches what the patch had
before this comment. See below and similar usages in the code.
if (start <= 0)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("invalid value for parameter \"%s\": %d",
"start", start)));
The same errmsg text occurs many times in the tree. On the other hand,
the message pointed out here occurs only once. I suppose Peter took this
aspect into account.
# "%d%s%s is outside the valid range for parameter \"%s\" (%d .. %d)"
# also appears just once
As for me, I think it is good practice to do that regardless of the
number of duplicates, to (semi-)mechanically avoid duplicates.
(But I believe I would do as Peter suggests myself for the first
cut, though :p)
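
To make the point about duplicates concrete, here is a small sketch of the "parameter name as %s" style. It is an illustration only: format_range_error() and RANGE_ERROR_FMT are hypothetical names, and plain snprintf() stands in for errmsg(). The idea is that every caller shares one format string, so a translator would see a single message no matter how many parameters use it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One shared format string; the parameter name is passed as an argument. */
#define RANGE_ERROR_FMT \
    "%d ms is outside the valid range for parameter \"%s\" (%d .. %d)"

/*
 * Hypothetical helper: builds the range error text for any millisecond
 * parameter. Returns what snprintf() returns.
 */
int
format_range_error(char *buf, size_t buflen, const char *param_name,
                   int value_ms, int min_ms, int max_ms)
{
    assert(buf != NULL && param_name != NULL);

    return snprintf(buf, buflen, RANGE_ERROR_FMT,
                    value_ms, param_name, min_ms, max_ms);
}
```

Calling format_range_error(buf, sizeof(buf), "min_apply_delay", -1, 0, 2147483647) produces the same text as the error in the regression output of the patch. Writing the name directly into the string, as in 5b, would give the same output but a format string that no other parameter can reuse.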
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Thanks!
At Mon, 6 Feb 2023 13:10:01 +0000, "Takamichi Osumi (Fujitsu)" <osumi.takamichi@fujitsu.com> wrote in
The attached patch v29 has included your changes.
catalogs.sgml
+ <para>
+ The minimum delay (ms) for applying changes.
+ </para></entry>
I think we don't use unit symbols that way. Namely I think we would
write it as "The minimum delay for applying changes in milliseconds"
alter_subscription.sgml
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
By the way, is there any rule for the order among the words? They
don't seem to be in alphabetical order, nor in the same order as on the
create_subscription page. (It seems to be the order of the SUBOPT_*
symbols, but I'm not sure that's a good idea..)
subscriptioncmds.c
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
+ !IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0)
..
+ if (opts.min_apply_delay > 0 &&
+ !IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL)
Don't we wrap the lines?
worker.c
+ if (wal_receiver_status_interval > 0 &&
+ diffms > wal_receiver_status_interval * 1000L)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ wal_receiver_status_interval * 1000L,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(last_received, true, false, true);
+ }
+ else
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
send_feedback always handles the case where
wal_receiver_status_interval == 0. Thus we can simply wait for
min(wal_receiver_status_interval, diffms) and then call send_feedback()
unconditionally.
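To make the quoted hunk easier to reason about, its branch can be read as choosing the next wait slice. The sketch below is a standalone model only: WaitStep and next_wait_step are hypothetical names that stand in for the WaitLatch() and send_feedback() calls, and it mirrors the hunk as posted, not the simplification suggested above.

```c
#include <assert.h>

typedef struct WaitStep
{
	long		timeout_ms;		/* how long the next wait would sleep */
	int			send_feedback;	/* 1 if feedback is sent after the sleep */
} WaitStep;

/*
 * Model of the quoted branch: diffms is the remaining delay in ms, and
 * status_interval_s is wal_receiver_status_interval in seconds.
 */
static WaitStep
next_wait_step(long diffms, int status_interval_s)
{
	WaitStep	step;
	long		interval_ms = status_interval_s * 1000L;

	assert(diffms > 0 && status_interval_s >= 0);

	if (status_interval_s > 0 && diffms > interval_ms)
	{
		/* Sleep one status interval, then report to the publisher */
		step.timeout_ms = interval_ms;
		step.send_feedback = 1;
	}
	else
	{
		/* Remaining delay is short, or feedback is disabled */
		step.timeout_ms = diffms;
		step.send_feedback = 0;
	}
	return step;
}
```

With an interval of 10 s and 25 s of delay left, the model sleeps 10 s and sends feedback; with the interval set to zero it sleeps the whole remaining delay without feedback.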
-start_apply(XLogRecPtr origin_startpos)
+start_apply(void)
-LogicalRepApplyLoop(XLogRecPtr last_received)
+LogicalRepApplyLoop(void)
Does this patch require this change?
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Tue, Feb 7, 2023 at 10:07 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
The same errmsg text occurs many times in the tree. On the other hand,
the message in question occurs only once. I suppose Peter considered this
aspect.
# "%d%s%s is outside the valid range for parameter \"%s\" (%d .. %d)"
# also appears just once
As for me, it seems good practice to do that regardless of the
number of duplicates, to (semi)mechanically avoid duplicates.
(But I believe I would do as Peter suggests myself for the first
cut, though :p)
Personally, I would prefer consistency. I think we can later start a
new thread to change the existing message and if there is a consensus
and value in the same then we could use the same style here as well.
--
With Regards,
Amit Kapila.
On Tue, Feb 7, 2023 at 4:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Personally, I would prefer consistency. I think we can later start a
new thread to change the existing message and if there is a consensus
and value in the same then we could use the same style here as well.
Of course, if there is a convention then we should stick to it.
My understanding was that (string literal) message parameters are
specified separately from the message format string primarily as an
aid to translators. That makes good sense for parameters with names
that are also English words (like "start" etc), but for non-word
parameters like "min_apply_delay" there is no such ambiguity in the
first place.
Anyway, I am fine with it being written either way.
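As a small illustration of the two spellings being compared, the sketch below uses snprintf as a stand-in for errmsg(); the helper names are hypothetical. Both produce the same text at run time, so the difference lies only in the format string a translator would see.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Style kept in the patch: parameter name passed as an argument. */
static void
format_with_argument(char *buf, size_t len, const char *input)
{
	assert(buf != NULL && len > 0);
	snprintf(buf, len, "invalid value for parameter \"%s\": \"%s\"",
			 "min_apply_delay", input);
}

/* Style suggested in 5a: parameter name embedded in the format string. */
static void
format_with_literal(char *buf, size_t len, const char *input)
{
	assert(buf != NULL && len > 0);
	snprintf(buf, len, "invalid value for parameter \"min_apply_delay\": \"%s\"",
			 input);
}

/* Returns 1 when both styles render identical text for the given input. */
static int
styles_render_identically(const char *input)
{
	char		a[256];
	char		b[256];

	format_with_argument(a, sizeof(a), input);
	format_with_literal(b, sizeof(b), input);
	return strcmp(a, b) == 0;
}
```

The first form reuses a format string that already exists elsewhere in the tree, while the second introduces a new one.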
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Tue, Feb 7, 2023 at 10:13 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Mon, 6 Feb 2023 13:10:01 +0000, "Takamichi Osumi (Fujitsu)" <osumi.takamichi@fujitsu.com> wrote in
The attached patch v29 has included your changes.
catalogs.sgml
+ <para>
+ The minimum delay (ms) for applying changes.
+ </para></entry>
I think we don't use unit symbols that way. Namely I think we would
write it as "The minimum delay for applying changes in milliseconds"
Okay, if we prefer to use milliseconds, then how about: "The minimum
delay, in milliseconds, for applying changes"?
alter_subscription.sgml
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
By the way, is there any rule for the order among the words?
Currently, it is in the order in which the corresponding features are added.
They
don't seem to be in alphabetical order, nor in the same order as on the
create_subscription page.
On the create_subscription page also, they appear to be in the order in
which they were added, with the difference that they are divided into two
categories (parameters that control what happens during subscription
creation and parameters that control the subscription's replication
behavior after it has been created).
(It seems to be the order of the SUBOPT_*
symbols, but I'm not sure that's a good idea..)
subscriptioncmds.c
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
+ !IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0)
..
+ if (opts.min_apply_delay > 0 &&
+ !IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL)
Don't we wrap the lines?
worker.c
+ if (wal_receiver_status_interval > 0 &&
+ diffms > wal_receiver_status_interval * 1000L)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ wal_receiver_status_interval * 1000L,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(last_received, true, false, true);
+ }
+ else
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
send_feedback always handles the case where
wal_receiver_status_interval == 0.
It only handles that case when force is false, but here we are passing
force as true. So I am not sure that what you suggest would be an
improvement.
Thus we can simply wait for
min(wal_receiver_status_interval, diffms) and then call send_feedback()
unconditionally.
-start_apply(XLogRecPtr origin_startpos)
+start_apply(void)
-LogicalRepApplyLoop(XLogRecPtr last_received)
+LogicalRepApplyLoop(void)
Does this patch require this change?
I think this is because the scope of last_received has been changed so
that it can be passed to send_feedback() during the delay.
--
With Regards,
Amit Kapila.
On Tue, Feb 7, 2023 at 10:42 AM Peter Smith <smithpb2250@gmail.com> wrote:
Of course, if there is a convention then we should stick to it.
My understanding was that (string literal) message parameters are
specified separately from the message format string primarily as an
aid to translators. That makes good sense for parameters with names
that are also English words (like "start" etc), but for non-word
parameters like "min_apply_delay" there is no such ambiguity in the
first place.
TBH, I am not an expert in this matter. So, to avoid making any
mistakes, I thought of keeping it close to the existing style.
--
With Regards,
Amit Kapila.
On Tue, Feb 7, 2023 at 8:22 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Thank you for reviewing! PSA new version.
Few comments:
=============
1.
@@ -74,6 +74,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId)
BKI_SHARED_RELATION BKI_ROW
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
+ int32 subminapplydelay; /* Replication apply delay (ms) */
+
bool subenabled; /* True if the subscription is enabled (the
* worker should be running) */
@@ -120,6 +122,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int32 minapplydelay; /* Replication apply delay (ms) */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
Why is the new parameter placed at different locations in the above two
structures? I think it should be after owner in both cases and
accordingly its order should be changed in GetSubscription() or any
other place it is used.
2. A minor comment change suggestion:
/*
* Common spoolfile processing.
*
- * The commit/prepare time (finish_ts) for streamed transactions is required
- * for time-delayed logical replication.
+ * The commit/prepare time (finish_ts) is required for time-delayed logical
+ * replication.
*/
3. I find the newly added tests take about 8s on my machine, which is
close to the highest in the subscription folder. I understand that it can't
be less than 3s because of the delay, but checking multiple cases makes
it take that long. I think we can avoid the tests for streaming and
for disabling the subscription. Also, after removing those, I think it would
be better to add the remaining test in 001_rep_changes to save set-up
and tear-down costs as well.
4.
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
I think this setting is also not required.
--
With Regards,
Amit Kapila.
Hi,
On Tuesday, February 7, 2023 6:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Feb 7, 2023 at 8:22 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Thank you for reviewing! PSA new version.
Few comments:
=============
Thanks for your comments !
1.
@@ -74,6 +74,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId)
BKI_SHARED_RELATION BKI_ROW
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
+ int32 subminapplydelay; /* Replication apply delay (ms) */
+
bool subenabled; /* True if the subscription is enabled (the
* worker should be running) */
@@ -120,6 +122,7 @@ typedef struct Subscription
* in */
XLogRecPtr skiplsn; /* All changes finished at this LSN are
* skipped */
+ int32 minapplydelay; /* Replication apply delay (ms) */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
Why is the new parameter placed at different locations in the above two
structures? I think it should be after owner in both cases and accordingly its
order should be changed in GetSubscription() or any other place it is used.
Fixed.
2. A minor comment change suggestion:
/*
* Common spoolfile processing.
*
- * The commit/prepare time (finish_ts) for streamed transactions is required
- * for time-delayed logical replication.
+ * The commit/prepare time (finish_ts) is required for time-delayed logical
+ * replication.
*/
Fixed.
3. I find the newly added tests take about 8s on my machine, which is close
to the highest in the subscription folder. I understand that it can't be less
than 3s because of the delay, but checking multiple cases makes it take that
long. I think we can avoid the tests for streaming and for disabling the
subscription. Also, after removing those, I think it would be better to add
the remaining test in 001_rep_changes to save set-up and tear-down costs as well.
Sounds good to me. Moved the test to 001_rep_changes.pl.
4.
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');
I think this setting is also not required.
Yes. I removed it while moving the test.
Attached the v31 patch.
Note that, regarding the translator style, I chose to keep the parameter
names outside the errmsg format strings at this stage. If there is a need
to change it, I'll follow that.
The other changes are minor: 'if' conditions that exceeded 80 characters
are now folded to look nicer.
I also ran pgindent and pgperltidy.
Best Regards,
Takamichi Osumi
Attachments:
v31-0001-Time-delayed-logical-replication-subscriber.patch (application/octet-stream)
From c2ec743a5ea36845f387b09453801a1fc2df86da Mon Sep 17 00:00:00 2001
From: Takamichi Osumi <osumi.takamichi@fujitsu.com>
Date: Tue, 7 Feb 2023 13:05:34 +0000
Subject: [PATCH v31] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_apply_delay'.
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction apply for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
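As a rough sketch of that calculation (a hypothetical helper using plain millisecond integers instead of TimestampTz; this is an illustration, not the code in worker.c):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Remaining wait before applying a transaction, in milliseconds.
 * finish_ts_ms is the commit/prepare time taken from the publisher's WAL,
 * now_ms is the current time on the subscriber. Returns 0 when the
 * requested delay has already elapsed.
 */
static int64_t
remaining_delay_ms(int64_t finish_ts_ms, int64_t now_ms,
				   int32_t min_apply_delay_ms)
{
	int64_t		diffms;

	assert(min_apply_delay_ms >= 0);

	diffms = finish_ts_ms + min_apply_delay_ms - now_ms;
	return diffms > 0 ? diffms : 0;
}
```

Time already spent in decoding and transfer shortens the wait, and a subscriber clock that runs ahead of the publisher shortens it further.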
The delay occurs before we start to apply the transaction on the
subscriber. The main reason is to avoid keeping a transaction open for
a long time. Regular and prepared transactions are covered. Streamed
transactions are also covered.
The combination of parallel streaming mode and min_apply_delay is not
allowed. This is because in parallel streaming mode, we start applying
the transaction stream as soon as the first change arrives without
knowing the transaction's prepare/commit time. This means we cannot
calculate the underlying network/decoding lag between publisher and
subscriber, and so always waiting for the full 'min_apply_delay' period
might include unnecessary delay.
The other possibility was to apply the delay at the end of the parallel
apply transaction but that would cause issues related to resource
bloat and locks being held for a long time.
Note that this feature doesn't interact with skip transaction feature.
The skip transaction feature applies to one transaction with a specific LSN.
So, even if the skipped transaction and non-skipped transaction come
consecutively in a very short time, regardless of the order of which comes
first, the time-delayed feature gets balanced by delayed application
for other transactions before and after the skipped transaction.
Author: Euler Taveira, Takamichi Osumi, Kuroda Hayato
Reviewed-by: Amit Kapila, Peter Smith, Vignesh C, Shveta Malik,
Kyotaro Horiguchi, Shi Yu, Wang Wei, Dilip Kumar, Melih Mutlu
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/config.sgml | 12 ++
doc/src/sgml/glossary.sgml | 14 ++
doc/src/sgml/logical-replication.sgml | 6 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 49 ++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 122 +++++++++++-
.../replication/logical/applyparallelworker.c | 3 +-
src/backend/replication/logical/worker.c | 165 ++++++++++++++--
src/bin/pg_dump/pg_dump.c | 15 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/replication/worker_internal.h | 2 +-
src/test/regress/expected/subscription.out | 181 +++++++++++-------
src/test/regress/sql/subscription.sql | 24 +++
src/test/subscription/t/001_rep_changes.pl | 30 +++
20 files changed, 558 insertions(+), 104 deletions(-)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..5dc5ca1133 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7873,6 +7873,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum delay, in milliseconds, for applying changes
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d190be1925..626a8b5bd0 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4787,6 +4787,18 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
the <filename>postgresql.conf</filename> file or on the server
command line.
</para>
+ <para>
+ For time-delayed logical replication, the apply worker sends a feedback
+ message to the publisher every
+ <varname>wal_receiver_status_interval</varname> milliseconds. Make sure
+ to set <varname>wal_receiver_status_interval</varname> less than the
+ <varname>wal_sender_timeout</varname> on the publisher, otherwise, the
+ <literal>walsender</literal> will repeatedly terminate due to timeout
+ errors. Note that if <varname>wal_receiver_status_interval</varname> is
+ set to zero, the apply worker sends no feedback messages during the
+ <literal>min_apply_delay</literal> period. Refer to
+ <xref linkend="sql-createsubscription"/> for more information.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 7c01a541fe..6ed6fa5853 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1729,6 +1729,20 @@
</glossdef>
</glossentry>
+ <glossentry id="glossary-time-delayed-replication">
+ <glossterm>Time-delayed replication</glossterm>
+ <glossdef>
+ <para>
+ Replication setup that applies time-delayed copy of the data.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="guc-recovery-min-apply-delay"/> for physical replication
+ and <xref linkend="sql-createsubscription"/> for logical replication.
+ </para>
+ </glossdef>
+ </glossentry>
+
<glossentry id="glossary-toast">
<glossterm>TOAST</glossterm>
<glossdef>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 1bd5660c87..6bd5f61e2b 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,12 @@
target table.
</para>
+ <para>
+ A subscription can delay the application of changes by specifying the
+ <literal>min_apply_delay</literal> subscription parameter. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 964fcbb8ff..8b7eb28e54 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 51c45f17c7..1b4b8390af 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,49 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. This
+ parameter allows the user to delay the application of changes by a
+ given time period. If the value is specified without units, it is
+ taken as milliseconds. The default is zero (no delay). See
+ <xref linkend="config-setting-names-values"/> for details on the
+ available valid time units.
+ </para>
+ <para>
+ Any delay becomes effective only after all initial table
+ synchronization has finished and occurs before each transaction starts
+ to get applied on the subscriber. The delay is calculated as the
+ difference between the WAL timestamp as written on the publisher and
+ the current time on the subscriber. Any overhead of time spent in
+ logical decoding and in transferring the transaction may reduce the
+ actual wait time. It is also possible that the overhead already
+ exceeds the requested <literal>min_apply_delay</literal> value, in
+ which case no delay is applied. If the system clocks on publisher and
+ subscriber are not synchronized, this may lead to apply changes
+ earlier than expected, but this is not a major issue because this
+ parameter is typically much larger than the time deviations between
+ servers. Note that if this parameter is set to a long delay, the
+ replication will stop if the replication slot falls behind the current
+ LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication means there is a much longer time between
+ making a change on the publisher, and that change being committed
+ on the subscriber. This can impact the performance of synchronous
+ replication. See <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -420,6 +462,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_apply_delay</literal> parameter is not allowed when
+ streaming in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..e19e5cbca2 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -66,6 +66,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->skiplsn = subform->subskiplsn;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
+ sub->minapplydelay = subform->subminapplydelay;
sub->enabled = subform->subenabled;
sub->binary = subform->subbinary;
sub->stream = subform->substream;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 8608e3fa5b..317c2010cb 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1299,9 +1299,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 464db6d247..82e16fd0f9 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int32 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -100,7 +102,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int32 defGetMinApplyDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ opts->min_apply_delay = defGetMinApplyDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +417,32 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed. This is because in parallel streaming mode, we start applying
+ * the transaction stream as soon as the first change arrives without
+ * knowing the transaction's prepare/commit time. This means we cannot
+ * calculate the underlying network/decoding lag between publisher and
+ * subscriber, and so always waiting for the full 'min_apply_delay' period
+ * might include unnecessary delay.
+ *
+ * The other possibility was to apply the delay at the end of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0 &&
+ opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+
+ /*
+ * translator: the first %s is a string of the form "parameter > 0"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
}
/*
@@ -560,7 +599,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -625,6 +665,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int32GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1054,7 +1095,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1139,19 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
+ !IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY)
+ && sub->minapplydelay > 0)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set parallel streaming mode for subscription with %s",
+ "min_apply_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1165,26 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.min_apply_delay > 0 &&
+ !IsSet(opts.specified_opts, SUBOPT_STREAMING)
+ && sub->stream == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set %s for subscription in parallel streaming mode",
+ "min_apply_delay"));
+
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int32GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2195,3 +2269,45 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+/*
+ * Extract the min_apply_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_apply_delay
+ * accepts the same parameter format as recovery_min_apply_delay.
+ */
+static int32
+defGetMinApplyDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse the given string as a parameter whose unit is milliseconds.
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_apply_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check the lower bound of the valid min_apply_delay range, and check the
+ * upper bound as a safeguard for platforms where int is wider than int32.
+ * Although parse_int() has confirmed that the result is less than or
+ * equal to INT_MAX, the value will be stored in a catalog column of type
+ * int32.
+ */
+ if (result < 0 || result > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_apply_delay",
+ 0, PG_INT32_MAX)));
+
+ return result;
+}
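The parsing and range check in defGetMinApplyDelay() can be illustrated outside the server with a small standalone sketch. Everything here is an assumption made for illustration: parse_delay_ms() is a hypothetical stand-in for parse_int() with GUC_UNIT_MS, it accepts only whole numbers with the units ms, s, min, h and d, and it reports failure through a bool rather than ereport(). It is not the patch's code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for parse_int(..., GUC_UNIT_MS, ...): convert
 * strings such as "500", "3s", "2 min", "8h" or "1d" to milliseconds.
 * Returns false on a syntax error or when the value falls outside
 * 0 .. INT32_MAX, the range that fits the int32 catalog column.
 */
static bool
parse_delay_ms(const char *input, int32_t *result_ms)
{
	static const struct
	{
		const char *unit;
		int64_t		factor;		/* milliseconds per unit */
	}			units[] = {
		{"ms", 1}, {"s", 1000}, {"min", 60000},
		{"h", 3600000}, {"d", 86400000},
	};
	char	   *end;
	long long	value;
	int64_t		factor = 1;		/* a bare number is taken as milliseconds */

	assert(input != NULL && result_ms != NULL);

	errno = 0;
	value = strtoll(input, &end, 10);
	if (end == input || errno == ERANGE)
		return false;			/* no digits, or not representable */

	/* Allow spaces between the number and the unit, as in "2 min". */
	while (*end == ' ')
		end++;

	if (*end != '\0')
	{
		bool		found = false;

		for (size_t i = 0; i < sizeof(units) / sizeof(units[0]); i++)
		{
			if (strcmp(end, units[i].unit) == 0)
			{
				factor = units[i].factor;
				found = true;
				break;
			}
		}
		if (!found)
			return false;		/* unknown unit */
	}

	/* Check the range before multiplying so the product cannot overflow. */
	if (value < 0 || value > INT32_MAX / factor)
		return false;

	*result_ms = (int32_t) (value * factor);
	return true;
}
```

With these assumptions, "8h" yields 28800000 ms, while "25d" is rejected because it exceeds INT32_MAX milliseconds (roughly 24.8 days), which is the same upper bound the patch enforces with PG_INT32_MAX.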
diff --git a/src/backend/replication/logical/applyparallelworker.c b/src/backend/replication/logical/applyparallelworker.c
index da437e0bc3..32db20fd98 100644
--- a/src/backend/replication/logical/applyparallelworker.c
+++ b/src/backend/replication/logical/applyparallelworker.c
@@ -704,7 +704,8 @@ pa_process_spooled_messages_if_required(void)
{
apply_spooled_messages(&MyParallelShared->fileset,
MyParallelShared->xid,
- InvalidXLogRecPtr);
+ InvalidXLogRecPtr,
+ 0);
pa_set_fileset_state(MyParallelShared, FS_EMPTY);
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index cfb2ab6248..c574531040 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -319,6 +319,17 @@ static List *on_commit_wakeup_workers_subids = NIL;
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
+/*
+ * To avoid a walsender timeout in time-delayed logical replication, the
+ * apply worker keeps sending feedback messages during the delay period.
+ * Because the delay is applied before the transaction starts, no WAL
+ * records are written for the suspended changes during the wait. So when
+ * the apply worker sends feedback during the delay, it must not overwrite
+ * the flush and apply positions with the last received LSN. See
+ * send_feedback() for details.
+ */
+static XLogRecPtr last_received = InvalidXLogRecPtr;
+
/* fields valid only when processing streamed transaction */
static bool in_streamed_transaction = false;
@@ -389,7 +400,8 @@ static void stream_write_change(char action, StringInfo s);
static void stream_open_and_write_change(TransactionId xid, char action, StringInfo s);
static void stream_close_file(void);
-static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply);
+static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply,
+ bool has_unprocessed_change);
static void DisableSubscriptionAndExit(void);
@@ -999,6 +1011,109 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When min_apply_delay parameter is set on the subscriber, we wait long enough
+ * to make sure a transaction is applied at least that period behind the
+ * publisher.
+ *
+ * While physical replication applies the delay at commit time, this
+ * feature applies the delay before starting to apply the transaction.
+ * This is mainly because keeping a transaction that has performed write
+ * operations open for a long time causes problems such as bloat and held
+ * locks.
+ *
+ * The min_apply_delay parameter will take effect only after all tables are in
+ * READY state.
+ *
+ * xid is the transaction id where we apply the delay.
+ *
+ * finish_ts is the commit/prepare time of both regular (non-streamed) and
+ * streamed transactions. Unlike the regular (non-streamed) cases, the delay
+ * is applied in a STREAM COMMIT/STREAM PREPARE message for streamed
+ * transactions. The STREAM START message does not contain a commit/prepare
+ * time (it will be available when the in-progress transaction finishes).
+ * Hence, it's not appropriate to apply a delay at the STREAM START time.
+ */
+static void
+maybe_apply_delay(TransactionId xid, TimestampTz finish_ts)
+{
+ Assert(finish_ts > 0);
+
+ /* Nothing to do if no delay set */
+ if (!MySubscription->minapplydelay)
+ return;
+
+ /*
+ * The min_apply_delay parameter is ignored until all tablesync workers
+ * have reached READY state. This is because if we allowed the delay
+ * during the catchup phase, then once we reached the limit of tablesync
+ * workers it would impose a delay for each subsequent worker. That would
+ * cause initial table synchronization completion to take a long time.
+ */
+ if (!AllTablesyncsReady())
+ return;
+
+ /* Apply the delay using the latch mechanism */
+ while (true)
+ {
+ TimestampTz delayUntil;
+ long diffms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* This might change wal_receiver_status_interval */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /*
+ * Before calculating the time duration, reload the catalog if needed.
+ */
+ if (!in_remote_transaction && !in_streamed_transaction)
+ {
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+ }
+
+ delayUntil = TimestampTzPlusMilliseconds(finish_ts, MySubscription->minapplydelay);
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delayUntil);
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "time-delayed replication for txid %u, min_apply_delay = %d ms, remaining wait time: %ld ms",
+ xid, MySubscription->minapplydelay, diffms);
+
+ /*
+ * Call send_feedback() to prevent the publisher from timing out and
+ * exiting during the delay. This is only possible when
+ * wal_receiver_status_interval is enabled (greater than zero).
+ */
+ if (wal_receiver_status_interval > 0 &&
+ diffms > wal_receiver_status_interval * 1000L)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ wal_receiver_status_interval * 1000L,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(last_received, true, false, true);
+ }
+ else
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ }
+}
+
/*
* Handle BEGIN message.
*/
@@ -1013,6 +1128,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ maybe_apply_delay(begin_data.xid, begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -1070,6 +1188,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ maybe_apply_delay(begin_data.xid, begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1317,7 +1438,8 @@ apply_handle_stream_prepare(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset,
- prepare_data.xid, prepare_data.prepare_lsn);
+ prepare_data.xid, prepare_data.prepare_lsn,
+ prepare_data.prepare_time);
/* Mark the transaction as prepared. */
apply_handle_prepare_internal(&prepare_data);
@@ -2011,10 +2133,13 @@ ensure_last_message(FileSet *stream_fileset, TransactionId xid, int fileno,
/*
* Common spoolfile processing.
+ *
+ * The commit/prepare time (finish_ts) is required for time-delayed logical
+ * replication.
*/
void
apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn)
+ XLogRecPtr lsn, TimestampTz finish_ts)
{
StringInfoData s2;
int nchanges;
@@ -2025,6 +2150,10 @@ apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
int fileno;
off_t offset;
+ /* Should we delay the current transaction? */
+ if (finish_ts)
+ maybe_apply_delay(xid, finish_ts);
+
if (!am_parallel_apply_worker())
maybe_start_skipping_changes(lsn);
@@ -2174,7 +2303,7 @@ apply_handle_stream_commit(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset, xid,
- commit_data.commit_lsn);
+ commit_data.commit_lsn, commit_data.committime);
apply_handle_commit_internal(&commit_data);
@@ -3447,7 +3576,7 @@ UpdateWorkerStats(XLogRecPtr last_lsn, TimestampTz send_time, bool reply)
* Apply main loop.
*/
static void
-LogicalRepApplyLoop(XLogRecPtr last_received)
+LogicalRepApplyLoop(void)
{
TimestampTz last_recv_timestamp = GetCurrentTimestamp();
bool ping_sent = false;
@@ -3568,7 +3697,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
if (last_received < end_lsn)
last_received = end_lsn;
- send_feedback(last_received, reply_requested, false);
+ send_feedback(last_received, reply_requested, false, false);
UpdateWorkerStats(last_received, timestamp, true);
}
/* other message types are purposefully ignored */
@@ -3581,7 +3710,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
/* confirm all writes so far */
- send_feedback(last_received, false, false);
+ send_feedback(last_received, false, false, false);
if (!in_remote_transaction && !in_streamed_transaction)
{
@@ -3678,7 +3807,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
}
- send_feedback(last_received, requestReply, requestReply);
+ send_feedback(last_received, requestReply, requestReply, false);
/*
* Force reporting to ensure long idle periods don't lead to
@@ -3708,7 +3837,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
* to send a response to avoid timeouts.
*/
static void
-send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
+send_feedback(XLogRecPtr recvpos, bool force, bool requestReply, bool has_unprocessed_change)
{
static StringInfo reply_message = NULL;
static TimestampTz send_time = 0;
@@ -3738,8 +3867,14 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * If the subscription has unprocessed changes, do not tell the publisher
+ * that the latest received LSN has already been applied and flushed;
+ * otherwise the publisher would make a wrong assumption about logical
+ * replication progress. Instead, just send a feedback message to avoid a
+ * replication timeout during the delay.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && !has_unprocessed_change)
flushpos = writepos = recvpos;
if (writepos < last_writepos)
@@ -3776,8 +3911,9 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
pq_sendint64(reply_message, now); /* sendTime */
pq_sendbyte(reply_message, requestReply); /* replyRequested */
- elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X",
+ elog(DEBUG2, "sending feedback (force %d, has_unprocessed_change %d) to recv %X/%X, write %X/%X, flush %X/%X",
force,
+ has_unprocessed_change,
LSN_FORMAT_ARGS(recvpos),
LSN_FORMAT_ARGS(writepos),
LSN_FORMAT_ARGS(flushpos));
@@ -4367,11 +4503,11 @@ start_table_sync(XLogRecPtr *origin_startpos, char **myslotname)
* of system resource error and are not repeatable.
*/
static void
-start_apply(XLogRecPtr origin_startpos)
+start_apply(void)
{
PG_TRY();
{
- LogicalRepApplyLoop(origin_startpos);
+ LogicalRepApplyLoop();
}
PG_CATCH();
{
@@ -4661,7 +4797,8 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- start_apply(origin_startpos);
+ last_received = origin_startpos;
+ start_apply();
proc_exit(0);
}
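The two decisions the worker.c changes make during a delay (how long to sleep before waking up again, and which flush position to report in the keepalive feedback) can be sketched as pure functions. This is a simplified illustration under stated assumptions, not the patch's code: Lsn, next_wait_ms(), feedback_after_wait() and flush_lsn_to_report() are hypothetical names, and last_flushed stands in for whatever position the worker had already confirmed as flushed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t Lsn;			/* hypothetical stand-in for XLogRecPtr */

/*
 * Length of the next sleep in milliseconds.  remaining_ms is the time left
 * until the transaction may be applied; status_interval_s mirrors
 * wal_receiver_status_interval (seconds, 0 disables periodic feedback).
 * The sleep is capped at one status interval so that feedback can be sent
 * while the delay is still in progress.
 */
static long
next_wait_ms(long remaining_ms, int status_interval_s)
{
	long		interval_ms = status_interval_s * 1000L;

	assert(remaining_ms > 0 && status_interval_s >= 0);

	if (interval_ms > 0 && remaining_ms > interval_ms)
		return interval_ms;
	return remaining_ms;
}

/* Feedback is sent only after a sleep that was capped by the interval. */
static bool
feedback_after_wait(long remaining_ms, int status_interval_s)
{
	return status_interval_s > 0 &&
		remaining_ms > status_interval_s * 1000L;
}

/*
 * Flush position to report.  The received position may be reported as
 * flushed only when nothing is pending locally and no change is being
 * held back by the delay; otherwise keep the previously flushed position.
 */
static Lsn
flush_lsn_to_report(Lsn received, Lsn last_flushed,
					bool have_pending_txes, bool has_unprocessed_change)
{
	if (!have_pending_txes && !has_unprocessed_change)
		return received;
	return last_flushed;
}
```

For example, with 25 s of delay remaining and a 10 s status interval, the worker would sleep 10 s and then send feedback; with 4 s remaining it would sleep the full 4 s and send nothing. While a delayed change is outstanding, the reported flush position stays at the earlier value, so the publisher does not conclude that the delayed transaction has been applied.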
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..1e87f0124e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4546,9 +4547,13 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n"
+ " 0 AS subminapplydelay\n",
+ LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4581,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4612,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminapplydelay =
+ atoi(PQgetvalue(res, i, i_subminapplydelay));
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4687,6 +4695,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '%d ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e7cbd8d7ed..b8831c3ed3 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int subminapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..81d4607a1c 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min apply delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e1882eaea..e8b9a43a47 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..d1cfefc6d6 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -74,6 +74,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
+ int32 subminapplydelay; /* Replication apply delay (ms) */
+
bool subenabled; /* True if the subscription is enabled (the
* worker should be running) */
@@ -122,6 +124,7 @@ typedef struct Subscription
* skipped */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
+ int32 minapplydelay; /* Replication apply delay (ms) */
bool enabled; /* Indicates if the subscription is enabled */
bool binary; /* Indicates if the subscription wants data in
* binary format */
diff --git a/src/include/replication/worker_internal.h b/src/include/replication/worker_internal.h
index dc87a4edd1..3dc09d1a4c 100644
--- a/src/include/replication/worker_internal.h
+++ b/src/include/replication/worker_internal.h
@@ -255,7 +255,7 @@ extern void stream_stop_internal(TransactionId xid);
/* Common streaming function to apply all the spooled messages */
extern void apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn);
+ XLogRecPtr lsn, TimestampTz finish_ts);
extern void apply_dispatch(StringInfo s);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 3f99b14394..cf8e727ee9 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -143,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -163,10 +163,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -175,10 +175,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -210,10 +210,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -247,19 +247,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -271,27 +271,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -306,10 +306,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -324,10 +324,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -363,10 +363,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -375,10 +375,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,10 +388,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -404,20 +404,57 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid value for parameter "min_apply_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_apply_delay" (0 .. 2147483647)
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+ERROR: min_apply_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot set parallel streaming mode for subscription with min_apply_delay
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ERROR: cannot set min_apply_delay for subscription in parallel streaming mode
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7281f5fee2..7317b140f5 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -286,6 +286,30 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+\dRs+
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index 91aa068c95..f94819672b 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -515,6 +515,36 @@ $node_publisher->poll_query_until('postgres',
or die
"Timed out while waiting for apply to restart after renaming SUBSCRIPTION";
+# Test time-delayed logical replication
+#
+# If the subscription sets min_apply_delay parameter, the logical replication
+# worker will delay the transaction apply for min_apply_delay milliseconds. We
+# look the time duration between tuples are inserted on publisher and then
+# changes are replicated on subscriber.
+my $delay = 3;
+
+# Set min_apply_delay parameter to 3 seconds
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_renamed SET (min_apply_delay = '${delay}s')");
+
+# Make new content on publisher and check its presence in subscriber depending
+# on the delay applied above. Before doing the insertion, get the
+# current timestamp that will be used as a comparison base. Even on slow
+# machines, this allows to have a predictable behavior when comparing the
+# delay between data insertion moment on publisher and replay time on subscriber.
+my $publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_ins VALUES (generate_series(1101, 1120))");
+
+# The publisher waits for the replication to complete
+$node_publisher->wait_for_catchup('tap_sub_renamed');
+
+# This test is successful if and only if the LSN has been applied with at least
+# the configured apply delay.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for non-streaming transaction"
+);
+
# check all the cleanup
$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_renamed");
--
2.30.0
Hi,
On Tuesday, February 7, 2023 2:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Feb 7, 2023 at 10:13 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
At Mon, 6 Feb 2023 13:10:01 +0000, "Takamichi Osumi (Fujitsu)" <osumi.takamichi@fujitsu.com> wrote in
The attached patch v29 has included your changes.
catalogs.sgml
+ <para>
+ The minimum delay (ms) for applying changes.
+ </para></entry>
I think we don't use unit symbols that way. Namely I think we would
write it as "The minimum delay for applying changes in milliseconds"
Okay, if we prefer to use milliseconds, then how about: "The minimum delay, in
milliseconds, for applying changes"?
This looks good to me. Adopted.
alter_subscription.sgml
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
By the way, is there any rule for the order among the words?
Currently, it is in the order in which the corresponding features are added.
Yes. So, I keep it as it is.
They don't seem to be in alphabetical order, nor in the same order as on the
create_subscription page.
On the create_subscription page also, it appears to be in the order in which those
are added, with the difference that they are divided into two categories
(parameters that control what happens during subscription creation and
parameters that control the subscription's replication behavior after it has been
created).
Same as here. The current order should be fine.
(It seems to be in the order of the SUBOPT_* symbols, but I'm not sure it's
a good idea..)
subscriptioncmds.c
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
+ !IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0)
..
+ if (opts.min_apply_delay > 0 &&
+ !IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL)
Don't we wrap the lines?
worker.c
+ if (wal_receiver_status_interval > 0 &&
+ diffms > wal_receiver_status_interval * 1000L)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET |WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ wal_receiver_status_interval * 1000L,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(last_received, true, false, true);
+ }
+ else
+ WaitLatch(MyLatch,
+ WL_LATCH_SET |WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
send_feedback always handles the case where
wal_receiver_status_interval == 0.
It only handles that case when force is false, but here we are passing it as true. So, I am
not sure that what you said would be an improvement.
Agreed. So, I keep it as it is.
thus we can simply wait for
min(wal_receiver_status_interval, diffms) then call send_feedback()
unconditionally.
-start_apply(XLogRecPtr origin_startpos)
+start_apply(void)
-LogicalRepApplyLoop(XLogRecPtr last_received)
+LogicalRepApplyLoop(void)
Does this patch require this change?
I think this is because the scope of last_received has been changed so that it
can be passed to send_feedback() during the delay.
Yes, that's our intention.
Kindly have a look at the latest patch v31 shared in [1].
[1]: /messages/by-id/TYCPR01MB8373BA483A6D2C924C600968EDDB9@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
Hi, Horiguchi-san
Thanks for your review !
On Tuesday, February 7, 2023 1:43 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
At Mon, 6 Feb 2023 13:10:01 +0000, "Takamichi Osumi (Fujitsu)"
<osumi.takamichi@fujitsu.com> wrote in
subscriptioncmds.c
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
+ !IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0)
..
+ if (opts.min_apply_delay > 0 &&
+ !IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL)
Don't we wrap the lines?
Yes, those lines should have looked nicer.
Updated. Kindly have a look at the latest patch v31 in [1].
There are also some other changes in the patch.
[1]: /messages/by-id/TYCPR01MB8373BA483A6D2C924C600968EDDB9@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
Here are my review comments for v31-0001
======
doc/src/sgml/glossary.sgml
1.
+ <para>
+ Replication setup that applies time-delayed copy of the data.
+ </para>
That sentence seemed a bit strange to me.
SUGGESTION
Replication setup that delays the application of changes by a
specified minimum time-delay period.
======
src/backend/replication/logical/worker.c
2. maybe_apply_delay
+ if (wal_receiver_status_interval > 0 &&
+ diffms > wal_receiver_status_interval * 1000L)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ wal_receiver_status_interval * 1000L,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(last_received, true, false, true);
+ }
I felt that introducing another variable like:
long statusinterval_ms = wal_receiver_status_interval * 1000L;
would help here by doing 2 things:
1) The condition would be easier to read because the ms units would be the same
2) Won't need * 1000L repeated in two places.
Only, do take care to assign this variable in the right place in this
loop in case the configuration is changed.
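To make the suggestion in #2 concrete, here is a minimal standalone sketch of the wait-length decision with the units converted once. It is only an illustration: next_wait_ms() and its parameters are hypothetical names that do not appear in the patch, and the real code calls WaitLatch() and send_feedback() instead of returning values.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not part of the patch): decide how long the next
 * wait should last and whether a feedback message should follow it.
 *
 * diffms             remaining delay in milliseconds (must be > 0)
 * status_interval_s  value of wal_receiver_status_interval, in seconds
 * send_feedback_out  set to true when the wait is capped at the status
 *                    interval, so feedback should be sent afterwards
 */
long
next_wait_ms(long diffms, int status_interval_s, bool *send_feedback_out)
{
    /* Convert once so both sides of the comparison are in milliseconds. */
    long statusinterval_ms = status_interval_s * 1000L;

    assert(diffms > 0);
    assert(status_interval_s >= 0);

    if (statusinterval_ms > 0 && diffms > statusinterval_ms)
    {
        /* Wait one status interval, then report to the publisher. */
        *send_feedback_out = true;
        return statusinterval_ms;
    }

    /* Remaining delay fits in one wait (or feedback is disabled). */
    *send_feedback_out = false;
    return diffms;
}
```

With the conversion held in one variable, the condition compares milliseconds with milliseconds, and the multiplication by 1000L appears only once.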
======
src/test/subscription/t/001_rep_changes.pl
3.
+# Test time-delayed logical replication
+#
+# If the subscription sets min_apply_delay parameter, the logical replication
+# worker will delay the transaction apply for min_apply_delay milliseconds. We
+# look the time duration between tuples are inserted on publisher and then
+# changes are replicated on subscriber.
This comment and the other one appearing later in this test are both
explaining the same test strategy. I think both comments should be
combined into one big one up-front, like this:
SUGGESTION
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction apply for
min_apply_delay milliseconds. We verify this by looking at the time
difference between a) when tuples are inserted on the publisher, and
b) when those changes are replicated on the subscriber. Even on slow
machines, this strategy will give predictable behavior.
~~
4.
+my $delay = 3;
+
+# Set min_apply_delay parameter to 3 seconds
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_renamed SET (min_apply_delay = '${delay}s')");
IMO that "my $delay = 3;" assignment should be *after* the comment:
e.g.
+
+# Set min_apply_delay parameter to 3 seconds
+my $delay = 3;
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_renamed SET (min_apply_delay = '${delay}s')");
~~~
5.
+# Make new content on publisher and check its presence in subscriber depending
+# on the delay applied above. Before doing the insertion, get the
+# current timestamp that will be used as a comparison base. Even on slow
+# machines, this allows to have a predictable behavior when comparing the
+# delay between data insertion moment on publisher and replay time on
subscriber.
Most of this comment is now redundant because this was already
explained in the big comment up-front (see #3). Only one useful
sentence is left.
SUGGESTION
Before doing the insertion, get the current timestamp that will be
used as a comparison base.
------
Kind Regards,
Peter Smith.
Fujitsu Australia.
Dear Peter,
Thank you for reviewing! PSA new version.
======
doc/src/sgml/glossary.sgml
1.
+ <para>
+ Replication setup that applies time-delayed copy of the data.
+ </para>
That sentence seemed a bit strange to me.
SUGGESTION
Replication setup that delays the application of changes by a
specified minimum time-delay period.
Fixed.
======
src/backend/replication/logical/worker.c
2. maybe_apply_delay
+ if (wal_receiver_status_interval > 0 &&
+ diffms > wal_receiver_status_interval * 1000L)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ wal_receiver_status_interval * 1000L,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(last_received, true, false, true);
+ }
I felt that introducing another variable like:
long statusinterval_ms = wal_receiver_status_interval * 1000L;
would help here by doing 2 things:
1) The condition would be easier to read because the ms units would be the same
2) Won't need * 1000L repeated in two places.
Only, do take care to assign this variable in the right place in this
loop in case the configuration is changed.
Fixed. The calculation is done in two places: first at the entrance of the loop,
and second after SIGHUP is detected.
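As a rough illustration of why the interval has to be recalculated after a reload, the following standalone simulation walks through a delay in chunks and counts the feedback messages that would be sent. It is a sketch under stated assumptions, not the patch's code: count_feedback_messages() and all of its parameters are hypothetical, elapsed time is advanced arithmetically instead of calling WaitLatch(), and the configuration reload is modelled as a one-time change once a given amount of time has elapsed.

```c
#include <assert.h>

/*
 * Hypothetical simulation (not the patch's code): step through a delay of
 * delay_ms milliseconds and count how many feedback messages would be sent.
 * A configuration reload is simulated once elapsed time reaches
 * reload_at_ms, changing the status interval to new_interval_s seconds.
 */
int
count_feedback_messages(long delay_ms, int interval_s,
                        long reload_at_ms, int new_interval_s)
{
    long elapsed_ms = 0;
    int  feedback_count = 0;
    int  reloaded = 0;

    /* First calculation: on entering the loop. */
    long statusinterval_ms = interval_s * 1000L;

    assert(delay_ms >= 0);

    while (elapsed_ms < delay_ms)
    {
        long diffms = delay_ms - elapsed_ms;

        /* Stand-in for detecting SIGHUP and re-reading the configuration. */
        if (!reloaded && elapsed_ms >= reload_at_ms)
        {
            interval_s = new_interval_s;
            /* Second calculation: after the reload is detected. */
            statusinterval_ms = interval_s * 1000L;
            reloaded = 1;
        }

        if (statusinterval_ms > 0 && diffms > statusinterval_ms)
        {
            elapsed_ms += statusinterval_ms;    /* stand-in for the wait */
            feedback_count++;                   /* stand-in for feedback */
        }
        else
            elapsed_ms += diffms;               /* final (or only) wait */
    }

    return feedback_count;
}
```

For a 25-second delay with a 10-second interval, this model sends two feedback messages; if the interval is reloaded to 5 seconds after the first wait, it sends three. Without the second calculation the reloaded value would be ignored for the rest of the delay.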
======
src/test/subscription/t/001_rep_changes.pl
3.
+# Test time-delayed logical replication
+#
+# If the subscription sets min_apply_delay parameter, the logical replication
+# worker will delay the transaction apply for min_apply_delay milliseconds. We
+# look the time duration between tuples are inserted on publisher and then
+# changes are replicated on subscriber.
This comment and the other one appearing later in this test are both
explaining the same test strategy. I think both comments should be
combined into one big one up-front, like this:
SUGGESTION
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction apply for
min_apply_delay milliseconds. We verify this by looking at the time
difference between a) when tuples are inserted on the publisher, and
b) when those changes are replicated on the subscriber. Even on slow
machines, this strategy will give predictable behavior.
Changed.
4.
+my $delay = 3;
+
+# Set min_apply_delay parameter to 3 seconds
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_renamed SET (min_apply_delay = '${delay}s')");
IMO that "my $delay = 3;" assignment should be *after* the comment:
e.g.
+
+# Set min_apply_delay parameter to 3 seconds
+my $delay = 3;
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_renamed SET (min_apply_delay = '${delay}s')");
Right, changed.
5.
+# Make new content on publisher and check its presence in subscriber depending
+# on the delay applied above. Before doing the insertion, get the
+# current timestamp that will be used as a comparison base. Even on slow
+# machines, this allows to have a predictable behavior when comparing the
+# delay between data insertion moment on publisher and replay time on subscriber.
Most of this comment is now redundant because this was already
explained in the big comment up-front (see #3). Only one useful
sentence is left.
SUGGESTION
Before doing the insertion, get the current timestamp that will be
used as a comparison base.
Removed.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Attachments:
v32-0001-Time-delayed-logical-replication-subscriber.patch (application/octet-stream)
From 414d8f4bcb329e6a26d65b1b097acefe35e3a766 Mon Sep 17 00:00:00 2001
From: Takamichi Osumi <osumi.takamichi@fujitsu.com>
Date: Tue, 7 Feb 2023 13:05:34 +0000
Subject: [PATCH v32] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_apply_delay'.
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction apply for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs before we start to apply the transaction on the
subscriber. The main reason is to avoid keeping a transaction open for
a long time. Regular and prepared transactions are covered. Streamed
transactions are also covered.
The combination of parallel streaming mode and min_apply_delay is not
allowed. This is because in parallel streaming mode, we start applying
the transaction stream as soon as the first change arrives without
knowing the transaction's prepare/commit time. This means we cannot
calculate the underlying network/decoding lag between publisher and
subscriber, and so always waiting for the full 'min_apply_delay' period
might include unnecessary delay.
The other possibility was to apply the delay at the end of the parallel
apply transaction but that would cause issues related to resource
bloat and locks being held for a long time.
Note that this feature doesn't interact with skip transaction feature.
The skip transaction feature applies to one transaction with a specific LSN.
So, even if the skipped transaction and non-skipped transaction come
consecutively in a very short time, regardless of the order of which comes
first, the time-delayed feature gets balanced by delayed application
for other transactions before and after the skipped transaction.
Author: Euler Taveira, Takamichi Osumi, Kuroda Hayato
Reviewed-by: Amit Kapila, Peter Smith, Vignesh C, Shveta Malik,
Kyotaro Horiguchi, Shi Yu, Wang Wei, Dilip Kumar, Melih Mutlu
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/config.sgml | 12 ++
doc/src/sgml/glossary.sgml | 15 ++
doc/src/sgml/logical-replication.sgml | 6 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 49 ++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 122 +++++++++++-
.../replication/logical/applyparallelworker.c | 3 +-
src/backend/replication/logical/worker.c | 171 +++++++++++++++--
src/bin/pg_dump/pg_dump.c | 15 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/replication/worker_internal.h | 2 +-
src/test/regress/expected/subscription.out | 181 +++++++++++-------
src/test/regress/sql/subscription.sql | 24 +++
src/test/subscription/t/001_rep_changes.pl | 28 +++
20 files changed, 563 insertions(+), 104 deletions(-)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..5dc5ca1133 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7873,6 +7873,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum delay, in milliseconds, for applying changes
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d190be1925..626a8b5bd0 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4787,6 +4787,18 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
the <filename>postgresql.conf</filename> file or on the server
command line.
</para>
+ <para>
+ For time-delayed logical replication, the apply worker sends a feedback
+ message to the publisher every
+ <varname>wal_receiver_status_interval</varname> seconds. Make sure
+ to set <varname>wal_receiver_status_interval</varname> less than the
+ <varname>wal_sender_timeout</varname> on the publisher; otherwise, the
+ <literal>walsender</literal> will repeatedly terminate due to timeout
+ errors. Note that if <varname>wal_receiver_status_interval</varname> is
+ set to zero, the apply worker sends no feedback messages during the
+ <literal>min_apply_delay</literal> period. Refer to
+ <xref linkend="sql-createsubscription"/> for more information.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 7c01a541fe..9ede9d05f6 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1729,6 +1729,21 @@
</glossdef>
</glossentry>
+ <glossentry id="glossary-time-delayed-replication">
+ <glossterm>Time-delayed replication</glossterm>
+ <glossdef>
+ <para>
+ Replication setup that delays the application of changes by a specified
+ minimum time-delay period.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="guc-recovery-min-apply-delay"/> for physical replication
+ and <xref linkend="sql-createsubscription"/> for logical replication.
+ </para>
+ </glossdef>
+ </glossentry>
+
<glossentry id="glossary-toast">
<glossterm>TOAST</glossterm>
<glossdef>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 1bd5660c87..6bd5f61e2b 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,12 @@
target table.
</para>
+ <para>
+ A subscription can delay the application of changes by specifying the
+ <literal>min_apply_delay</literal> subscription parameter. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 964fcbb8ff..8b7eb28e54 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 51c45f17c7..1b4b8390af 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,49 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. This
+ parameter allows the user to delay the application of changes by a
+ given time period. If the value is specified without units, it is
+ taken as milliseconds. The default is zero (no delay). See
+ <xref linkend="config-setting-names-values"/> for details on the
+ available time units.
+ </para>
+ <para>
+ Any delay becomes effective only after all initial table
+ synchronization has finished and occurs before each transaction starts
+ to get applied on the subscriber. The delay is calculated as the
+ difference between the WAL timestamp as written on the publisher and
+ the current time on the subscriber. Any overhead of time spent in
+ logical decoding and in transferring the transaction may reduce the
+ actual wait time. It is also possible that the overhead already
+ exceeds the requested <literal>min_apply_delay</literal> value, in
+ which case no delay is applied. If the system clocks on publisher and
+ subscriber are not synchronized, changes may be applied
+ earlier than expected, but this is not a major issue because this
+ parameter is typically much larger than the time deviations between
+ servers. Note that if this parameter is set to a long delay, the
+ replication will stop if the replication slot falls behind the current
+ LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication means there is a much longer time between
+ making a change on the publisher and that change being committed
+ on the subscriber. This can impact the performance of synchronous
+ replication. See the <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -420,6 +462,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_apply_delay</literal> parameter is not allowed when
+ streaming in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..e19e5cbca2 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -66,6 +66,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->skiplsn = subform->subskiplsn;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
+ sub->minapplydelay = subform->subminapplydelay;
sub->enabled = subform->subenabled;
sub->binary = subform->subbinary;
sub->stream = subform->substream;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 8608e3fa5b..317c2010cb 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1299,9 +1299,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 464db6d247..82e16fd0f9 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int32 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -100,7 +102,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int32 defGetMinApplyDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ opts->min_apply_delay = defGetMinApplyDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +417,32 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed. This is because in parallel streaming mode, we start applying
+ * the transaction stream as soon as the first change arrives without
+ * knowing the transaction's prepare/commit time. This means we cannot
+ * calculate the underlying network/decoding lag between publisher and
+ * subscriber, and so always waiting for the full 'min_apply_delay' period
+ * might include unnecessary delay.
+ *
+ * The other possibility was to apply the delay at the end of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0 &&
+ opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+
+ /*
+ * translator: the first %s is a string of the form "parameter > 0"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
}
/*
@@ -560,7 +599,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -625,6 +665,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int32GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1054,7 +1095,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1139,19 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
+ !IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY)
+ && sub->minapplydelay > 0)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set parallel streaming mode for subscription with %s",
+ "min_apply_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1165,26 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.min_apply_delay > 0 &&
+ !IsSet(opts.specified_opts, SUBOPT_STREAMING)
+ && sub->stream == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set %s for subscription in parallel streaming mode",
+ "min_apply_delay"));
+
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int32GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2195,3 +2269,45 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+/*
+ * Extract the min_apply_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_apply_delay
+ * accepts the same parameter format as recovery_min_apply_delay.
+ */
+static int32
+defGetMinApplyDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse the given string as a parameter whose unit is milliseconds
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_apply_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check the lower boundary of the valid min_apply_delay range, and also
+ * the upper boundary as a safeguard for platforms where INT_MAX is
+ * larger than PG_INT32_MAX. Although parse_int() has confirmed that
+ * the result is less than or equal to INT_MAX, the value will be stored
+ * in a catalog column of int32.
+ */
+ if (result < 0 || result > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_apply_delay",
+ 0, PG_INT32_MAX)));
+
+ return result;
+}
diff --git a/src/backend/replication/logical/applyparallelworker.c b/src/backend/replication/logical/applyparallelworker.c
index da437e0bc3..32db20fd98 100644
--- a/src/backend/replication/logical/applyparallelworker.c
+++ b/src/backend/replication/logical/applyparallelworker.c
@@ -704,7 +704,8 @@ pa_process_spooled_messages_if_required(void)
{
apply_spooled_messages(&MyParallelShared->fileset,
MyParallelShared->xid,
- InvalidXLogRecPtr);
+ InvalidXLogRecPtr,
+ 0);
pa_set_fileset_state(MyParallelShared, FS_EMPTY);
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index cfb2ab6248..e52143b588 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -319,6 +319,17 @@ static List *on_commit_wakeup_workers_subids = NIL;
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
+/*
+ * To avoid walsender timeout for time-delayed logical replication, the
+ * apply worker keeps sending feedback messages during the delay period.
+ * Meanwhile, the feature delays the apply before the start of the
+ * transaction and thus we don't write WAL records for the suspended changes
+ * during the wait. When the apply worker sends a feedback message during the
+ * delay, we should not overwrite the flushed and applied LSN positions with
+ * the last received LSN. See send_feedback() for details.
+ */
+static XLogRecPtr last_received = InvalidXLogRecPtr;
+
/* fields valid only when processing streamed transaction */
static bool in_streamed_transaction = false;
@@ -389,7 +400,8 @@ static void stream_write_change(char action, StringInfo s);
static void stream_open_and_write_change(TransactionId xid, char action, StringInfo s);
static void stream_close_file(void);
-static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply);
+static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply,
+ bool has_unprocessed_change);
static void DisableSubscriptionAndExit(void);
@@ -999,6 +1011,115 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When min_apply_delay parameter is set on the subscriber, we wait long enough
+ * to make sure a transaction is applied at least that period behind the
+ * publisher.
+ *
+ * While physical replication applies the delay at commit time, this
+ * feature applies the delay to the next transaction, before starting that
+ * transaction. This is mainly because keeping a transaction that conducted
+ * write operations open for a long time results in some issues such as bloat
+ * and long-held locks.
+ *
+ * The min_apply_delay parameter will take effect only after all tables are in
+ * READY state.
+ *
+ * xid is the transaction id where we apply the delay.
+ *
+ * finish_ts is the commit/prepare time of both regular (non-streamed) and
+ * streamed transactions. Unlike the regular (non-streamed) cases, the delay
+ * is applied in a STREAM COMMIT/STREAM PREPARE message for streamed
+ * transactions. The STREAM START message does not contain a commit/prepare
+ * time (it will be available when the in-progress transaction finishes).
+ * Hence, it's not appropriate to apply a delay at the STREAM START time.
+ */
+static void
+maybe_apply_delay(TransactionId xid, TimestampTz finish_ts)
+{
+ long statusinterval_ms;
+
+ Assert(finish_ts > 0);
+
+ /* Nothing to do if no delay set */
+ if (!MySubscription->minapplydelay)
+ return;
+
+ /*
+ * The min_apply_delay parameter is ignored until all tablesync workers
+ * have reached READY state. This is because if we allowed the delay
+ * during the catchup phase, then once we reached the limit of tablesync
+ * workers it would impose a delay for each subsequent worker. That would
+ * cause initial table synchronization completion to take a long time.
+ */
+ if (!AllTablesyncsReady())
+ return;
+
+ /* Calculate the time interval between status reports */
+ statusinterval_ms = wal_receiver_status_interval * 1000L;
+
+ /* Apply the delay by the latch mechanism */
+ while (true)
+ {
+ TimestampTz delayUntil;
+ long diffms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* This might change wal_receiver_status_interval */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ /* Re-calculate the time interval between status reports */
+ statusinterval_ms = wal_receiver_status_interval * 1000L;
+ }
+
+ /*
+ * Before calculating the time duration, reload the catalog if needed.
+ */
+ if (!in_remote_transaction && !in_streamed_transaction)
+ {
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+ }
+
+ delayUntil = TimestampTzPlusMilliseconds(finish_ts, MySubscription->minapplydelay);
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delayUntil);
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "time-delayed replication for txid %u, min_apply_delay = %d ms, remaining wait time: %ld ms",
+ xid, MySubscription->minapplydelay, diffms);
+
+ /*
+ * Call send_feedback() to prevent the publisher from exiting due to a
+ * timeout during the delay, when statusinterval_ms is greater than
+ * zero.
+ */
+ if (statusinterval_ms > 0 && diffms > statusinterval_ms)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ statusinterval_ms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(last_received, true, false, true);
+ }
+ else
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ }
+}
+
/*
* Handle BEGIN message.
*/
@@ -1013,6 +1134,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ maybe_apply_delay(begin_data.xid, begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -1070,6 +1194,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ maybe_apply_delay(begin_data.xid, begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1317,7 +1444,8 @@ apply_handle_stream_prepare(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset,
- prepare_data.xid, prepare_data.prepare_lsn);
+ prepare_data.xid, prepare_data.prepare_lsn,
+ prepare_data.prepare_time);
/* Mark the transaction as prepared. */
apply_handle_prepare_internal(&prepare_data);
@@ -2011,10 +2139,13 @@ ensure_last_message(FileSet *stream_fileset, TransactionId xid, int fileno,
/*
* Common spoolfile processing.
+ *
+ * The commit/prepare time (finish_ts) is required for time-delayed logical
+ * replication.
*/
void
apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn)
+ XLogRecPtr lsn, TimestampTz finish_ts)
{
StringInfoData s2;
int nchanges;
@@ -2025,6 +2156,10 @@ apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
int fileno;
off_t offset;
+ /* Should we delay the current transaction? */
+ if (finish_ts)
+ maybe_apply_delay(xid, finish_ts);
+
if (!am_parallel_apply_worker())
maybe_start_skipping_changes(lsn);
@@ -2174,7 +2309,7 @@ apply_handle_stream_commit(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset, xid,
- commit_data.commit_lsn);
+ commit_data.commit_lsn, commit_data.committime);
apply_handle_commit_internal(&commit_data);
@@ -3447,7 +3582,7 @@ UpdateWorkerStats(XLogRecPtr last_lsn, TimestampTz send_time, bool reply)
* Apply main loop.
*/
static void
-LogicalRepApplyLoop(XLogRecPtr last_received)
+LogicalRepApplyLoop(void)
{
TimestampTz last_recv_timestamp = GetCurrentTimestamp();
bool ping_sent = false;
@@ -3568,7 +3703,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
if (last_received < end_lsn)
last_received = end_lsn;
- send_feedback(last_received, reply_requested, false);
+ send_feedback(last_received, reply_requested, false, false);
UpdateWorkerStats(last_received, timestamp, true);
}
/* other message types are purposefully ignored */
@@ -3581,7 +3716,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
/* confirm all writes so far */
- send_feedback(last_received, false, false);
+ send_feedback(last_received, false, false, false);
if (!in_remote_transaction && !in_streamed_transaction)
{
@@ -3678,7 +3813,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
}
- send_feedback(last_received, requestReply, requestReply);
+ send_feedback(last_received, requestReply, requestReply, false);
/*
* Force reporting to ensure long idle periods don't lead to
@@ -3708,7 +3843,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
* to send a response to avoid timeouts.
*/
static void
-send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
+send_feedback(XLogRecPtr recvpos, bool force, bool requestReply, bool has_unprocessed_change)
{
static StringInfo reply_message = NULL;
static TimestampTz send_time = 0;
@@ -3738,8 +3873,14 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * If the logical replication subscription has unprocessed changes then do
+ * not inform the publisher that the latest received LSN is already
+ * applied and flushed; otherwise, the publisher will make a wrong
+ * assumption about the logical replication progress. Instead, just send a
+ * feedback message to avoid a replication timeout during the delay.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && !has_unprocessed_change)
flushpos = writepos = recvpos;
if (writepos < last_writepos)
@@ -3776,8 +3917,9 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
pq_sendint64(reply_message, now); /* sendTime */
pq_sendbyte(reply_message, requestReply); /* replyRequested */
- elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X",
+ elog(DEBUG2, "sending feedback (force %d, has_unprocessed_change %d) to recv %X/%X, write %X/%X, flush %X/%X",
force,
+ has_unprocessed_change,
LSN_FORMAT_ARGS(recvpos),
LSN_FORMAT_ARGS(writepos),
LSN_FORMAT_ARGS(flushpos));
@@ -4367,11 +4509,11 @@ start_table_sync(XLogRecPtr *origin_startpos, char **myslotname)
* of system resource error and are not repeatable.
*/
static void
-start_apply(XLogRecPtr origin_startpos)
+start_apply(void)
{
PG_TRY();
{
- LogicalRepApplyLoop(origin_startpos);
+ LogicalRepApplyLoop();
}
PG_CATCH();
{
@@ -4661,7 +4803,8 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- start_apply(origin_startpos);
+ last_received = origin_startpos;
+ start_apply();
proc_exit(0);
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..1e87f0124e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4546,9 +4547,13 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n"
+ " 0 AS subminapplydelay\n",
+ LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4581,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4612,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminapplydelay =
+ atoi(PQgetvalue(res, i, i_subminapplydelay));
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4687,6 +4695,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '%d ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e7cbd8d7ed..b8831c3ed3 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int subminapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..81d4607a1c 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min apply delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e1882eaea..e8b9a43a47 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..d1cfefc6d6 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -74,6 +74,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
+ int32 subminapplydelay; /* Replication apply delay (ms) */
+
bool subenabled; /* True if the subscription is enabled (the
* worker should be running) */
@@ -122,6 +124,7 @@ typedef struct Subscription
* skipped */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
+ int32 minapplydelay; /* Replication apply delay (ms) */
bool enabled; /* Indicates if the subscription is enabled */
bool binary; /* Indicates if the subscription wants data in
* binary format */
diff --git a/src/include/replication/worker_internal.h b/src/include/replication/worker_internal.h
index dc87a4edd1..3dc09d1a4c 100644
--- a/src/include/replication/worker_internal.h
+++ b/src/include/replication/worker_internal.h
@@ -255,7 +255,7 @@ extern void stream_stop_internal(TransactionId xid);
/* Common streaming function to apply all the spooled messages */
extern void apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn);
+ XLogRecPtr lsn, TimestampTz finish_ts);
extern void apply_dispatch(StringInfo s);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 3f99b14394..cf8e727ee9 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -143,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -163,10 +163,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -175,10 +175,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -210,10 +210,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -247,19 +247,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -271,27 +271,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -306,10 +306,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -324,10 +324,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -363,10 +363,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -375,10 +375,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,10 +388,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -404,20 +404,57 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid value for parameter "min_apply_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_apply_delay" (0 .. 2147483647)
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+ERROR: min_apply_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot set parallel streaming mode for subscription with min_apply_delay
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ERROR: cannot set min_apply_delay for subscription in parallel streaming mode
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7281f5fee2..7317b140f5 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -286,6 +286,30 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+\dRs+
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index 91aa068c95..75fd77b891 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -515,6 +515,34 @@ $node_publisher->poll_query_until('postgres',
or die
"Timed out while waiting for apply to restart after renaming SUBSCRIPTION";
+# Test time-delayed logical replication
+#
+# If the subscription sets min_apply_delay parameter, the logical replication
+# worker will delay the transaction apply for min_apply_delay milliseconds. We
+# verify this by looking at the time difference between a) when tuples are
+# inserted on the publisher, and b) when those changes are replicated on the
+# subscriber. Even on slow machines, this strategy will give predictable behavior.
+
+# Set min_apply_delay parameter to 3 seconds
+my $delay = 3;
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_renamed SET (min_apply_delay = '${delay}s')");
+
+# Before doing the insertion, get the current timestamp that will be
+# used as a comparison base.
+my $publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_ins VALUES (generate_series(1101, 1120))");
+
+# The publisher waits for the replication to complete
+$node_publisher->wait_for_catchup('tap_sub_renamed');
+
+# This test is successful if and only if the LSN has been applied with at least
+# the configured apply delay.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for non-streaming transaction"
+);
+
# check all the cleanup
$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_renamed");
--
2.27.0
On Wed, Feb 8, 2023 at 8:03 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
...
======
src/backend/replication/logical/worker.c
2. maybe_apply_delay
+ if (wal_receiver_status_interval > 0 &&
+     diffms > wal_receiver_status_interval * 1000L)
+ {
+     WaitLatch(MyLatch,
+               WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+               wal_receiver_status_interval * 1000L,
+               WAIT_EVENT_RECOVERY_APPLY_DELAY);
+     send_feedback(last_received, true, false, true);
+ }

I felt that introducing another variable like:
long statusinterval_ms = wal_receiver_status_interval * 1000L;
would help here by doing 2 things:
1) The condition would be easier to read because the ms units would be the same
2) Won't need * 1000L repeated in two places.

Only, do take care to assign this variable in the right place in this
loop in case the configuration is changed.

Fixed. Calculations are done on two lines - the first one is at the entrance of the loop,
and the second one is after SIGHUP is detected.
TBH, I expected you would write this as just a *single* variable
assignment before the condition like below:
SUGGESTION (tweaked comment and put single assignment before condition)
/*
* Call send_feedback() to prevent the publisher from exiting by
* timeout during the delay, when the status interval is greater than
* zero.
*/
status_interval_ms = wal_receiver_status_interval * 1000L;
if (status_interval_ms > 0 && diffms > status_interval_ms)
{
...
~
I understand in theory, your code is more efficient, but in practice,
I think the overhead of a single variable assignment every loop
iteration (which is doing WaitLatch anyway) is of insignificant
concern, whereas having one assignment is simpler than having two IMO.
But, if you want to keep it the way you have then that is OK.
Otherwise, this patch v32 LGTM.
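For illustration only, the single-assignment shape can be modelled outside the server roughly as below. This is a standalone sketch, not code from the patch: plan_wait and WaitPlan are made-up names, and the global merely stands in for the real GUC so that a reload between loop iterations can be imitated by assigning to it.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the wal_receiver_status_interval GUC, in seconds (hypothetical). */
static int wal_receiver_status_interval = 10;

/* What one loop iteration decides to do (illustrative type, not in the patch). */
typedef struct WaitPlan
{
    long wait_ms;        /* how long to sleep in this iteration */
    bool send_feedback;  /* send a status message after the sleep? */
} WaitPlan;

/*
 * Decide the next sleep for a remaining delay of diffms milliseconds.
 * The interval is converted to ms once, right before the condition, so a
 * changed setting is picked up on the very next iteration.
 */
static WaitPlan
plan_wait(long diffms)
{
    WaitPlan plan;
    long status_interval_ms;

    assert(diffms > 0);

    status_interval_ms = wal_receiver_status_interval * 1000L;
    if (status_interval_ms > 0 && diffms > status_interval_ms)
    {
        /* Sleep one status interval, then report to the publisher. */
        plan.wait_ms = status_interval_ms;
        plan.send_feedback = true;
    }
    else
    {
        /* Remaining delay fits in one sleep; no feedback needed. */
        plan.wait_ms = diffms;
        plan.send_feedback = false;
    }
    return plan;
}
```

Calling plan_wait() once per iteration, after any reload has been processed, is enough for a changed interval (including zero) to take effect without a second assignment elsewhere in the loop.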
------
Kind Regards,
Peter Smith.
Fujitsu Australia.
At Wed, 8 Feb 2023 09:03:03 +0000, "Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com> wrote in
Thank you for reviewing! PSA new version.
+ if (statusinterval_ms > 0 && diffms > statusinterval_ms)
The next expected feedback time is measured from the last status
report. Thus, it seems to me this may suppress feedbacks from being
sent for an unexpectedly long time especially when min_apply_delay is
shorter than wal_r_s_interval.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Thu, Feb 9, 2023 at 12:17 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Wed, Feb 8, 2023 at 8:03 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
...
======
src/backend/replication/logical/worker.c
2. maybe_apply_delay
+ if (wal_receiver_status_interval > 0 &&
+ diffms > wal_receiver_status_interval * 1000L)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ wal_receiver_status_interval * 1000L,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(last_received, true, false, true);
+ }
I felt that introducing another variable like:
long statusinterval_ms = wal_receiver_status_interval * 1000L;
would help here by doing 2 things:
1) The condition would be easier to read because the ms units would be the same
2) Won't need * 1000L repeated in two places.
Only, do take care to assign this variable in the right place in this
loop in case the configuration is changed.
Fixed. Calculations are done on two lines - first one is the entrance of the loop,
and second one is the after SIGHUP is detected.
TBH, I expected you would write this as just a *single* variable
assignment before the condition like below:
SUGGESTION (tweaked comment and put single assignment before condition)
/*
* Call send_feedback() to prevent the publisher from exiting by
* timeout during the delay, when the status interval is greater than
* zero.
*/
status_interval_ms = wal_receiver_status_interval * 1000L;
if (status_interval_ms > 0 && diffms > status_interval_ms)
{
...
~
I understand in theory, your code is more efficient, but in practice,
I think the overhead of a single variable assignment every loop
iteration (which is doing WaitLatch anyway) is of insignificant
concern, whereas having one assignment is simpler than having two IMO.
Yeah, that sounds better to me as well.
--
With Regards,
Amit Kapila.
On Thu, Feb 9, 2023 at 10:45 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Wed, 8 Feb 2023 09:03:03 +0000, "Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com> wrote in
Thank you for reviewing! PSA new version.
+ if (statusinterval_ms > 0 && diffms > statusinterval_ms)
The next expected feedback time is measured from the last status
report. Thus, it seems to me this may suppress feedbacks from being
sent for an unexpectedly long time especially when min_apply_delay is
shorter than wal_r_s_interval.
I think the minimum time before we send any feedback during the wait
is wal_r_s_interval. Now, I think if there is no transaction for a
long time before we get a new transaction, there should be keep-alive
messages in between which would allow us to send feedback at regular
intervals (wal_receiver_status_interval). So, I think we should be
able to send feedback in less than 2 * wal_receiver_status_interval
unless wal_sender/receiver timeout is very large and there is a very
low volume of transactions. Now, we can try to send the feedback
before we start waiting or maybe after every
wal_receiver_status_interval / 2 but I think that will lead to more
spurious feedback messages than we get the benefit from them.
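To make the timing argument concrete, here is a small standalone model of the wait loop. It is only a sketch under simplifying assumptions, not the patch code: max_feedback_gap_ms is a made-up name, every sleep is assumed to run to its timeout, and a feedback message is assumed to go out right after the delay ends.

```c
#include <assert.h>

/*
 * Largest gap (ms) between two consecutive feedback messages around one
 * delayed transaction, in this simplified model.
 *
 * idle_before_ms:    time since the last feedback when the transaction arrives
 * delay_ms:          remaining apply delay for the transaction
 * status_interval_s: wal_receiver_status_interval in seconds (0 disables)
 */
static long
max_feedback_gap_ms(long idle_before_ms, long delay_ms, long status_interval_s)
{
    long status_interval_ms = status_interval_s * 1000L;
    long since_feedback = idle_before_ms;
    long remaining = delay_ms;
    long max_gap = 0;

    assert(idle_before_ms >= 0 && delay_ms >= 0 && status_interval_s >= 0);

    while (remaining > 0)
    {
        if (status_interval_ms > 0 && remaining > status_interval_ms)
        {
            /* Sleep one full interval, then send feedback. */
            since_feedback += status_interval_ms;
            remaining -= status_interval_ms;
            if (since_feedback > max_gap)
                max_gap = since_feedback;
            since_feedback = 0;
        }
        else
        {
            /* Final sleep: no feedback until the wait is over. */
            since_feedback += remaining;
            remaining = 0;
        }
    }

    /* Assumption: feedback is sent right after the delay ends. */
    if (since_feedback > max_gap)
        max_gap = since_feedback;
    return max_gap;
}
```

With a 10 s status interval, a transaction that arrives 9 s after the previous feedback and needs a 9 s delay gives a gap of 18000 ms, and the same arrival with a 25 s delay gives 19000 ms. Both stay under 2 * wal_receiver_status_interval as long as keep-alives hold the idle time before the transaction below one interval.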
--
With Regards,
Amit Kapila.
Hi,
On Thursday, February 9, 2023 4:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Feb 9, 2023 at 12:17 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Wed, Feb 8, 2023 at 8:03 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
...
======
src/backend/replication/logical/worker.c
2. maybe_apply_delay
+ if (wal_receiver_status_interval > 0 &&
+ diffms > wal_receiver_status_interval * 1000L)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ wal_receiver_status_interval * 1000L,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(last_received, true, false, true);
+ }
I felt that introducing another variable like:
long statusinterval_ms = wal_receiver_status_interval * 1000L;
would help here by doing 2 things:
1) The condition would be easier to read because the ms units
would be the same
2) Won't need * 1000L repeated in two places.
Only, do take care to assign this variable in the right place in
this loop in case the configuration is changed.
Fixed. Calculations are done on two lines - first one is the
entrance of the loop, and second one is the after SIGHUP is detected.
TBH, I expected you would write this as just a *single* variable
assignment before the condition like below:
SUGGESTION (tweaked comment and put single assignment before
condition)
/*
* Call send_feedback() to prevent the publisher from exiting by
* timeout during the delay, when the status interval is greater than
* zero.
*/
status_interval_ms = wal_receiver_status_interval * 1000L;
if (status_interval_ms > 0 && diffms > status_interval_ms)
{
...
~
I understand in theory, your code is more efficient, but in practice,
I think the overhead of a single variable assignment every loop
iteration (which is doing WaitLatch anyway) is of insignificant
concern, whereas having one assignment is simpler than having two IMO.
Yeah, that sounds better to me as well.
OK, fixed.
The comment adjustment suggested by Peter-san above
was also included in this v33.
Please have a look at the attached patch.
Best Regards,
Takamichi Osumi
Attachments:
v33-0001-Time-delayed-logical-replication-subscriber.patch
From cbb290d9c9cbb03eb95d28aa5449b7bae513328b Mon Sep 17 00:00:00 2001
From: Takamichi Osumi <osumi.takamichi@fujitsu.com>
Date: Thu, 9 Feb 2023 09:05:23 +0000
Subject: [PATCH v33] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_apply_delay'.
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction apply for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs before we start to apply the transaction on the
subscriber. The main reason is to avoid keeping a transaction open for
a long time. Regular and prepared transactions are covered. Streamed
transactions are also covered.
The combination of parallel streaming mode and min_apply_delay is not
allowed. This is because in parallel streaming mode, we start applying
the transaction stream as soon as the first change arrives without
knowing the transaction's prepare/commit time. This means we cannot
calculate the underlying network/decoding lag between publisher and
subscriber, and so always waiting for the full 'min_apply_delay' period
might include unnecessary delay.
The other possibility was to apply the delay at the end of the parallel
apply transaction but that would cause issues related to resource
bloat and locks being held for a long time.
Note that this feature doesn't interact with skip transaction feature.
The skip transaction feature applies to one transaction with a specific LSN.
So, even if the skipped transaction and non-skipped transaction come
consecutively in a very short time, regardless of the order of which comes
first, the time-delayed feature gets balanced by delayed application
for other transactions before and after the skipped transaction.
Author: Euler Taveira, Takamichi Osumi, Kuroda Hayato
Reviewed-by: Amit Kapila, Peter Smith, Vignesh C, Shveta Malik,
Kyotaro Horiguchi, Shi Yu, Wang Wei, Dilip Kumar, Melih Mutlu
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/config.sgml | 12 ++
doc/src/sgml/glossary.sgml | 15 ++
doc/src/sgml/logical-replication.sgml | 6 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 49 ++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 122 +++++++++++-
.../replication/logical/applyparallelworker.c | 3 +-
src/backend/replication/logical/worker.c | 166 ++++++++++++++--
src/bin/pg_dump/pg_dump.c | 15 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/replication/worker_internal.h | 2 +-
src/test/regress/expected/subscription.out | 181 +++++++++++-------
src/test/regress/sql/subscription.sql | 24 +++
src/test/subscription/t/001_rep_changes.pl | 28 +++
20 files changed, 558 insertions(+), 104 deletions(-)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..5dc5ca1133 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7873,6 +7873,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum delay, in milliseconds, for applying changes
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d190be1925..626a8b5bd0 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4787,6 +4787,18 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
the <filename>postgresql.conf</filename> file or on the server
command line.
</para>
+ <para>
+ For time-delayed logical replication, the apply worker sends a feedback
+ message to the publisher every
+ <varname>wal_receiver_status_interval</varname> seconds. Make sure
+ to set <varname>wal_receiver_status_interval</varname> less than the
+ <varname>wal_sender_timeout</varname> on the publisher, otherwise, the
+ <literal>walsender</literal> will repeatedly terminate due to timeout
+ errors. Note that if <varname>wal_receiver_status_interval</varname> is
+ set to zero, the apply worker sends no feedback messages during the
+ <literal>min_apply_delay</literal> period. Refer to
+ <xref linkend="sql-createsubscription"/> for more information.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 7c01a541fe..9ede9d05f6 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1729,6 +1729,21 @@
</glossdef>
</glossentry>
+ <glossentry id="glossary-time-delayed-replication">
+ <glossterm>Time-delayed replication</glossterm>
+ <glossdef>
+ <para>
+ Replication setup that delays the application of changes by a specified
+ minimum time-delay period.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="guc-recovery-min-apply-delay"/> for physical replication
+ and <xref linkend="sql-createsubscription"/> for logical replication.
+ </para>
+ </glossdef>
+ </glossentry>
+
<glossentry id="glossary-toast">
<glossterm>TOAST</glossterm>
<glossdef>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 1bd5660c87..6bd5f61e2b 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,12 @@
target table.
</para>
+ <para>
+ A subscription can delay the application of changes by specifying the
+ <literal>min_apply_delay</literal> subscription parameter. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 964fcbb8ff..8b7eb28e54 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 51c45f17c7..1b4b8390af 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,49 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. This
+ parameter allows the user to delay the application of changes by a
+ given time period. If the value is specified without units, it is
+ taken as milliseconds. The default is zero (no delay). See
+ <xref linkend="config-setting-names-values"/> for details on the
+ available valid time units.
+ </para>
+ <para>
+ Any delay becomes effective only after all initial table
+ synchronization has finished and occurs before each transaction starts
+ to get applied on the subscriber. The delay is calculated as the
+ difference between the WAL timestamp as written on the publisher and
+ the current time on the subscriber. Any overhead of time spent in
+ logical decoding and in transferring the transaction may reduce the
+ actual wait time. It is also possible that the overhead already
+ exceeds the requested <literal>min_apply_delay</literal> value, in
+ which case no delay is applied. If the system clocks on publisher and
+ subscriber are not synchronized, this may lead to applying changes
+ earlier than expected, but this is not a major issue because this
+ parameter is typically much larger than the time deviations between
+ servers. Note that if this parameter is set to a long delay, the
+ replication will stop if the replication slot falls behind the current
+ LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication means there is a much longer time between
+ making a change on the publisher, and that change being committed
+ on the subscriber. This can impact the performance of synchronous
+ replication. See <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -420,6 +462,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_apply_delay</literal> parameter is not allowed when
+ streaming in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..e19e5cbca2 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -66,6 +66,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->skiplsn = subform->subskiplsn;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
+ sub->minapplydelay = subform->subminapplydelay;
sub->enabled = subform->subenabled;
sub->binary = subform->subbinary;
sub->stream = subform->substream;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 8608e3fa5b..317c2010cb 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1299,9 +1299,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 464db6d247..82e16fd0f9 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int32 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -100,7 +102,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int32 defGetMinApplyDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ opts->min_apply_delay = defGetMinApplyDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +417,32 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed. This is because in parallel streaming mode, we start applying
+ * the transaction stream as soon as the first change arrives without
+ * knowing the transaction's prepare/commit time. This means we cannot
+ * calculate the underlying network/decoding lag between publisher and
+ * subscriber, and so always waiting for the full 'min_apply_delay' period
+ * might include unnecessary delay.
+ *
+ * The other possibility was to apply the delay at the end of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0 &&
+ opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+
+ /*
+ * translator: the first %s is a string of the form "parameter > 0"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
}
/*
@@ -560,7 +599,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -625,6 +665,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int32GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1054,7 +1095,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1139,19 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
+ !IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY)
+ && sub->minapplydelay > 0)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set parallel streaming mode for subscription with %s",
+ "min_apply_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1165,26 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.min_apply_delay > 0 &&
+ !IsSet(opts.specified_opts, SUBOPT_STREAMING)
+ && sub->stream == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set %s for subscription in parallel streaming mode",
+ "min_apply_delay"));
+
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int32GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2195,3 +2269,45 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+/*
+ * Extract the min_apply_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_apply_delay
+ * accepts the same parameter format as recovery_min_apply_delay.
+ */
+static int32
+defGetMinApplyDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse given string as parameter which has millisecond unit
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_apply_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check both the lower boundary for the valid min_apply_delay range and
+ * the upper boundary as the safeguard for some platforms where INT_MAX is
+ * wider than int32 respectively. Although parse_int() has confirmed that
+ * the result is less than or equal to INT_MAX, the value will be stored
+ * in a catalog column of int32.
+ */
+ if (result < 0 || result > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_apply_delay",
+ 0, PG_INT32_MAX)));
+
+ return result;
+}
diff --git a/src/backend/replication/logical/applyparallelworker.c b/src/backend/replication/logical/applyparallelworker.c
index da437e0bc3..32db20fd98 100644
--- a/src/backend/replication/logical/applyparallelworker.c
+++ b/src/backend/replication/logical/applyparallelworker.c
@@ -704,7 +704,8 @@ pa_process_spooled_messages_if_required(void)
{
apply_spooled_messages(&MyParallelShared->fileset,
MyParallelShared->xid,
- InvalidXLogRecPtr);
+ InvalidXLogRecPtr,
+ 0);
pa_set_fileset_state(MyParallelShared, FS_EMPTY);
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index cfb2ab6248..19b0574ad0 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -319,6 +319,17 @@ static List *on_commit_wakeup_workers_subids = NIL;
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
+/*
+ * In order to avoid walsender timeout for time-delayed logical replication the
+ * apply worker keeps sending feedback messages during the delay period.
+ * Meanwhile, the feature delays the apply before the start of the
+ * transaction and thus we don't write WAL records for the suspended changes
+ * during the wait. When the apply worker sends a feedback message during the
+ * delay, we should not overwrite positions of the flushed and apply LSN by the
+ * last received latest LSN. See send_feedback() for details.
+ */
+static XLogRecPtr last_received = InvalidXLogRecPtr;
+
/* fields valid only when processing streamed transaction */
static bool in_streamed_transaction = false;
@@ -389,7 +400,8 @@ static void stream_write_change(char action, StringInfo s);
static void stream_open_and_write_change(TransactionId xid, char action, StringInfo s);
static void stream_close_file(void);
-static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply);
+static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply,
+ bool has_unprocessed_change);
static void DisableSubscriptionAndExit(void);
@@ -999,6 +1011,110 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When min_apply_delay parameter is set on the subscriber, we wait long enough
+ * to make sure a transaction is applied at least that period behind the
+ * publisher.
+ *
+ * While the physical replication applies the delay at commit time, this
+ * feature applies the delay for the next transaction but before starting the
+ * transaction. This is mainly because keeping a transaction that conducted
+ * write operations open for a long time results in some issues such as bloat
+ * and locks.
+ *
+ * The min_apply_delay parameter will take effect only after all tables are in
+ * READY state.
+ *
+ * xid is the transaction id where we apply the delay.
+ *
+ * finish_ts is the commit/prepare time of both regular (non-streamed) and
+ * streamed transactions. Unlike the regular (non-streamed) cases, the delay
+ * is applied in a STREAM COMMIT/STREAM PREPARE message for streamed
+ * transactions. The STREAM START message does not contain a commit/prepare
+ * time (it will be available when the in-progress transaction finishes).
+ * Hence, it's not appropriate to apply a delay at the STREAM START time.
+ */
+static void
+maybe_apply_delay(TransactionId xid, TimestampTz finish_ts)
+{
+ Assert(finish_ts > 0);
+
+ /* Nothing to do if no delay set */
+ if (!MySubscription->minapplydelay)
+ return;
+
+ /*
+ * The min_apply_delay parameter is ignored until all tablesync workers
+ * have reached READY state. This is because if we allowed the delay
+ * during the catchup phase, then once we reached the limit of tablesync
+ * workers it would impose a delay for each subsequent worker. That would
+ * cause initial table synchronization completion to take a long time.
+ */
+ if (!AllTablesyncsReady())
+ return;
+
+ /* Apply the delay by the latch mechanism */
+ while (true)
+ {
+ TimestampTz delayUntil;
+ long diffms;
+ long status_interval_ms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* This might change wal_receiver_status_interval */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /*
+ * Before calculating the time duration, reload the catalog if needed.
+ */
+ if (!in_remote_transaction && !in_streamed_transaction)
+ {
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+ }
+
+ delayUntil = TimestampTzPlusMilliseconds(finish_ts, MySubscription->minapplydelay);
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delayUntil);
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "time-delayed replication for txid %u, min_apply_delay = %d ms, remaining wait time: %ld ms",
+ xid, MySubscription->minapplydelay, diffms);
+
+ /*
+ * Call send_feedback() to prevent the publisher from exiting by
+ * timeout during the delay, when the status interval is greater than
+ * zero.
+ */
+ status_interval_ms = wal_receiver_status_interval * 1000L;
+ if (status_interval_ms > 0 && diffms > status_interval_ms)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ status_interval_ms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ send_feedback(last_received, true, false, true);
+ }
+ else
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+ }
+}
+
/*
* Handle BEGIN message.
*/
@@ -1013,6 +1129,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ maybe_apply_delay(begin_data.xid, begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -1070,6 +1189,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ maybe_apply_delay(begin_data.xid, begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1317,7 +1439,8 @@ apply_handle_stream_prepare(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset,
- prepare_data.xid, prepare_data.prepare_lsn);
+ prepare_data.xid, prepare_data.prepare_lsn,
+ prepare_data.prepare_time);
/* Mark the transaction as prepared. */
apply_handle_prepare_internal(&prepare_data);
@@ -2011,10 +2134,13 @@ ensure_last_message(FileSet *stream_fileset, TransactionId xid, int fileno,
/*
* Common spoolfile processing.
+ *
+ * The commit/prepare time (finish_ts) is required for time-delayed logical
+ * replication.
*/
void
apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn)
+ XLogRecPtr lsn, TimestampTz finish_ts)
{
StringInfoData s2;
int nchanges;
@@ -2025,6 +2151,10 @@ apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
int fileno;
off_t offset;
+ /* Should we delay the current transaction? */
+ if (finish_ts)
+ maybe_apply_delay(xid, finish_ts);
+
if (!am_parallel_apply_worker())
maybe_start_skipping_changes(lsn);
@@ -2174,7 +2304,7 @@ apply_handle_stream_commit(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset, xid,
- commit_data.commit_lsn);
+ commit_data.commit_lsn, commit_data.committime);
apply_handle_commit_internal(&commit_data);
@@ -3447,7 +3577,7 @@ UpdateWorkerStats(XLogRecPtr last_lsn, TimestampTz send_time, bool reply)
* Apply main loop.
*/
static void
-LogicalRepApplyLoop(XLogRecPtr last_received)
+LogicalRepApplyLoop(void)
{
TimestampTz last_recv_timestamp = GetCurrentTimestamp();
bool ping_sent = false;
@@ -3568,7 +3698,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
if (last_received < end_lsn)
last_received = end_lsn;
- send_feedback(last_received, reply_requested, false);
+ send_feedback(last_received, reply_requested, false, false);
UpdateWorkerStats(last_received, timestamp, true);
}
/* other message types are purposefully ignored */
@@ -3581,7 +3711,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
/* confirm all writes so far */
- send_feedback(last_received, false, false);
+ send_feedback(last_received, false, false, false);
if (!in_remote_transaction && !in_streamed_transaction)
{
@@ -3678,7 +3808,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
}
- send_feedback(last_received, requestReply, requestReply);
+ send_feedback(last_received, requestReply, requestReply, false);
/*
* Force reporting to ensure long idle periods don't lead to
@@ -3708,7 +3838,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
* to send a response to avoid timeouts.
*/
static void
-send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
+send_feedback(XLogRecPtr recvpos, bool force, bool requestReply, bool has_unprocessed_change)
{
static StringInfo reply_message = NULL;
static TimestampTz send_time = 0;
@@ -3738,8 +3868,14 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * If the subscription still has unprocessed changes, do not report the
+ * latest received LSN as written and flushed; otherwise the publisher
+ * would wrongly conclude that those changes have already been applied.
+ * In that case the feedback message is sent only to avoid a replication
+ * timeout during the delay.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && !has_unprocessed_change)
flushpos = writepos = recvpos;
if (writepos < last_writepos)
@@ -3776,8 +3912,9 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
pq_sendint64(reply_message, now); /* sendTime */
pq_sendbyte(reply_message, requestReply); /* replyRequested */
- elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X",
+ elog(DEBUG2, "sending feedback (force %d, has_unprocessed_change %d) to recv %X/%X, write %X/%X, flush %X/%X",
force,
+ has_unprocessed_change,
LSN_FORMAT_ARGS(recvpos),
LSN_FORMAT_ARGS(writepos),
LSN_FORMAT_ARGS(flushpos));
@@ -4367,11 +4504,11 @@ start_table_sync(XLogRecPtr *origin_startpos, char **myslotname)
* of system resource error and are not repeatable.
*/
static void
-start_apply(XLogRecPtr origin_startpos)
+start_apply(void)
{
PG_TRY();
{
- LogicalRepApplyLoop(origin_startpos);
+ LogicalRepApplyLoop();
}
PG_CATCH();
{
@@ -4661,7 +4798,8 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- start_apply(origin_startpos);
+ last_received = origin_startpos;
+ start_apply();
proc_exit(0);
}
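Review note on the worker.c changes above: the two decisions this patch adds (how long to sleep per wait, and which flush position to report while a transaction is held back) can be restated outside the backend. This is only a minimal sketch under assumptions, not the patch's code. The helper names delay_wait_chunk_ms and feedback_flush_pos are hypothetical, plain integers stand in for XLogRecPtr, and the fallback flush position is simplified to "the last reported value".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Length of the next sleep while delaying a transaction (hypothetical helper).
 * With a positive status interval shorter than the remaining delay, sleep for
 * one interval only, so that feedback can be sent before sleeping again.
 * Otherwise sleep for the whole remaining delay.
 */
long
delay_wait_chunk_ms(long remaining_ms, long status_interval_ms)
{
    /* The caller stops waiting once no delay remains. */
    assert(remaining_ms > 0);

    if (status_interval_ms > 0 && remaining_ms > status_interval_ms)
        return status_interval_ms;
    return remaining_ms;
}

/*
 * Flush position to report in a feedback message (hypothetical helper).
 * The received position is reported as flushed only when nothing is pending
 * and no change is being held back by the delay. Simplification: in every
 * other case this sketch just keeps the previously reported position.
 */
uint64_t
feedback_flush_pos(uint64_t recvpos, uint64_t last_flushpos,
                   bool have_pending_txes, bool has_unprocessed_change)
{
    if (!have_pending_txes && !has_unprocessed_change)
        return recvpos;
    return last_flushpos;
}
```

With an 8 second remaining delay and a 1 second status interval, the first helper yields 1 second sleeps with a feedback message after each one; with the status interval set to zero it yields a single 8 second sleep.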
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..1e87f0124e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4546,9 +4547,13 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n"
+ " 0 AS subminapplydelay\n",
+ LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4581,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4612,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminapplydelay =
+ atoi(PQgetvalue(res, i, i_subminapplydelay));
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4687,6 +4695,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '%d ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
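Review note on the pg_dump.c changes above: dumpSubscription() emits the option only for a non-zero delay and always in milliseconds, so a subscription created with an 8 hour delay would be dumped as '28800000 ms'. The following standalone sketch mirrors that rule. The function name format_min_apply_delay_clause and the use of a caller-supplied buffer instead of a PQExpBuffer are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write the optional min_apply_delay clause into buf (hypothetical helper).
 * Zero is the default delay, so no clause is written for it and buf is left
 * as an empty string. Returns snprintf()'s result, which equals the clause
 * length whenever the buffer is large enough, or 0 when nothing is written.
 */
int
format_min_apply_delay_clause(char *buf, size_t buflen, int delay_ms)
{
    assert(buf != NULL && buflen > 0);

    buf[0] = '\0';
    if (delay_ms <= 0)
        return 0;
    return snprintf(buf, buflen, ", min_apply_delay = '%d ms'", delay_ms);
}
```

Always writing the unit explicitly keeps the dumped statement unambiguous on restore, at the cost of not reproducing the unit the user originally typed.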
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e7cbd8d7ed..b8831c3ed3 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int subminapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..81d4607a1c 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min apply delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e1882eaea..e8b9a43a47 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..d1cfefc6d6 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -74,6 +74,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
+ int32 subminapplydelay; /* Replication apply delay (ms) */
+
bool subenabled; /* True if the subscription is enabled (the
* worker should be running) */
@@ -122,6 +124,7 @@ typedef struct Subscription
* skipped */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
+ int32 minapplydelay; /* Replication apply delay (ms) */
bool enabled; /* Indicates if the subscription is enabled */
bool binary; /* Indicates if the subscription wants data in
* binary format */
diff --git a/src/include/replication/worker_internal.h b/src/include/replication/worker_internal.h
index dc87a4edd1..3dc09d1a4c 100644
--- a/src/include/replication/worker_internal.h
+++ b/src/include/replication/worker_internal.h
@@ -255,7 +255,7 @@ extern void stream_stop_internal(TransactionId xid);
/* Common streaming function to apply all the spooled messages */
extern void apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn);
+ XLogRecPtr lsn, TimestampTz finish_ts);
extern void apply_dispatch(StringInfo s);
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 3f99b14394..cf8e727ee9 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -143,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -163,10 +163,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -175,10 +175,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -210,10 +210,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -247,19 +247,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -271,27 +271,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -306,10 +306,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -324,10 +324,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -363,10 +363,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -375,10 +375,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,10 +388,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -404,20 +404,57 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid value for parameter "min_apply_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_apply_delay" (0 .. 2147483647)
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+ERROR: min_apply_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot set parallel streaming mode for subscription with min_apply_delay
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ERROR: cannot set min_apply_delay for subscription in parallel streaming mode
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7281f5fee2..7317b140f5 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -286,6 +286,30 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+\dRs+
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index 91aa068c95..75fd77b891 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -515,6 +515,34 @@ $node_publisher->poll_query_until('postgres',
or die
"Timed out while waiting for apply to restart after renaming SUBSCRIPTION";
+# Test time-delayed logical replication
+#
+# If the subscription sets min_apply_delay parameter, the logical replication
+# worker will delay the transaction apply for min_apply_delay milliseconds. We
+# verify this by looking at the time difference between a) when tuples are
+# inserted on the publisher, and b) when those changes are replicated on the
+# subscriber. Even on slow machines, this strategy will give predictable behavior.
+
+# Set min_apply_delay parameter to 3 seconds
+my $delay = 3;
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_renamed SET (min_apply_delay = '${delay}s')");
+
+# Before doing the insertion, get the current timestamp that will be
+# used as a comparison base.
+my $publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_ins VALUES (generate_series(1101, 1120))");
+
+# The publisher waits for the replication to complete
+$node_publisher->wait_for_catchup('tap_sub_renamed');
+
+# This test is successful if and only if the LSN has been applied with at least
+# the configured apply delay.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for non-streaming transaction"
+);
+
# check all the cleanup
$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_renamed");
--
2.30.0
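For readers following the TAP test above: the patch's commit message describes the delay as measured between the WAL timestamp and the current time on the subscriber. Below is a minimal sketch of that arithmetic, assuming both timestamps are already expressed in milliseconds since a common epoch. The function name remaining_apply_delay_ms and its parameters are illustrative only and do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): how many more milliseconds
 * the apply worker would wait before applying a transaction.
 *
 * min_apply_delay_ms - the subscription's min_apply_delay, in milliseconds
 * commit_ts_ms       - commit timestamp carried in the WAL (publisher clock)
 * now_ms             - current time on the subscriber
 *
 * Time already spent in decoding and transfer counts toward the delay, so
 * the result is clamped at zero once that time exceeds min_apply_delay.
 */
int64_t
remaining_apply_delay_ms(int32_t min_apply_delay_ms,
                         int64_t commit_ts_ms,
                         int64_t now_ms)
{
    assert(min_apply_delay_ms >= 0);

    int64_t elapsed = now_ms - commit_ts_ms;
    int64_t remaining = (int64_t) min_apply_delay_ms - elapsed;

    return remaining > 0 ? remaining : 0;
}
```

With min_apply_delay = 3000 ms, a commit stamped at t = 1000 ms and seen on the subscriber at t = 2500 ms would wait another 1500 ms. Once 3000 ms or more have passed, the wait is zero.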
The comment adjustment suggested by Peter-san above
was also included in this v33.
Please have a look at the attached patch.
Patch v33 LGTM.
------
Kind Regards,
Peter Smith.
Fujitsu Australia
At Thu, 9 Feb 2023 13:26:19 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
amit.kapila16> On Thu, Feb 9, 2023 at 12:17 AM Peter Smith <smithpb2250@gmail.com> wrote:
I understand in theory, your code is more efficient, but in practice,
I think the overhead of a single variable assignment every loop
iteration (which is doing WaitLatch anyway) is of insignificant
concern, whereas having one assignment is simpler than having two IMO.
Yeah, that sounds better to me as well.
FWIW, I'm on board with this.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
At Thu, 9 Feb 2023 13:48:52 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
On Thu, Feb 9, 2023 at 10:45 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Wed, 8 Feb 2023 09:03:03 +0000, "Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com> wrote in
Thank you for reviewing! PSA new version.
+ if (statusinterval_ms > 0 && diffms > statusinterval_ms)
The next expected feedback time is measured from the last status
report. Thus, it seems to me this may suppress feedbacks from being
sent for an unexpectedly long time especially when min_apply_delay is
shorter than wal_r_s_interval.
I think the minimum time before we send any feedback during the wait
is wal_r_s_interval. Now, I think if there is no transaction for a
long time before we get a new transaction, there should be keep-alive
messages in between which would allow us to send feedback at regular
intervals (wal_receiver_status_interval). So, I think we should be
Right.
able to send feedback in less than 2 * wal_receiver_status_interval
unless wal_sender/receiver timeout is very large and there is a very
low volume of transactions. Now, we can try to send the feedback
We have suffered this kind of feedback silence many times. Thus I
don't want to rely on luck here. I had in mind exposing last_send
itself or providing an interval-calculation function to the logic.
before we start waiting or maybe after every
wal_receiver_status_interval / 2 but I think that will lead to more
spurious feedback messages than we get the benefit from them.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Mmm. Part of the previous mail went missing for some unknown
reason and was replaced by mysterious blank lines...
At Fri, 10 Feb 2023 09:57:22 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
At Thu, 9 Feb 2023 13:48:52 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
Now, we can try to send the feedback
before we start waiting or maybe after every
wal_receiver_status_interval / 2 but I think that will lead to more
spurious feedback messages than we get the benefit from them.
Agreed. I think we don't want to do that.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Fri, Feb 10, 2023 at 6:27 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
We have suffered this kind of feedback silence many times. Thus I
don't want to rely on luck here. I had in mind exposing last_send
itself or providing an interval-calculation function to the logic.
I think we have last_send time in send_feedback(), so we can expose it
if we want but how would that solve the problem you are worried about?
The one simple idea as I shared in my last email was to send feedback
every wal_receiver_status_interval / 2. I think this should avoid any
timeout problem because we already recommend setting it to less than
wal_sender_timeout.
--
With Regards,
Amit Kapila.
On Fri, Feb 10, 2023 at 10:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
I think we have last_send time in send_feedback(), so we can expose it
if we want but how would that solve the problem you are worried about?
I have an idea to use the last_send time to keep walsenders from
timing out. Instead of directly using wal_receiver_status_interval as
the minimum interval for sending feedback, we should check whether it
is greater than the last_send time and, if so, send the feedback after
(wal_receiver_status_interval - last_send). I think they can probably
be different only on the very first time. Any better ideas?
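To make the proposal concrete, here is a rough sketch of the first-wait computation. It is only an illustration under my own assumptions: "last_send" in the formula is read as the time elapsed since the previous feedback message, and the name next_feedback_wait_ms and the -1 "feedback disabled" return value are hypothetical, not taken from worker.c or the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: how long to sleep before the next feedback message.
 *
 * status_interval_ms - wal_receiver_status_interval, in milliseconds
 * since_last_send_ms - time elapsed since the previous feedback was sent
 *
 * Returns -1 when periodic feedback is disabled (interval of zero), 0 when
 * feedback is already due, and otherwise the remaining part of the interval.
 * After the first wait, since_last_send_ms is zero again, so later waits
 * use the full interval.
 */
int64_t
next_feedback_wait_ms(int64_t status_interval_ms, int64_t since_last_send_ms)
{
    assert(status_interval_ms >= 0 && since_last_send_ms >= 0);

    if (status_interval_ms == 0)
        return -1;              /* no periodic feedback requested */
    if (since_last_send_ms >= status_interval_ms)
        return 0;               /* already overdue: send immediately */
    return status_interval_ms - since_last_send_ms;
}
```

For example, with an interval of 10 s and feedback last sent 4 s ago, the first wait would be 6 s rather than a full 10 s, so the gap between two feedback messages should not exceed one interval.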
--
With Regards,
Amit Kapila.
Hi,
On Friday, February 10, 2023 2:05 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I have an idea to use last_send time to avoid walsenders being timeout.
Instead of directly using wal_receiver_status_interval as a minimum interval
to send the feedback, we should check if it is greater than last_send time
then we should send the feedback after (wal_receiver_status_interval -
last_send). I think they can probably be different only on the very first time.
Any better ideas?
This idea sounds good to me, and I have implemented it in the
attached patch v34.
The previous patch could not prevent the publisher from timing out
in the scenario suggested by Horiguchi-san, which the attached test
file 'test.sh' reproduces.
Now we handle it by adjusting the timing of the first wait.
FYI, at first we considered adding the new variable 'send_time'
to the LogicalRepWorker structure. But that structure is used
when the launcher controls workers or reports statistics,
and it stores TimestampTz values recorded in the received WAL,
so I'm not sure the struct is the right place for the variable.
Moreover, there are already similar variables such as last_recv_time
and reply_time, so adding a new variable alongside them would be
confusing. Therefore, it is declared separately.
The new patch also includes some changes for the wait event.
Kindly have a look at the v34 patch.
Best Regards,
Takamichi Osumi
Attachments:
v34-0001-Time-delayed-logical-replication-subscriber.patch
From 7f71a549010129326ed8233882a36099bc94728b Mon Sep 17 00:00:00 2001
From: Takamichi Osumi <osumi.takamichi@fujitsu.com>
Date: Fri, 10 Feb 2023 10:43:26 +0000
Subject: [PATCH v34] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_apply_delay'.
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction apply for min_apply_delay
milliseconds.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The delay occurs before we start to apply the transaction on the
subscriber. The main reason is to avoid keeping a transaction open for
a long time. Regular and prepared transactions are covered. Streamed
transactions are also covered.
The combination of parallel streaming mode and min_apply_delay is not
allowed. This is because in parallel streaming mode, we start applying
the transaction stream as soon as the first change arrives without
knowing the transaction's prepare/commit time. This means we cannot
calculate the underlying network/decoding lag between publisher and
subscriber, and so always waiting for the full 'min_apply_delay' period
might include unnecessary delay.
The other possibility was to apply the delay at the end of the parallel
apply transaction but that would cause issues related to resource
bloat and locks being held for a long time.
Note that this feature doesn't interact with skip transaction feature.
The skip transaction feature applies to one transaction with a specific LSN.
So, even if the skipped transaction and non-skipped transaction come
consecutively in a very short time, regardless of the order of which comes
first, the time-delayed feature gets balanced by delayed application
for other transactions before and after the skipped transaction.
Author: Euler Taveira, Takamichi Osumi, Kuroda Hayato
Reviewed-by: Amit Kapila, Peter Smith, Vignesh C, Shveta Malik,
Kyotaro Horiguchi, Shi Yu, Wang Wei, Dilip Kumar, Melih Mutlu
Discussion: https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4rzr5pQ@mail.gmail.com
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/config.sgml | 12 ++
doc/src/sgml/glossary.sgml | 15 ++
doc/src/sgml/logical-replication.sgml | 6 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 49 ++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 122 +++++++++++-
.../replication/logical/applyparallelworker.c | 3 +-
src/backend/replication/logical/worker.c | 188 ++++++++++++++++--
src/backend/utils/activity/wait_event.c | 3 +
src/bin/pg_dump/pg_dump.c | 15 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/replication/worker_internal.h | 2 +-
src/include/utils/wait_event.h | 3 +-
src/test/regress/expected/subscription.out | 181 ++++++++++-------
src/test/regress/sql/subscription.sql | 24 +++
src/test/subscription/t/001_rep_changes.pl | 28 +++
22 files changed, 584 insertions(+), 106 deletions(-)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..5dc5ca1133 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7873,6 +7873,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum delay, in milliseconds, for applying changes
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 8c56b134a8..21b45c68e2 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4787,6 +4787,18 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
the <filename>postgresql.conf</filename> file or on the server
command line.
</para>
+ <para>
+ For time-delayed logical replication, the apply worker sends a feedback
+ message to the publisher every
+ <varname>wal_receiver_status_interval</varname> milliseconds. Make sure
+ to set <varname>wal_receiver_status_interval</varname> less than the
+ <varname>wal_sender_timeout</varname> on the publisher, otherwise, the
+ <literal>walsender</literal> will repeatedly terminate due to timeout
+ errors. Note that if <varname>wal_receiver_status_interval</varname> is
+ set to zero, the apply worker sends no feedback messages during the
+ <literal>min_apply_delay</literal> period. Refer to
+ <xref linkend="sql-createsubscription"/> for more information.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 7c01a541fe..9ede9d05f6 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1729,6 +1729,21 @@
</glossdef>
</glossentry>
+ <glossentry id="glossary-time-delayed-replication">
+ <glossterm>Time-delayed replication</glossterm>
+ <glossdef>
+ <para>
+ Replication setup that delays the application of changes by a specified
+ minimum time-delay period.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="guc-recovery-min-apply-delay"/> for physical replication
+ and <xref linkend="sql-createsubscription"/> for logical replication.
+ </para>
+ </glossdef>
+ </glossentry>
+
<glossentry id="glossary-toast">
<glossterm>TOAST</glossterm>
<glossdef>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 1bd5660c87..6bd5f61e2b 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,12 @@
target table.
</para>
+ <para>
+ A subscription can delay the application of changes by specifying the
+ <literal>min_apply_delay</literal> subscription parameter. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 964fcbb8ff..8b7eb28e54 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 51c45f17c7..1b4b8390af 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,49 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. This
+ parameter allows the user to delay the application of changes by a
+ given time period. If the value is specified without units, it is
+ taken as milliseconds. The default is zero (no delay). See
+ <xref linkend="config-setting-names-values"/> for details on the
+ available valid time units.
+ </para>
+ <para>
+ Any delay becomes effective only after all initial table
+ synchronization has finished and occurs before each transaction starts
+ to get applied on the subscriber. The delay is calculated as the
+ difference between the WAL timestamp as written on the publisher and
+ the current time on the subscriber. Any overhead of time spent in
+ logical decoding and in transferring the transaction may reduce the
+ actual wait time. It is also possible that the overhead already
+ exceeds the requested <literal>min_apply_delay</literal> value, in
+ which case no delay is applied. If the system clocks on publisher and
+ subscriber are not synchronized, this may lead to apply changes
+ earlier than expected, but this is not a major issue because this
+ parameter is typically much larger than the time deviations between
+ servers. Note that if this parameter is set to a long delay, the
+ replication will stop if the replication slot falls behind the current
+ LSN by more than
+ <link linkend="guc-max-slot-wal-keep-size"><literal>max_slot_wal_keep_size</literal></link>.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication means there is a much longer time between
+ making a change on the publisher, and that change being committed
+ on the subscriber. This can impact the performance of synchronous
+ replication. See <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -420,6 +462,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_apply_delay</literal> parameter is not allowed when
+ streaming in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..e19e5cbca2 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -66,6 +66,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->skiplsn = subform->subskiplsn;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
+ sub->minapplydelay = subform->subminapplydelay;
sub->enabled = subform->subenabled;
sub->binary = subform->subbinary;
sub->stream = subform->substream;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 8608e3fa5b..317c2010cb 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1299,9 +1299,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 464db6d247..82e16fd0f9 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int32 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -100,7 +102,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int32 defGetMinApplyDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ opts->min_apply_delay = defGetMinApplyDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +417,32 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed. This is because in parallel streaming mode, we start applying
+ * the transaction stream as soon as the first change arrives without
+ * knowing the transaction's prepare/commit time. This means we cannot
+ * calculate the underlying network/decoding lag between publisher and
+ * subscriber, and so always waiting for the full 'min_apply_delay' period
+ * might include unnecessary delay.
+ *
+ * The other possibility was to apply the delay at the end of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0 &&
+ opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+
+ /*
+ * translator: the first %s is a string of the form "parameter > 0"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
}
/*
@@ -560,7 +599,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -625,6 +665,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int32GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1054,7 +1095,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1139,19 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
+ !IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY)
+ && sub->minapplydelay > 0)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set parallel streaming mode for subscription with %s",
+ "min_apply_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1165,26 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.min_apply_delay > 0 &&
+ !IsSet(opts.specified_opts, SUBOPT_STREAMING)
+ && sub->stream == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set %s for subscription in parallel streaming mode",
+ "min_apply_delay"));
+
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int32GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2195,3 +2269,45 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+/*
+ * Extract the min_apply_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_apply_delay
+ * accepts the same parameter format as recovery_min_apply_delay.
+ */
+static int32
+defGetMinApplyDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse the given string as a parameter whose unit is milliseconds.
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_apply_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check the lower bound of the valid min_apply_delay range, and also the
+ * upper bound as a safeguard for platforms where INT_MAX exceeds
+ * PG_INT32_MAX. Although parse_int() has confirmed that the result is less
+ * than or equal to INT_MAX, the value will be stored in a catalog column of
+ * type int32.
+ */
+ if (result < 0 || result > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_apply_delay",
+ 0, PG_INT32_MAX)));
+
+ return result;
+}
diff --git a/src/backend/replication/logical/applyparallelworker.c b/src/backend/replication/logical/applyparallelworker.c
index da437e0bc3..32db20fd98 100644
--- a/src/backend/replication/logical/applyparallelworker.c
+++ b/src/backend/replication/logical/applyparallelworker.c
@@ -704,7 +704,8 @@ pa_process_spooled_messages_if_required(void)
{
apply_spooled_messages(&MyParallelShared->fileset,
MyParallelShared->xid,
- InvalidXLogRecPtr);
+ InvalidXLogRecPtr,
+ 0);
pa_set_fileset_state(MyParallelShared, FS_EMPTY);
}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index cfb2ab6248..6b86723b60 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -319,6 +319,20 @@ static List *on_commit_wakeup_workers_subids = NIL;
bool in_remote_transaction = false;
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;
+/*
+ * To avoid a walsender timeout during time-delayed logical replication, the
+ * apply worker keeps sending feedback messages during the delay period.
+ * Meanwhile, the feature delays the apply before the start of the
+ * transaction, so we don't write WAL records for the suspended changes
+ * during the wait. When the apply worker sends a feedback message during the
+ * delay, it must not overwrite the flushed and applied LSN positions with the
+ * latest received LSN. See send_feedback() for details.
+ */
+static XLogRecPtr last_received = InvalidXLogRecPtr;
+
+/* The last time we sent a feedback message */
+static TimestampTz send_time = 0;
+
/* fields valid only when processing streamed transaction */
static bool in_streamed_transaction = false;
@@ -389,7 +403,8 @@ static void stream_write_change(char action, StringInfo s);
static void stream_open_and_write_change(TransactionId xid, char action, StringInfo s);
static void stream_close_file(void);
-static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply);
+static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply,
+ bool has_unprocessed_change);
static void DisableSubscriptionAndExit(void);
@@ -999,6 +1014,128 @@ slot_modify_data(TupleTableSlot *slot, TupleTableSlot *srcslot,
ExecStoreVirtualTuple(slot);
}
+/*
+ * When min_apply_delay parameter is set on the subscriber, we wait long enough
+ * to make sure a transaction is applied at least that period behind the
+ * publisher.
+ *
+ * While physical replication applies the delay at commit time, this
+ * feature applies the delay before starting to apply the next
+ * transaction. This is mainly because keeping a transaction that has
+ * performed write operations open for a long time causes problems such as
+ * bloat and long-held locks.
+ *
+ * The min_apply_delay parameter will take effect only after all tables are in
+ * READY state.
+ *
+ * xid is the transaction id where we apply the delay.
+ *
+ * finish_ts is the commit/prepare time of both regular (non-streamed) and
+ * streamed transactions. Unlike the regular (non-streamed) cases, the delay
+ * is applied in a STREAM COMMIT/STREAM PREPARE message for streamed
+ * transactions. The STREAM START message does not contain a commit/prepare
+ * time (it will be available when the in-progress transaction finishes).
+ * Hence, it's not appropriate to apply a delay at the STREAM START time.
+ */
+static void
+maybe_apply_delay(TransactionId xid, TimestampTz finish_ts)
+{
+ long status_interval_ms = 0;
+
+ Assert(finish_ts > 0);
+
+ /* Nothing to do if no delay set */
+ if (!MySubscription->minapplydelay)
+ return;
+
+ /*
+ * The min_apply_delay parameter is ignored until all tablesync workers
+ * have reached READY state. This is because if we allowed the delay
+ * during the catchup phase, then once we reached the limit of tablesync
+ * workers it would impose a delay for each subsequent worker. That would
+ * cause initial table synchronization completion to take a long time.
+ */
+ if (!AllTablesyncsReady())
+ return;
+
+ /* Apply the delay using the latch mechanism */
+ do
+ {
+ TimestampTz delayUntil;
+ long diffms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* This might change wal_receiver_status_interval */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /*
+ * Before calculating the time duration, reload the catalog if needed.
+ */
+ if (!in_remote_transaction && !in_streamed_transaction)
+ {
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+ }
+
+ delayUntil = TimestampTzPlusMilliseconds(finish_ts, MySubscription->minapplydelay);
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delayUntil);
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "time-delayed replication for txid %u, min_apply_delay = %d ms, remaining wait time: %ld ms",
+ xid, MySubscription->minapplydelay, diffms);
+
+ /*
+ * Call send_feedback() to prevent the publisher from exiting by
+ * timeout during the delay, when the status interval is greater than
+ * zero.
+ */
+ if (!status_interval_ms)
+ {
+ TimestampTz nextFeedback;
+
+ /*
+ * Based on the last time we sent a feedback message, adjust
+ * the first wait time for this transaction. This ensures that
+ * the first feedback message also follows the
+ * wal_receiver_status_interval setting.
+ */
+ nextFeedback = TimestampTzPlusMilliseconds(send_time,
+ wal_receiver_status_interval * 1000L);
+ status_interval_ms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), nextFeedback);
+ }
+ else
+ status_interval_ms = wal_receiver_status_interval * 1000L;
+
+ if (status_interval_ms > 0 && diffms > status_interval_ms)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ status_interval_ms,
+ WAIT_EVENT_LOGICAL_APPLY_DELAY);
+ send_feedback(last_received, true, false, true);
+ }
+ else
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_LOGICAL_APPLY_DELAY);
+
+ } while (true);
+}
+
/*
* Handle BEGIN message.
*/
@@ -1013,6 +1150,9 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
+ /* Should we delay the current transaction? */
+ maybe_apply_delay(begin_data.xid, begin_data.committime);
+
remote_final_lsn = begin_data.final_lsn;
maybe_start_skipping_changes(begin_data.final_lsn);
@@ -1070,6 +1210,9 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
+ /* Should we delay the current prepared transaction? */
+ maybe_apply_delay(begin_data.xid, begin_data.prepare_time);
+
remote_final_lsn = begin_data.prepare_lsn;
maybe_start_skipping_changes(begin_data.prepare_lsn);
@@ -1317,7 +1460,8 @@ apply_handle_stream_prepare(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset,
- prepare_data.xid, prepare_data.prepare_lsn);
+ prepare_data.xid, prepare_data.prepare_lsn,
+ prepare_data.prepare_time);
/* Mark the transaction as prepared. */
apply_handle_prepare_internal(&prepare_data);
@@ -2011,10 +2155,13 @@ ensure_last_message(FileSet *stream_fileset, TransactionId xid, int fileno,
/*
* Common spoolfile processing.
+ *
+ * The commit/prepare time (finish_ts) is required for time-delayed logical
+ * replication.
*/
void
apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn)
+ XLogRecPtr lsn, TimestampTz finish_ts)
{
StringInfoData s2;
int nchanges;
@@ -2025,6 +2172,10 @@ apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
int fileno;
off_t offset;
+ /* Should we delay the current transaction? */
+ if (finish_ts)
+ maybe_apply_delay(xid, finish_ts);
+
if (!am_parallel_apply_worker())
maybe_start_skipping_changes(lsn);
@@ -2174,7 +2325,7 @@ apply_handle_stream_commit(StringInfo s)
* spooled operations.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset, xid,
- commit_data.commit_lsn);
+ commit_data.commit_lsn, commit_data.committime);
apply_handle_commit_internal(&commit_data);
@@ -3447,7 +3598,7 @@ UpdateWorkerStats(XLogRecPtr last_lsn, TimestampTz send_time, bool reply)
* Apply main loop.
*/
static void
-LogicalRepApplyLoop(XLogRecPtr last_received)
+LogicalRepApplyLoop(void)
{
TimestampTz last_recv_timestamp = GetCurrentTimestamp();
bool ping_sent = false;
@@ -3568,7 +3719,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
if (last_received < end_lsn)
last_received = end_lsn;
- send_feedback(last_received, reply_requested, false);
+ send_feedback(last_received, reply_requested, false, false);
UpdateWorkerStats(last_received, timestamp, true);
}
/* other message types are purposefully ignored */
@@ -3581,7 +3732,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
/* confirm all writes so far */
- send_feedback(last_received, false, false);
+ send_feedback(last_received, false, false, false);
if (!in_remote_transaction && !in_streamed_transaction)
{
@@ -3678,7 +3829,7 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
}
}
- send_feedback(last_received, requestReply, requestReply);
+ send_feedback(last_received, requestReply, requestReply, false);
/*
* Force reporting to ensure long idle periods don't lead to
@@ -3708,10 +3859,9 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
* to send a response to avoid timeouts.
*/
static void
-send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
+send_feedback(XLogRecPtr recvpos, bool force, bool requestReply, bool has_unprocessed_change)
{
static StringInfo reply_message = NULL;
- static TimestampTz send_time = 0;
static XLogRecPtr last_recvpos = InvalidXLogRecPtr;
static XLogRecPtr last_writepos = InvalidXLogRecPtr;
@@ -3738,8 +3888,14 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * If the logical replication subscription has unprocessed changes, do
+ * not tell the publisher that the latest received LSN has already been
+ * applied and flushed; otherwise, the publisher would make a wrong
+ * assumption about the logical replication progress. Instead, just send a
+ * feedback message to avoid a replication timeout during the delay.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && !has_unprocessed_change)
flushpos = writepos = recvpos;
if (writepos < last_writepos)
@@ -3776,8 +3932,9 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
pq_sendint64(reply_message, now); /* sendTime */
pq_sendbyte(reply_message, requestReply); /* replyRequested */
- elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X",
+ elog(DEBUG2, "sending feedback (force %d, has_unprocessed_change %d) to recv %X/%X, write %X/%X, flush %X/%X",
force,
+ has_unprocessed_change,
LSN_FORMAT_ARGS(recvpos),
LSN_FORMAT_ARGS(writepos),
LSN_FORMAT_ARGS(flushpos));
@@ -4367,11 +4524,11 @@ start_table_sync(XLogRecPtr *origin_startpos, char **myslotname)
* of system resource error and are not repeatable.
*/
static void
-start_apply(XLogRecPtr origin_startpos)
+start_apply(void)
{
PG_TRY();
{
- LogicalRepApplyLoop(origin_startpos);
+ LogicalRepApplyLoop();
}
PG_CATCH();
{
@@ -4661,7 +4818,8 @@ ApplyWorkerMain(Datum main_arg)
}
/* Run the main loop. */
- start_apply(origin_startpos);
+ last_received = origin_startpos;
+ start_apply();
proc_exit(0);
}
diff --git a/src/backend/utils/activity/wait_event.c b/src/backend/utils/activity/wait_event.c
index 6e4599278c..dd06927328 100644
--- a/src/backend/utils/activity/wait_event.c
+++ b/src/backend/utils/activity/wait_event.c
@@ -512,6 +512,9 @@ pgstat_get_wait_timeout(WaitEventTimeout w)
case WAIT_EVENT_VACUUM_TRUNCATE:
event_name = "VacuumTruncate";
break;
+ case WAIT_EVENT_LOGICAL_APPLY_DELAY:
+ event_name = "LogicalApplyDelay";
+ break;
/* no default case, so that compiler will warn */
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..1e87f0124e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4546,9 +4547,13 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n"
+ " 0 AS subminapplydelay\n",
+ LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4581,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4612,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminapplydelay =
+ atoi(PQgetvalue(res, i, i_subminapplydelay));
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4687,6 +4695,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '%d ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e7cbd8d7ed..b8831c3ed3 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int subminapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..81d4607a1c 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min apply delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e1882eaea..e8b9a43a47 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..d1cfefc6d6 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -74,6 +74,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
+ int32 subminapplydelay; /* Replication apply delay (ms) */
+
bool subenabled; /* True if the subscription is enabled (the
* worker should be running) */
@@ -122,6 +124,7 @@ typedef struct Subscription
* skipped */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
+ int32 minapplydelay; /* Replication apply delay (ms) */
bool enabled; /* Indicates if the subscription is enabled */
bool binary; /* Indicates if the subscription wants data in
* binary format */
diff --git a/src/include/replication/worker_internal.h b/src/include/replication/worker_internal.h
index dc87a4edd1..3dc09d1a4c 100644
--- a/src/include/replication/worker_internal.h
+++ b/src/include/replication/worker_internal.h
@@ -255,7 +255,7 @@ extern void stream_stop_internal(TransactionId xid);
/* Common streaming function to apply all the spooled messages */
extern void apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
- XLogRecPtr lsn);
+ XLogRecPtr lsn, TimestampTz finish_ts);
extern void apply_dispatch(StringInfo s);
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index 6cacd6edaf..f95c5fee8c 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -149,7 +149,8 @@ typedef enum
WAIT_EVENT_REGISTER_SYNC_REQUEST,
WAIT_EVENT_SPIN_DELAY,
WAIT_EVENT_VACUUM_DELAY,
- WAIT_EVENT_VACUUM_TRUNCATE
+ WAIT_EVENT_VACUUM_TRUNCATE,
+ WAIT_EVENT_LOGICAL_APPLY_DELAY
} WaitEventTimeout;
/* ----------
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 3f99b14394..cf8e727ee9 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -143,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -163,10 +163,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -175,10 +175,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -210,10 +210,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -247,19 +247,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -271,27 +271,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -306,10 +306,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -324,10 +324,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -363,10 +363,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -375,10 +375,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,10 +388,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -404,20 +404,57 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid value for parameter "min_apply_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_apply_delay" (0 .. 2147483647)
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+ERROR: min_apply_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot set parallel streaming mode for subscription with min_apply_delay
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ERROR: cannot set min_apply_delay for subscription in parallel streaming mode
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7281f5fee2..7317b140f5 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -286,6 +286,30 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+\dRs+
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index 91aa068c95..75fd77b891 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -515,6 +515,34 @@ $node_publisher->poll_query_until('postgres',
or die
"Timed out while waiting for apply to restart after renaming SUBSCRIPTION";
+# Test time-delayed logical replication
+#
+# If the subscription sets min_apply_delay parameter, the logical replication
+# worker will delay the transaction apply for min_apply_delay milliseconds. We
+# verify this by looking at the time difference between a) when tuples are
+# inserted on the publisher, and b) when those changes are replicated on the
+# subscriber. Even on slow machines, this strategy will give predictable behavior.
+
+# Set min_apply_delay parameter to 3 seconds
+my $delay = 3;
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_renamed SET (min_apply_delay = '${delay}s')");
+
+# Before doing the insertion, get the current timestamp that will be
+# used as a comparison base.
+my $publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_ins VALUES (generate_series(1101, 1120))");
+
+# The publisher waits for the replication to complete
+$node_publisher->wait_for_catchup('tap_sub_renamed');
+
+# This test is successful if and only if the LSN has been applied with at least
+# the configured apply delay.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for non-streaming transaction"
+);
+
# check all the cleanup
$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_renamed");
--
2.30.0
Hi,
On 2023-02-10 11:26:21 +0000, Takamichi Osumi (Fujitsu) wrote:
Subject: [PATCH v34] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_apply_delay'.
Sorry for not reading through the thread, but it's very long.
Has there been any discussion about whether this is actually best implemented
on the client side? You could alternatively implement it on the sender.
That'd have quite a few advantages, I think - you e.g. wouldn't remove the
ability to *receive* and send feedback messages. We'd not end up filling up
the network buffer with data that we'll not process anytime soon.
Greetings,
Andres Freund
Hi
On Saturday, February 11, 2023 11:10 AM Andres Freund <andres@anarazel.de> wrote:
On 2023-02-10 11:26:21 +0000, Takamichi Osumi (Fujitsu) wrote:
Subject: [PATCH v34] Time-delayed logical replication subscriber
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called
'min_apply_delay'.
Has there been any discussion about whether this is actually best
implemented on the client side? You could alternatively implement it on the
sender.
That'd have quite a few advantages, I think - you e.g. wouldn't remove the
ability to *receive* and send feedback messages. We'd not end up filling up
the network buffer with data that we'll not process anytime soon.
Thanks for your comments !
We discussed the publisher-side idea around here [1],
but we chose the current direction. Kindly have a look at the discussion.
If we apply the delay on the publisher, it can add
extra delay that is not needed.
The currently proposed approach can take other loads or factors
(network, busyness of the publisher, etc.) into account
because it calculates the required delay on the subscriber.
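
To make the point concrete, here is a minimal sketch of the subscriber-side calculation. It is not the patch code: the `ts_ms` type and the function name are hypothetical, timestamps are plain milliseconds rather than TimestampTz, and it assumes the publisher and subscriber clocks agree. Time already spent on the network or on a busy publisher counts toward the delay, so only the remainder is waited for.

```c
#include <stdint.h>

/* Hypothetical stand-in for TimestampTz, expressed in milliseconds. */
typedef int64_t ts_ms;

/*
 * Return how long the subscriber still has to wait before applying a
 * transaction that committed on the publisher at commit_ts.
 *
 * Whatever time has already elapsed since the commit (transfer time,
 * publisher load, and so on) is subtracted from min_apply_delay_ms.
 * Returns 0 when the delay has already passed.
 */
int64_t
remaining_apply_delay_ms(ts_ms commit_ts, ts_ms now, int32_t min_apply_delay_ms)
{
    ts_ms delay_until = commit_ts + min_apply_delay_ms;

    return (delay_until > now) ? (delay_until - now) : 0;
}
```

With a 3000 ms delay, a change that arrives 500 ms after its commit waits another 2500 ms, while a change that arrives 4000 ms after its commit is applied immediately.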
[1]: /messages/by-id/20221215.105200.268327207020006785.horikyota.ntt@gmail.com
Best Regards,
Takamichi Osumi
At Fri, 10 Feb 2023 10:34:49 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
On Fri, Feb 10, 2023 at 10:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Feb 10, 2023 at 6:27 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
We have suffered this kind of feedback silence many times. Thus I
don't want to rely on luck here. I had in mind exposing last_send
itself or providing an interval-calculation function to the logic.
I think we have last_send time in send_feedback(), so we can expose it
if we want but how would that solve the problem you are worried about?
Wal receiver can avoid a too-long sleep by knowing when to wake up for
the next feedback.
I have an idea to use last_send time to avoid walsenders timing
out. Instead of directly using wal_receiver_status_interval as a
minimum interval to send the feedback, we should check if it is
greater than last_send time then we should send the feedback after
(wal_receiver_status_interval - last_send). I think they can probably
be different only on the very first time. Any better ideas?
If it means MyLogicalRepWorker->last_send_time, it is not the last
time when walreceiver sent a feedback but the last time when
wal*sender* sent data. So I'm not sure that works.
We could use the variable that way, but AFAIU in turn when so many
changes have been spooled that the control doesn't return to
LogicalRepApplyLoop longer than wal_r_s_interval, maybe_apply_delay()
starts calling send_feedback() at every call after the first feedback
timing. Even in that case, send_feedback() won't send one actually
until the next feedback timing, I don't think that behavior is great.
The only packets walreceiver sends back are the feedback packets and
currently only send_feedback knows the last feedback time.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Hi, Horiguchi-san
On Monday, February 13, 2023 10:26 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
At Fri, 10 Feb 2023 10:34:49 +0530, Amit Kapila <amit.kapila16@gmail.com>
wrote in
On Fri, Feb 10, 2023 at 10:11 AM Amit Kapila <amit.kapila16@gmail.com>
wrote:
On Fri, Feb 10, 2023 at 6:27 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
We have suffered this kind of feedback silence many times. Thus I
don't want to rely on luck here. I had in mind exposing
last_send itself or providing an interval-calculation function to the logic.
I think we have last_send time in send_feedback(), so we can expose
it if we want but how would that solve the problem you are worried
about?
Wal receiver can avoid a too-long sleep by knowing when to wake up for the
next feedback.
I have an idea to use last_send time to avoid walsenders timing
out. Instead of directly using wal_receiver_status_interval as a
minimum interval to send the feedback, we should check if it is
greater than last_send time then we should send the feedback after
(wal_receiver_status_interval - last_send). I think they can probably
be different only on the very first time. Any better ideas?
If it means MyLogicalRepWorker->last_send_time, it is not the last time when
walreceiver sent a feedback but the last time when
wal*sender* sent data. So I'm not sure that works.
We could use the variable that way, but AFAIU in turn when so many changes
have been spooled that the control doesn't return to LogicalRepApplyLoop
longer than wal_r_s_interval, maybe_apply_delay() starts calling
send_feedback() at every call after the first feedback timing. Even in that
case, send_feedback() won't send one actually until the next feedback timing,
I don't think that behavior is great.
The only packets walreceiver sends back are the feedback packets and
currently only send_feedback knows the last feedback time.
Thanks for your comments !
As described in your last sentence, the latest patch v34 [1] uses
the time of the last feedback message recorded in send_feedback(). Based on
it, maybe_apply_delay() calculates and adjusts the timing of the first
feedback message so that feedback follows wal_receiver_status_interval.
I'm not sure whether the above concern is still valid for this
implementation.
Could you please have a look at the latest patch and share your opinion ?
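
For readers following along, the timing adjustment described above can be sketched roughly as below. This is a simplified model and not the v34 code: the function names and the `ts_ms` type are hypothetical, timestamps are plain milliseconds, and the only fact taken from the patch is that wal_receiver_status_interval is in seconds and that zero disables feedback.

```c
#include <stdint.h>

/* Hypothetical stand-in for TimestampTz, expressed in milliseconds. */
typedef int64_t ts_ms;

/*
 * Milliseconds until the next feedback message is due, measured from the
 * time the last feedback was actually sent. Returns -1 when periodic
 * feedback is disabled (status_interval_s <= 0), and 0 when it is overdue.
 */
int64_t
ms_until_next_feedback(ts_ms last_feedback_ts, ts_ms now, int status_interval_s)
{
    ts_ms next_feedback;

    if (status_interval_s <= 0)
        return -1;

    next_feedback = last_feedback_ts + (int64_t) status_interval_s * 1000;
    return (next_feedback > now) ? (next_feedback - now) : 0;
}

/*
 * Length of one wait inside the delay loop: sleep until the apply delay
 * ends or until feedback is due, whichever comes first.
 */
int64_t
next_wait_ms(int64_t remaining_delay_ms, int64_t until_feedback_ms)
{
    if (until_feedback_ms >= 0 && until_feedback_ms < remaining_delay_ms)
        return until_feedback_ms;

    return remaining_delay_ms;
}
```

If the last feedback went out 2 seconds ago and the interval is 10 seconds, the first wait is 8 seconds rather than a full 10, which keeps the feedback on the configured cadence.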
[1]: /messages/by-id/TYCPR01MB83736C50C98CB2153728A7A8EDDE9@TYCPR01MB8373.jpnprd01.prod.outlook.com
Best Regards,
Takamichi Osumi
Here are my review comments for the v34 patch.
======
src/backend/replication/logical/worker.c
+/* The last time we send a feedback message */
+static TimestampTz send_time = 0;
+
IMO this is a bad variable name. When this variable was changed to be
global it ought to have been renamed.
The name "send_time" is almost meaningless without any contextual information.
But also it's bad because this global name is "shadowed" by several
other parameters and other local variables using that same name (e.g.
see UpdateWorkerStats, LogicalRepApplyLoop, etc). It is too confusing.
How about using a unique/meaningful name with a comment to match to
improve readability and remove unwanted shadowing?
SUGGESTION
/* Timestamp of when the last feedback message was sent. */
static TimestampTz last_sent_feedback_ts = 0;
~~~
2. maybe_apply_delay
+ /* Apply the delay by the latch mechanism */
+ do
+ {
+ TimestampTz delayUntil;
+ long diffms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* This might change wal_receiver_status_interval */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /*
+ * Before calculating the time duration, reload the catalog if needed.
+ */
+ if (!in_remote_transaction && !in_streamed_transaction)
+ {
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+ }
+
+ delayUntil = TimestampTzPlusMilliseconds(finish_ts,
MySubscription->minapplydelay);
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delayUntil);
+
+ /*
+ * Exit without arming the latch if it's already past time to apply
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ elog(DEBUG2, "time-delayed replication for txid %u, min_apply_delay
= %d ms, remaining wait time: %ld ms",
+ xid, MySubscription->minapplydelay, diffms);
+
+ /*
+ * Call send_feedback() to prevent the publisher from exiting by
+ * timeout during the delay, when the status interval is greater than
+ * zero.
+ */
+ if (!status_interval_ms)
+ {
+ TimestampTz nextFeedback;
+
+ /*
+ * Based on the last time when we send a feedback message, adjust
+ * the first delay time for this transaction. This ensures that
+ * the first feedback message follows wal_receiver_status_interval
+ * interval.
+ */
+ nextFeedback = TimestampTzPlusMilliseconds(send_time,
+ wal_receiver_status_interval * 1000L);
+ status_interval_ms =
TimestampDifferenceMilliseconds(GetCurrentTimestamp(), nextFeedback);
+ }
+ else
+ status_interval_ms = wal_receiver_status_interval * 1000L;
+
+ if (status_interval_ms > 0 && diffms > status_interval_ms)
+ {
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ status_interval_ms,
+ WAIT_EVENT_LOGICAL_APPLY_DELAY);
+ send_feedback(last_received, true, false, true);
+ }
+ else
+ WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_LOGICAL_APPLY_DELAY);
+
+ } while (true);
~
IMO this logic has been tweaked too many times without revisiting the
variable names and logic from scratch, so it has become over-complex
- some variable names are assuming multiple meanings
- multiple * 1000L have crept back in again
- the 'diffms' is too generic now with so many vars so it has lost its meaning
- GetCurrentTimestamp call in multiple places
SUGGESTIONS
- rename some variables and simplify the logic.
- reduce all the if/else
- don't be sneaky with the meaning of status_interval_ms
- 'diffms' --> 'remaining_delay_ms'
- 'delayUntil' --> 'delay_until_ts'
- introduce 'now' variable
- simplify the check of (next_feedback_due_ms < remaining_delay_ms)
SUGGESTION (WFM)
/* Apply the delay by the latch mechanism */
while (true)
{
TimestampTz now;
TimestampTz delay_until_ts;
long remaining_delay_ms;
long status_interval_ms;
ResetLatch(MyLatch);
CHECK_FOR_INTERRUPTS();
/* This might change wal_receiver_status_interval */
if (ConfigReloadPending)
{
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
}
/*
* Before calculating the time duration, reload the catalog if needed.
*/
if (!in_remote_transaction && !in_streamed_transaction)
{
AcceptInvalidationMessages();
maybe_reread_subscription();
}
now = GetCurrentTimestamp();
delay_until_ts = TimestampTzPlusMilliseconds(finish_ts,
MySubscription->minapplydelay);
remaining_delay_ms = TimestampDifferenceMilliseconds(now, delay_until_ts);
/*
* Exit without arming the latch if it's already past time to apply
* this transaction.
*/
if (remaining_delay_ms <= 0)
break;
elog(DEBUG2, "time-delayed replication for txid %u, min_apply_delay =
%d ms, remaining wait time: %ld ms",
xid, MySubscription->minapplydelay, remaining_delay_ms);
/*
* If a status interval is defined then we may need to call send_feedback()
* early to prevent the publisher from exiting during a long apply delay.
*/
status_interval_ms = wal_receiver_status_interval * 1000L;
if (status_interval_ms > 0)
{
TimestampTz next_feedback_due_ts;
long next_feedback_due_ms;
/*
* Find if the next feedback is due earlier than the remaining delay ms.
*/
next_feedback_due_ts = TimestampTzPlusMilliseconds(send_time,
status_interval_ms);
next_feedback_due_ms = TimestampDifferenceMilliseconds(now,
next_feedback_due_ts);
if (next_feedback_due_ms < remaining_delay_ms)
{
/* delay before feedback */
WaitLatch(MyLatch,
WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
next_feedback_due_ms,
WAIT_EVENT_LOGICAL_APPLY_DELAY);
send_feedback(last_received, true, false, true);
continue;
}
}
/* delay before apply */
WaitLatch(MyLatch,
WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
remaining_delay_ms,
WAIT_EVENT_LOGICAL_APPLY_DELAY);
}
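The timing decision in the loop above can also be looked at in isolation. The following is only a sketch, assuming plain millisecond integers in place of TimestampTz and no latch handling at all; the type WaitPlan and the function plan_next_wait are hypothetical names used for illustration and are not part of the patch.

```c
#include <stdbool.h>

/* Hypothetical result type: what the loop should do next. */
typedef struct WaitPlan
{
	long		wait_ms;				/* how long to sleep */
	bool		send_feedback_after;	/* send feedback once the sleep ends? */
	bool		done;					/* delay already elapsed, apply now */
} WaitPlan;

/*
 * Decide the next wait. All arguments are milliseconds on one common clock
 * (an assumption of this sketch; the patch uses TimestampTz).
 */
static WaitPlan
plan_next_wait(long now_ms, long finish_ms, long min_apply_delay_ms,
			   long last_feedback_ms, long status_interval_ms)
{
	WaitPlan	plan = {0, false, false};
	long		remaining_delay_ms = (finish_ms + min_apply_delay_ms) - now_ms;

	/* Already past the time to apply this transaction. */
	if (remaining_delay_ms <= 0)
	{
		plan.done = true;
		return plan;
	}

	if (status_interval_ms > 0)
	{
		long		next_feedback_due_ms =
			(last_feedback_ms + status_interval_ms) - now_ms;

		/* Overdue feedback is sent right away. */
		if (next_feedback_due_ms < 0)
			next_feedback_due_ms = 0;

		/* Feedback falls due before the delay ends: wait only until then. */
		if (next_feedback_due_ms < remaining_delay_ms)
		{
			plan.wait_ms = next_feedback_due_ms;
			plan.send_feedback_after = true;
			return plan;
		}
	}

	/* Otherwise sleep for the rest of the delay. */
	plan.wait_ms = remaining_delay_ms;
	return plan;
}
```

For example, with a 60000 ms delay, a 10000 ms status interval and feedback last sent at time 0, a call at time 0 yields a 10000 ms wait followed by feedback, whereas a 5000 ms delay yields a single 5000 ms wait with no feedback.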
======
src/include/utils/wait_event.h
3.
@@ -149,7 +149,8 @@ typedef enum
WAIT_EVENT_REGISTER_SYNC_REQUEST,
WAIT_EVENT_SPIN_DELAY,
WAIT_EVENT_VACUUM_DELAY,
- WAIT_EVENT_VACUUM_TRUNCATE
+ WAIT_EVENT_VACUUM_TRUNCATE,
+ WAIT_EVENT_LOGICAL_APPLY_DELAY
} WaitEventTimeout;
FYI - The PGDOCS has a section with "Table 28.13. Wait Events of Type
Timeout", so if you are going to add a new Timeout Event then you also
need to document it (alphabetically) in that table.
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Fri, Feb 10, 2023 at 4:56 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
On Friday, February 10, 2023 2:05 PM wrote:
On Fri, Feb 10, 2023 at 10:11 AM Amit Kapila <amit.kapila16@gmail.com>
wrote:
In the previous patch, we couldn't solve the
timeout of the publisher, when we conduct a scenario suggested by Horiguchi-san
and reproduced in the scenario attached test file 'test.sh'.
But now we handle it by adjusting the timing of the first wait time.

FYI, we thought to implement the new variable 'send_time'
in the LogicalRepWorker structure at first. But, this structure
is used when launcher controls workers or reports statistics
and it stores TimestampTz recorded in the received WAL,
so not sure if the struct is the right place to implement the variable.
Moreover, there are other similar variables such as last_recv_time
or reply_time. So, those will be confusing when we decide to have
new variable together. Then, it's declared separately.
I think we can introduce a new variable as last_feedback_time in the
LogicalRepWorker structure and probably for the last_received, we can
use last_lsn in MyLogicalRepWorker as that seems to be updated correctly.
I think it would be good to avoid global variables.
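To illustrate the shape of that suggestion, here is a minimal sketch of keeping the feedback timestamp next to the other per-worker fields instead of in a file-level static. It assumes simplified integer types; ApplyWorkerState, note_feedback_sent and feedback_is_due are hypothetical names, and the real LogicalRepWorker structure has many more fields.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for the per-worker structure. */
typedef struct ApplyWorkerState
{
	uint64_t	last_lsn;				/* last received position */
	int64_t		last_feedback_time_ms;	/* when feedback was last sent */
} ApplyWorkerState;

/* Record that a feedback message has just been sent. */
static void
note_feedback_sent(ApplyWorkerState *worker, int64_t now_ms)
{
	worker->last_feedback_time_ms = now_ms;
}

/* True once a full status interval has passed since the last feedback. */
static bool
feedback_is_due(const ApplyWorkerState *worker, int64_t now_ms,
				int64_t status_interval_ms)
{
	/* An interval of zero (or less) means periodic feedback is disabled. */
	if (status_interval_ms <= 0)
		return false;

	return now_ms - worker->last_feedback_time_ms >= status_interval_ms;
}
```

With the state passed in explicitly, the delay loop and the normal apply loop would read the same field, and no local variable can shadow it by accident.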
--
With Regards,
Amit Kapila.
Hi,
On 2023-02-11 05:44:47 +0000, Takamichi Osumi (Fujitsu) wrote:
On Saturday, February 11, 2023 11:10 AM Andres Freund <andres@anarazel.de> wrote:
Has there been any discussion about whether this is actually best
implemented on the client side? You could alternatively implement it on the
sender.

That'd have quite a few advantages, I think - you e.g. wouldn't remove the
ability to *receive* and send feedback messages. We'd not end up filling up
the network buffer with data that we'll not process anytime soon.

Thanks for your comments !
We have discussed about the publisher side idea around here [1]
but, we chose the current direction. Kindly have a look at the discussion.

If we apply the delay on the publisher, then
it can lead to extra delay where we don't need to apply.
The current proposed approach can take other loads or factors
(network, busyness of the publisher, etc) into account
because it calculates the required delay on the subscriber.
I don't think it's OK to just lose the ability to read / reply to keepalive
messages.
I think as-is we should seriously consider just rejecting the feature: it adds
too much complexity without corresponding gain.
Greetings,
Andres Freund
At Mon, 13 Feb 2023 15:51:25 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
I think we can introduce a new variable as last_feedback_time in the
LogicalRepWorker structure and probably for the last_received, we can
use last_lsn in MyLogicalRepWorker as that seems to be updated correctly.
I think it would be good to avoid global variables.
MyLogicalRepWorker is a global variable :p, but it is far better than a
bare one.
By the way, we are trying to send the status messages regularly, but
as Andres pointed out, the worker neither reads nor replies to keepalive
messages from the publisher while delaying. That is not possible as long as
we choke the stream at the subscriber end. It doesn't seem to be a
practical problem, but IMHO he's right in terms of adherence
to the wire protocol, which was also one of my own initial concerns.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Hi, Andres-san
On Tuesday, February 14, 2023 1:47 AM Andres Freund <andres@anarazel.de> wrote:
On 2023-02-11 05:44:47 +0000, Takamichi Osumi (Fujitsu) wrote:
On Saturday, February 11, 2023 11:10 AM Andres Freund
<andres@anarazel.de> wrote:
Has there been any discussion about whether this is actually best
implemented on the client side? You could alternatively implement it
on the sender.

That'd have quite a few advantages, I think - you e.g. wouldn't
remove the ability to *receive* and send feedback messages. We'd
not end up filling up the network buffer with data that we'll not process
anytime soon.

Thanks for your comments !
We have discussed about the publisher side idea around here [1] but,
we chose the current direction. Kindly have a look at the discussion.

If we apply the delay on the publisher, then it can lead to extra
delay where we don't need to apply.
The current proposed approach can take other loads or factors
(network, busyness of the publisher, etc) into account because it
calculates the required delay on the subscriber.

I don't think it's OK to just lose the ability to read / reply to keepalive
messages.

I think as-is we should seriously consider just rejecting the feature: it adds
too much complexity without corresponding gain.
Thanks for your comments !
Could you please tell us about your concern a bit more?
The keepalive/reply messages are currently used for two purposes:
(a) to send the updated write/flush/apply locations; (b) to avoid timeouts in case of idle times.
Neither case should be impacted by this time-delayed LR patch, because during the delay there won't
be any progress, and to avoid timeouts we allow sending the keepalive message during the delay.
We would just like to clarify the issue you have in mind.
OTOH, if we want to implement the functionality on publisher-side,
I think we need to first consider the interface.
We can think of two options (a) Have it as a subscription parameter as the patch has now and
then pass it as an option to the publisher which it will use to delay;
(b) Have it defined on publisher-side, say via GUC or some other way.
The basic idea could be that while processing commit record (in DecodeCommit),
we can somehow check the value of delay and then use it there to delay sending the xact.
Also, during delay, we need to somehow send the keepalive and process replies,
probably via a new callback or by some existing callback.
We also need to handle in-progress and 2PC xacts in a similar way.
For the former, probably we would need to apply the delay before sending the first stream.
Could you please share what you feel on this direction as well ?
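For option (a), the publisher would have to validate the delay it receives as an option string. A minimal sketch of such a check follows, assuming the value arrives as a plain decimal number of milliseconds; parse_delay_ms is a hypothetical helper name, not an existing PostgreSQL function.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a delay given as decimal milliseconds into an int32 value.
 * Returns false for anything that is not a plain number in 0 .. INT32_MAX.
 */
static bool
parse_delay_ms(const char *text, int32_t *delay_ms)
{
	char	   *endptr;
	unsigned long parsed;

	/*
	 * strtoul() on its own accepts leading whitespace and a sign, so insist
	 * that the string starts with a digit.
	 */
	if (text == NULL || !isdigit((unsigned char) text[0]))
		return false;

	errno = 0;
	parsed = strtoul(text, &endptr, 10);

	/* Reject overflow of unsigned long and trailing garbage. */
	if (errno != 0 || *endptr != '\0')
		return false;

	/* The value must fit the int32 catalog column. */
	if (parsed > INT32_MAX)
		return false;

	*delay_ms = (int32_t) parsed;
	return true;
}
```

A delay of '8h' would then be passed as "28800000", while inputs such as "-1", "12x" or "2147483648" are rejected.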
Best Regards,
Takamichi Osumi
Dear Andres and other hackers,
OTOH, if we want to implement the functionality on publisher-side,
I think we need to first consider the interface.
We can think of two options (a) Have it as a subscription parameter as the patch
has now and
then pass it as an option to the publisher which it will use to delay;
(b) Have it defined on publisher-side, say via GUC or some other way.
The basic idea could be that while processing commit record (in DecodeCommit),
we can somehow check the value of delay and then use it there to delay sending
the xact.
Also, during delay, we need to somehow send the keepalive and process replies,
probably via a new callback or by some existing callback.
We also need to handle in-progress and 2PC xacts in a similar way.
For the former, probably we would need to apply the delay before sending the first
stream.
Could you please share what you feel on this direction as well ?
I implemented a patch in which the delay is done on the publisher side. In this patch,
approach (a) was chosen, in which min_apply_delay is specified as a subscription
parameter, and then the apply worker passes it to the publisher as an output plugin option.
During the delay, the walsender periodically checks and processes replies from the
apply worker and sends keepalive messages if needed. Therefore, the ability to handle
keepalives is not lost.
To delay the transaction in the output plugin layer, a new LogicalOutputPlugin
API was added. For now, I chose the output plugin layer, but I can consider doing
it from the core if there is a better way.
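The walsender behaviour described above (sleep in slices, process replies, send keepalives) can be pictured with a small simulation. This is only a sketch under stated assumptions: SimClock is a hypothetical stand-in for the real clock and network, time is in plain milliseconds, and simulate_publisher_delay is not a function from the patch.

```c
/* Hypothetical simulated environment: a clock plus event counters. */
typedef struct SimClock
{
	long		now_ms;				/* current simulated time */
	int			keepalives_sent;	/* keepalives sent during the delay */
	int			replies_processed;	/* times subscriber replies were checked */
} SimClock;

/*
 * Delay sending a transaction until commit_time_ms + delay_ms, waking up
 * every keepalive_interval_ms to stay responsive. An interval of zero means
 * one uninterrupted wait.
 */
static void
simulate_publisher_delay(SimClock *clock, long commit_time_ms,
						 long delay_ms, long keepalive_interval_ms)
{
	long		delay_until_ms = commit_time_ms + delay_ms;

	for (;;)
	{
		long		remaining_ms = delay_until_ms - clock->now_ms;
		long		slice_ms;

		/* Delay elapsed: the transaction can be sent now. */
		if (remaining_ms <= 0)
			break;

		/* Handle any replies from the apply worker before sleeping. */
		clock->replies_processed++;

		/* Sleep for the shorter of the interval and the remaining delay. */
		slice_ms = remaining_ms;
		if (keepalive_interval_ms > 0 && keepalive_interval_ms < remaining_ms)
			slice_ms = keepalive_interval_ms;
		clock->now_ms += slice_ms;

		/* Woken up early by the interval: send a keepalive. */
		if (slice_ms < remaining_ms)
			clock->keepalives_sent++;
	}
}
```

With a 25000 ms delay and a 10000 ms interval starting at time 0, the simulation wakes at 10000 and 20000 to send keepalives, checks for replies three times, and finishes at 25000.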
Could you please share your opinion?
Note: thanks to Osumi-san for helping with the implementation.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Attachments:
0001-Time-delayed-logical-replication-on-publisher-side.patch (application/octet-stream)
From 21accecb2fd1a7b2636e4f5dc68b6a618550b207 Mon Sep 17 00:00:00 2001
From: Takamichi Osumi <osumi.takamichi@fujitsu.com>
Date: Wed, 15 Feb 2023 04:08:12 +0000
Subject: [PATCH] Time-delayed logical replication on publisher side
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_apply_delay'.
If the subscription sets min_apply_delay parameter, an apply worker passes the
value to the publisher as an output plugin option. And then, the walsender will
delay the transaction sending for given milliseconds.
The delay does not take into account the overhead of time spent in transferring
the transaction, which means that the arrival time at the subscriber may be
delayed more than the given time.
The delay occurs before we start to send the transaction on the publisher.
Regular and prepared transactions are covered. Streamed transactions are also
covered.
Author: Euler Taveira, Takamichi Osumi, Kuroda Hayato
Reviewed-by: Amit Kapila, Peter Smith, Vignesh C, Shveta Malik,
Kyotaro Horiguchi, Shi Yu, Wang Wei, Dilip Kumar, Melih Mutlu
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/glossary.sgml | 15 ++
doc/src/sgml/logical-replication.sgml | 6 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 34 +++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 70 ++++++-
.../libpqwalreceiver/libpqwalreceiver.c | 5 +
src/backend/replication/logical/logical.c | 27 ++-
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/worker.c | 12 +-
src/backend/replication/pgoutput/pgoutput.c | 41 ++++-
src/backend/replication/slotfuncs.c | 4 +-
src/backend/replication/walsender.c | 79 +++++++-
src/bin/pg_dump/pg_dump.c | 15 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/replication/logical.h | 11 +-
src/include/replication/output_plugin.h | 1 +
src/include/replication/pgoutput.h | 1 +
src/include/replication/walreceiver.h | 1 +
src/test/regress/expected/subscription.out | 171 ++++++++++--------
src/test/regress/sql/subscription.sql | 15 ++
src/test/subscription/t/001_rep_changes.pl | 28 +++
27 files changed, 473 insertions(+), 104 deletions(-)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..5dc5ca1133 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7873,6 +7873,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum delay, in milliseconds, for applying changes
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 7c01a541fe..9ede9d05f6 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1729,6 +1729,21 @@
</glossdef>
</glossentry>
+ <glossentry id="glossary-time-delayed-replication">
+ <glossterm>Time-delayed replication</glossterm>
+ <glossdef>
+ <para>
+ Replication setup that delays the application of changes by a specified
+ minimum time-delay period.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="guc-recovery-min-apply-delay"/> for physical replication
+ and <xref linkend="sql-createsubscription"/> for logical replication.
+ </para>
+ </glossdef>
+ </glossentry>
+
<glossentry id="glossary-toast">
<glossterm>TOAST</glossterm>
<glossdef>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 1bd5660c87..6bd5f61e2b 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,12 @@
target table.
</para>
+ <para>
+ A subscription can delay the application of changes by specifying the
+ <literal>min_apply_delay</literal> subscription parameter. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 964fcbb8ff..8b7eb28e54 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 51c45f17c7..cda8a91aba 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,39 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the publisher sends changes as soon as possible. This
+ parameter allows the user to delay the publisher to send changes by
+ given time period. If the value is specified without units, it is
+ taken as milliseconds. The default is zero (no delay). See
+ <xref linkend="config-setting-names-values"/> for details on the
+ available valid time units.
+ </para>
+ <para>
+ Any delay becomes effective only after all initial table
+ synchronization has finished and occurs before each transaction
+ starts to get applied on the subscriber. The delay does not take into
+ account the overhead of time spent in transferring the transaction,
+ which means that the arrival time at the subscriber may be delayed
+ more than the given time.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication means there is a much longer time between
+ making a change on the publisher, and that change being committed
+ on the subscriber. This can impact the performance of synchronous
+ replication. See <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..e19e5cbca2 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -66,6 +66,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->skiplsn = subform->subskiplsn;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
+ sub->minapplydelay = subform->subminapplydelay;
sub->enabled = subform->subenabled;
sub->binary = subform->subbinary;
sub->stream = subform->substream;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 34ca0e739f..7578b80c07 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1314,9 +1314,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 464db6d247..eef595afcf 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int32 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -100,7 +102,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int32 defGetMinApplyDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ opts->min_apply_delay = defGetMinApplyDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -560,7 +573,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -625,6 +639,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int32GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1054,7 +1069,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1111,6 +1126,13 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int32GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2195,3 +2217,45 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+/*
+ * Extract the min_apply_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_apply_delay
+ * accepts the same parameter format as recovery_min_apply_delay.
+ */
+static int32
+defGetMinApplyDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse given string as parameter which has millisecond unit
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_apply_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check both the lower boundary for the valid min_apply_delay range and
+ * the upper boundary as the safeguard for some platforms where INT_MAX is
+ * wider than int32 respectively. Although parse_int() has confirmed that
+ * the result is less than or equal to INT_MAX, the value will be stored
+ * in a catalog column of int32.
+ */
+ if (result < 0 || result > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_apply_delay",
+ 0, PG_INT32_MAX)));
+
+ return result;
+}
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 560ec974fa..ec0885ba73 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -443,6 +443,11 @@ libpqrcv_startstreaming(WalReceiverConn *conn,
PQserverVersion(conn->streamConn) >= 140000)
appendStringInfoString(&cmd, ", binary 'true'");
+ if (options->proto.logical.min_apply_delay > 0 &&
+ PQserverVersion(conn->streamConn) >= 160000)
+ appendStringInfo(&cmd, ", min_apply_delay '%d'",
+ options->proto.logical.min_apply_delay);
+
appendStringInfoChar(&cmd, ')');
}
else
diff --git a/src/backend/replication/logical/logical.c b/src/backend/replication/logical/logical.c
index c3ec97a0a6..b347940eda 100644
--- a/src/backend/replication/logical/logical.c
+++ b/src/backend/replication/logical/logical.c
@@ -156,7 +156,8 @@ StartupDecodingContext(List *output_plugin_options,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress)
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginDelay delay)
{
ReplicationSlot *slot;
MemoryContext context,
@@ -293,6 +294,7 @@ StartupDecodingContext(List *output_plugin_options,
ctx->prepare_write = prepare_write;
ctx->write = do_write;
ctx->update_progress = update_progress;
+ ctx->delay = delay;
ctx->output_plugin_options = output_plugin_options;
@@ -316,7 +318,7 @@ StartupDecodingContext(List *output_plugin_options,
* marking WAL reserved beforehand. In that scenario, it's up to the
* caller to guarantee that WAL remains available.
* xl_routine -- XLogReaderRoutine for underlying XLogReader
- * prepare_write, do_write, update_progress --
+ * prepare_write, do_write, update_progress, delay --
* callbacks that perform the use-case dependent, actual, work.
*
* Needs to be called while in a memory context that's at least as long lived
@@ -334,7 +336,8 @@ CreateInitDecodingContext(const char *plugin,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress)
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginDelay delay)
{
TransactionId xmin_horizon = InvalidTransactionId;
ReplicationSlot *slot;
@@ -435,7 +438,7 @@ CreateInitDecodingContext(const char *plugin,
ctx = StartupDecodingContext(NIL, restart_lsn, xmin_horizon,
need_full_snapshot, false,
xl_routine, prepare_write, do_write,
- update_progress);
+ update_progress, delay);
/* call output plugin initialization callback */
old_context = MemoryContextSwitchTo(ctx->context);
@@ -475,7 +478,7 @@ CreateInitDecodingContext(const char *plugin,
* xl_routine
* XLogReaderRoutine used by underlying xlogreader
*
- * prepare_write, do_write, update_progress
+ * prepare_write, do_write, update_progress, delay
* callbacks that have to be filled to perform the use-case dependent,
* actual work.
*
@@ -493,7 +496,8 @@ CreateDecodingContext(XLogRecPtr start_lsn,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress)
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginDelay delay)
{
LogicalDecodingContext *ctx;
ReplicationSlot *slot;
@@ -547,7 +551,7 @@ CreateDecodingContext(XLogRecPtr start_lsn,
ctx = StartupDecodingContext(output_plugin_options,
start_lsn, InvalidTransactionId, false,
fast_forward, xl_routine, prepare_write,
- do_write, update_progress);
+ do_write, update_progress, delay);
/* call output plugin initialization callback */
old_context = MemoryContextSwitchTo(ctx->context);
@@ -1922,3 +1926,12 @@ UpdateDecodingStats(LogicalDecodingContext *ctx)
rb->totalTxns = 0;
rb->totalBytes = 0;
}
+
+void
+OutputPluginDelay(struct LogicalDecodingContext *ctx, int32 min_apply_delay)
+{
+ if (!ctx->delay)
+ return;
+
+ ctx->delay(ctx, min_apply_delay);
+}
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index fa1b641a2b..960025197f 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -212,7 +212,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
.segment_open = wal_segment_open,
.segment_close = wal_segment_close),
LogicalOutputPrepareWrite,
- LogicalOutputWrite, NULL);
+ LogicalOutputWrite, NULL, NULL);
/*
* After the sanity checks in CreateDecodingContext, make sure the
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index cfb2ab6248..749536dd5d 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3898,7 +3898,8 @@ maybe_reread_subscription(void)
newsub->stream != MySubscription->stream ||
strcmp(newsub->origin, MySubscription->origin) != 0 ||
newsub->owner != MySubscription->owner ||
- !equal(newsub->publications, MySubscription->publications))
+ !equal(newsub->publications, MySubscription->publications) ||
+ newsub->minapplydelay != MySubscription->minapplydelay)
{
if (am_parallel_apply_worker())
ereport(LOG,
@@ -4617,9 +4618,18 @@ ApplyWorkerMain(Datum main_arg)
options.proto.logical.twophase = false;
options.proto.logical.origin = pstrdup(MySubscription->origin);
+ options.proto.logical.min_apply_delay = 0;
if (!am_tablesync_worker())
{
+ /*
+ * Time-delayed logical replication does not support tablesync
+ * workers, so only the leader apply worker can ask the walsender to
+ * apply the delay on the publisher side.
+ */
+ if (server_version >= 160000 && MySubscription->minapplydelay > 0)
+ options.proto.logical.min_apply_delay = MySubscription->minapplydelay;
+
/*
* Even when the two_phase mode is requested by the user, it remains
* as the tri-state PENDING until all tablesyncs have reached READY
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 98377c094b..300500abcd 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -285,6 +285,7 @@ parse_output_parameters(List *options, PGOutputData *data)
bool streaming_given = false;
bool two_phase_option_given = false;
bool origin_option_given = false;
+ bool min_apply_delay_option_given = false;
data->binary = false;
data->streaming = LOGICALREP_STREAM_OFF;
@@ -396,6 +397,32 @@ parse_output_parameters(List *options, PGOutputData *data)
errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("unrecognized origin value: \"%s\"", data->origin));
}
+ else if (strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ unsigned long parsed;
+ char *endptr;
+
+ if (min_apply_delay_option_given)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("conflicting or redundant options"));
+ min_apply_delay_option_given = true;
+
+ errno = 0;
+ parsed = strtoul(strVal(defel->arg), &endptr, 10);
+ if (errno != 0 || *endptr != '\0')
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid min_apply_delay")));
+
+ if (parsed > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("min_apply_delay \"%s\" out of range",
+ strVal(defel->arg))));
+
+ data->min_apply_delay = (int32) parsed;
+ }
else
elog(ERROR, "unrecognized pgoutput option: %s", defel->defname);
}
@@ -537,7 +564,6 @@ pgoutput_begin_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn)
{
PGOutputTxnData *txndata = MemoryContextAllocZero(ctx->context,
sizeof(PGOutputTxnData));
-
txn->output_plugin_private = txndata;
}
@@ -551,10 +577,13 @@ pgoutput_send_begin(LogicalDecodingContext *ctx, ReorderBufferTXN *txn)
{
bool send_replication_origin = txn->origin_id != InvalidRepOriginId;
PGOutputTxnData *txndata = (PGOutputTxnData *) txn->output_plugin_private;
+ PGOutputData *data = (PGOutputData *) ctx->output_plugin_private;
Assert(txndata);
Assert(!txndata->sent_begin_txn);
+ OutputPluginDelay(ctx, data->min_apply_delay);
+
OutputPluginPrepareWrite(ctx, !send_replication_origin);
logicalrep_write_begin(ctx->out, txn);
txndata->sent_begin_txn = true;
@@ -604,7 +633,9 @@ static void
pgoutput_begin_prepare_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn)
{
bool send_replication_origin = txn->origin_id != InvalidRepOriginId;
+ PGOutputData *data = (PGOutputData *) ctx->output_plugin_private;
+ OutputPluginDelay(ctx, data->min_apply_delay);
OutputPluginPrepareWrite(ctx, !send_replication_origin);
logicalrep_write_begin_prepare(ctx->out, txn);
@@ -1810,9 +1841,17 @@ pgoutput_stream_start(struct LogicalDecodingContext *ctx,
/*
* If we already sent the first stream for this transaction then don't
* send the origin id in the subsequent streams.
+ *
+ * Otherwise, delay before sending the first stream of this transaction.
*/
if (rbtxn_is_streamed(txn))
send_replication_origin = false;
+ else
+ {
+ PGOutputData *data = (PGOutputData *) ctx->output_plugin_private;
+
+ OutputPluginDelay(ctx, data->min_apply_delay);
+ }
OutputPluginPrepareWrite(ctx, !send_replication_origin);
logicalrep_write_stream_start(ctx->out, txn->xid, !rbtxn_is_streamed(txn));
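Reviewer's note on the min_apply_delay parsing in parse_output_parameters() above: the value arrives as a decimal string of milliseconds and has to fit in an int32. A standalone sketch of the same strtoul-based check follows. The function name sketch_parse_min_apply_delay is hypothetical, it returns false where the patch raises ERROR, and it additionally rejects an empty string, which is my assumption and not something the hunk does.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a decimal millisecond count into *result.
 * Returns false on trailing garbage, overflow, or a value above INT32_MAX.
 */
static bool
sketch_parse_min_apply_delay(const char *arg, int32_t *result)
{
	unsigned long parsed;
	char	   *endptr;

	assert(arg != NULL && result != NULL);

	errno = 0;
	parsed = strtoul(arg, &endptr, 10);

	/* endptr == arg means no digits were consumed (extra check, assumption) */
	if (errno != 0 || endptr == arg || *endptr != '\0')
		return false;

	/* same upper bound the patch enforces with PG_INT32_MAX */
	if (parsed > INT32_MAX)
		return false;

	*result = (int32_t) parsed;
	return true;
}
```

Note that a unit suffix such as '8h' is rejected here: in the patch the subscriber side converts the interval to milliseconds before it is sent as an output plugin option.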
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 2f3c964824..522f7600a1 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -148,7 +148,7 @@ create_logical_replication_slot(char *name, char *plugin,
XL_ROUTINE(.page_read = read_local_xlog_page,
.segment_open = wal_segment_open,
.segment_close = wal_segment_close),
- NULL, NULL, NULL);
+ NULL, NULL, NULL, NULL);
/*
* If caller needs us to determine the decoding start point, do so now.
@@ -481,7 +481,7 @@ pg_logical_replication_slot_advance(XLogRecPtr moveto)
XL_ROUTINE(.page_read = read_local_xlog_page,
.segment_open = wal_segment_open,
.segment_close = wal_segment_close),
- NULL, NULL, NULL);
+ NULL, NULL, NULL, NULL);
/*
* Start reading at the slot's restart_lsn, which we know to point to
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 75e8363e24..1bad03d91a 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -252,6 +252,7 @@ static void WalSndPrepareWrite(LogicalDecodingContext *ctx, XLogRecPtr lsn, Tran
static void WalSndWriteData(LogicalDecodingContext *ctx, XLogRecPtr lsn, TransactionId xid, bool last_write);
static void WalSndUpdateProgress(LogicalDecodingContext *ctx, XLogRecPtr lsn, TransactionId xid,
bool skipped_xact);
+static void WalSndDelay(LogicalDecodingContext *ctx, int32 min_apply_delay);
static XLogRecPtr WalSndWaitForWal(XLogRecPtr loc);
static void LagTrackerWrite(XLogRecPtr lsn, TimestampTz local_flush_time);
static TimeOffset LagTrackerRead(int head, XLogRecPtr lsn, TimestampTz now);
@@ -1126,7 +1127,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
.segment_open = WalSndSegmentOpen,
.segment_close = wal_segment_close),
WalSndPrepareWrite, WalSndWriteData,
- WalSndUpdateProgress);
+ WalSndUpdateProgress, WalSndDelay);
/*
* Signal that we don't need the timeout mechanism. We're just
@@ -1285,7 +1286,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
.segment_open = WalSndSegmentOpen,
.segment_close = wal_segment_close),
WalSndPrepareWrite, WalSndWriteData,
- WalSndUpdateProgress);
+ WalSndUpdateProgress, WalSndDelay);
xlogreader = logical_decoding_ctx->reader;
WalSndSetState(WALSNDSTATE_CATCHUP);
@@ -3849,3 +3850,77 @@ LagTrackerRead(int head, XLogRecPtr lsn, TimestampTz now)
Assert(time != 0);
return now - time;
}
+
+/*
+ * LogicalDecodingContext 'delay' callback.
+ *
+ * Wait long enough to make sure a transaction is applied at least
+ * min_apply_delay milliseconds behind the publisher.
+ */
+static void
+WalSndDelay(LogicalDecodingContext *ctx, int32 min_apply_delay)
+{
+ TimestampTz delay_start = GetCurrentTimestamp();
+
+ /* Wait until delayUntil, using the latch mechanism */
+ while (true)
+ {
+ TimestampTz delayUntil;
+ long diffms;
+ long timeout_interval_ms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* This might change wal_sender_timeout */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /* Check for input from the client */
+ ProcessRepliesIfAny();
+
+ /* die if timeout was reached */
+ WalSndCheckTimeOut();
+
+ /* Send keepalive if the time has come */
+ WalSndKeepaliveIfNecessary();
+
+ /* Try to flush pending output to the client */
+ if (pq_flush_if_writable() != 0)
+ WalSndShutdown();
+
+ /*
+ * If we've been requested to shut down, exit the process.
+ *
+ * Note that WalSndDone() cannot be used here because it would send
+ * the changes that are still being delayed.
+ */
+ if (got_STOPPING)
+ WalSndShutdown();
+
+ delayUntil = TimestampTzPlusMilliseconds(delay_start, min_apply_delay);
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delayUntil);
+
+ /*
+ * Exit without arming the latch if it's already past time to send
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ /* Sleep until appropriate time. */
+ timeout_interval_ms = WalSndComputeSleeptime(GetCurrentTimestamp());
+
+ elog(DEBUG2, "time-delayed replication for txid %u, delay_time = %d ms, remaining wait time: %ld ms",
+ ctx->write_xid, (int) min_apply_delay, diffms);
+
+ /* Sleep until we get a reply from the apply worker or we time out */
+ WalSndWait(WL_SOCKET_READABLE,
+ Min(timeout_interval_ms, diffms),
+ WAIT_EVENT_WAL_SENDER_WRITE_DATA);
+ }
+}
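Reviewer's note on WalSndDelay() above: each loop iteration computes how much of the delay is left and then sleeps for the smaller of that and the walsender's regular wakeup interval, so keepalives and timeout checks keep running during a long delay. The arithmetic can be sketched on plain millisecond counters as below. The function names are hypothetical, and timeout_interval_ms stands in for whatever WalSndComputeSleeptime() returns.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Milliseconds still to wait before the transaction may be sent.
 * Clamped at zero, like TimestampDifferenceMilliseconds().
 */
static int64_t
sketch_remaining_delay_ms(int64_t delay_start_ms, int64_t now_ms,
						  int32_t min_apply_delay_ms)
{
	int64_t		delay_until_ms;

	assert(min_apply_delay_ms >= 0);

	delay_until_ms = delay_start_ms + min_apply_delay_ms;
	return (now_ms >= delay_until_ms) ? 0 : delay_until_ms - now_ms;
}

/* Sleep length for one iteration: never longer than the wakeup interval. */
static int64_t
sketch_next_sleep_ms(int64_t remaining_ms, int64_t timeout_interval_ms)
{
	return (remaining_ms < timeout_interval_ms) ? remaining_ms
		: timeout_interval_ms;
}
```

A remaining time of zero corresponds to the `diffms <= 0` exit in the loop; anything else leads to another bounded sleep.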
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..1e87f0124e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4546,9 +4547,13 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n"
+ " 0 AS subminapplydelay\n",
+ LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4581,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4612,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminapplydelay =
+ atoi(PQgetvalue(res, i, i_subminapplydelay));
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4687,6 +4695,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '%d ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e7cbd8d7ed..b8831c3ed3 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int subminapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..81d4607a1c 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min apply delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e1882eaea..e8b9a43a47 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..d1cfefc6d6 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -74,6 +74,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
+ int32 subminapplydelay; /* Replication apply delay (ms) */
+
bool subenabled; /* True if the subscription is enabled (the
* worker should be running) */
@@ -122,6 +124,7 @@ typedef struct Subscription
* skipped */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
+ int32 minapplydelay; /* Replication apply delay (ms) */
bool enabled; /* Indicates if the subscription is enabled */
bool binary; /* Indicates if the subscription wants data in
* binary format */
diff --git a/src/include/replication/logical.h b/src/include/replication/logical.h
index 5f49554ea0..9470af1fda 100644
--- a/src/include/replication/logical.h
+++ b/src/include/replication/logical.h
@@ -30,6 +30,10 @@ typedef void (*LogicalOutputPluginWriterUpdateProgress) (struct LogicalDecodingC
bool skipped_xact
);
+typedef void (*LogicalOutputPluginDelay) (struct LogicalDecodingContext *lr,
+ int32 min_apply_delay
+);
+
typedef struct LogicalDecodingContext
{
/* memory context this is all allocated in */
@@ -64,6 +68,7 @@ typedef struct LogicalDecodingContext
LogicalOutputPluginWriterPrepareWrite prepare_write;
LogicalOutputPluginWriterWrite write;
LogicalOutputPluginWriterUpdateProgress update_progress;
+ LogicalOutputPluginDelay delay;
/*
* Output buffer.
@@ -121,14 +126,16 @@ extern LogicalDecodingContext *CreateInitDecodingContext(const char *plugin,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress);
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginDelay delay);
extern LogicalDecodingContext *CreateDecodingContext(XLogRecPtr start_lsn,
List *output_plugin_options,
bool fast_forward,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress);
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginDelay delay);
extern void DecodingContextFindStartpoint(LogicalDecodingContext *ctx);
extern bool DecodingContextReady(LogicalDecodingContext *ctx);
extern void FreeDecodingContext(LogicalDecodingContext *ctx);
diff --git a/src/include/replication/output_plugin.h b/src/include/replication/output_plugin.h
index 2d89d26586..a65ff2d241 100644
--- a/src/include/replication/output_plugin.h
+++ b/src/include/replication/output_plugin.h
@@ -246,5 +246,6 @@ typedef struct OutputPluginCallbacks
extern void OutputPluginPrepareWrite(struct LogicalDecodingContext *ctx, bool last_write);
extern void OutputPluginWrite(struct LogicalDecodingContext *ctx, bool last_write);
extern void OutputPluginUpdateProgress(struct LogicalDecodingContext *ctx, bool skipped_xact);
+extern void OutputPluginDelay(struct LogicalDecodingContext *ctx, int32 min_apply_delay);
#endif /* OUTPUT_PLUGIN_H */
diff --git a/src/include/replication/pgoutput.h b/src/include/replication/pgoutput.h
index b4a8015403..dc9a70f95e 100644
--- a/src/include/replication/pgoutput.h
+++ b/src/include/replication/pgoutput.h
@@ -30,6 +30,7 @@ typedef struct PGOutputData
bool messages;
bool two_phase;
char *origin;
+ int32 min_apply_delay;
} PGOutputData;
#endif /* PGOUTPUT_H */
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index decffe352d..464cad0e3b 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -187,6 +187,7 @@ typedef struct
* prepare time */
char *origin; /* Only publish data originating from the
* specified origin */
+ int32 min_apply_delay; /* The minimum apply delay */
} logical;
} proto;
} WalRcvStreamOptions;
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 3f99b14394..2594395c73 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -143,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -163,10 +163,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -175,10 +175,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -210,10 +210,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -247,19 +247,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -271,27 +271,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -306,10 +306,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -324,10 +324,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -363,10 +363,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -375,10 +375,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,10 +388,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -404,18 +404,45 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid value for parameter "min_apply_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_apply_delay" (0 .. 2147483647)
+-- success -- min_apply_delay value without unit is take as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexit | 0/0
+(1 row)
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexit | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7281f5fee2..2396fbeedb 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -286,6 +286,21 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- success -- min_apply_delay value without unit is take as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+\dRs+
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index 91aa068c95..75fd77b891 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -515,6 +515,34 @@ $node_publisher->poll_query_until('postgres',
or die
"Timed out while waiting for apply to restart after renaming SUBSCRIPTION";
+# Test time-delayed logical replication
+#
+# If the subscription sets min_apply_delay parameter, the logical replication
+# worker will delay the transaction apply for min_apply_delay milliseconds. We
+# verify this by looking at the time difference between a) when tuples are
+# inserted on the publisher, and b) when those changes are replicated on the
+# subscriber. Even on slow machines, this strategy will give predictable behavior.
+
+# Set min_apply_delay parameter to 3 seconds
+my $delay = 3;
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_renamed SET (min_apply_delay = '${delay}s')");
+
+# Before doing the insertion, get the current timestamp that will be
+# used as a comparison base.
+my $publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_ins VALUES (generate_series(1101, 1120))");
+
+# The publisher waits for the replication to complete
+$node_publisher->wait_for_catchup('tap_sub_renamed');
+
+# This test is successful if and only if the LSN has been applied with at least
+# the configured apply delay.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for non-streaming transaction"
+);
+
# check all the cleanup
$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_renamed");
--
2.27.0
At Wed, 15 Feb 2023 11:29:18 +0000, "Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com> wrote in
Dear Andres and other hackers,
OTOH, if we want to implement the functionality on publisher-side,
I think we need to first consider the interface.
We can think of two options (a) Have it as a subscription parameter as the patch
has now and
then pass it as an option to the publisher which it will use to delay;
(b) Have it defined on publisher-side, say via GUC or some other way.
The basic idea could be that while processing commit record (in DecodeCommit),
we can somehow check the value of delay and then use it there to delay sending
the xact.
Also, during delay, we need to somehow send the keepalive and process replies,
probably via a new callback or by some existing callback.
We also need to handle in-progress and 2PC xacts in a similar way.
For the former, probably we would need to apply the delay before sending the first
stream.
Could you please share what you feel on this direction as well?
I implemented a patch in which the delay is done on the publisher side. In this patch,
approach (a) was chosen, in which min_apply_delay is specified as a subscription
parameter, and then apply worker passes it to the publisher as an output plugin option.
As Amit-K mentioned, we may need to change the name of the option in
this version, since the delay mechanism in this version causes a delay
in sending from the publisher rather than delaying apply on the subscriber side.
I'm not sure why output plugin is involved in the delay mechanism. It
appears to me that it would be simpler if the delay occurred in
reorder buffer or logical decoder instead. Perhaps what I understand
correctly is that we could delay right before only sending commit
records in this case. If we delay at publisher end, all changes will
be sent at once if !streaming, and otherwise, all changes in a
transaction will be spooled at subscriber end. In any case, apply
worker won't be holding an active transaction unnecessarily. Of
course we need add the mechanism to process keep-alive and status
report messages.
During the delay, the walsender periodically checks and processes replies from the
apply worker and sends keepalive messages if needed. Therefore, the ability to handle
keepalives is not lost.
My understanding is that keep-alives are a different mechanism with
a different objective from status reports. Even if the subscriber doesn't
send any spontaneous or extra status reports at all, the connection can be
checked and maintained by keep-alive packets. It is possible to setup
an asymmetric configuration where only walsender sends keep-alives,
but none are sent from the peer. Those setups work fine when no
apply-delay involved, but they won't work with the patches we're
talking about because the subscriber won't respond to the keep-alive
packets from the peer. So when I wrote "practically works" in the
last mail, this is what I meant.
Thus if someone plans to enable apply_delay for logical replication,
that person should be aware of some additional subtle restrictions that
are required compared to a non-delayed setup.
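To make the keep-alive concern concrete, here is a minimal sketch in C of a delay loop that sleeps in short slices and services the connection after each slice, so the peer keeps getting replies while the delay is in progress. Everything here is an assumption made for illustration: DelayHooks, delay_until, and the simulated clock are hypothetical names, not the WalSndDelay code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical hooks. In a real walsender these would read the clock, sleep
 * on a latch, and exchange keep-alive/reply messages with the peer.
 */
typedef struct DelayHooks
{
    int64_t (*now_ms) (void *arg);
    void    (*sleep_ms) (void *arg, int64_t ms);
    void    (*service_connection) (void *arg);
    void    *arg;
} DelayHooks;

/*
 * Wait until due_ms, never sleeping longer than slice_ms at a time, and
 * service the connection after every slice. Returns the number of slices.
 */
int
delay_until(const DelayHooks *hooks, int64_t due_ms, int64_t slice_ms)
{
    int slices = 0;

    assert(slice_ms > 0);

    for (;;)
    {
        int64_t remaining = due_ms - hooks->now_ms(hooks->arg);

        if (remaining <= 0)
            break;
        hooks->sleep_ms(hooks->arg, remaining < slice_ms ? remaining : slice_ms);
        hooks->service_connection(hooks->arg);
        slices++;
    }
    return slices;
}

/* Simulated clock so the loop can be exercised without really sleeping. */
typedef struct FakeClock
{
    int64_t now;
    int     serviced;
} FakeClock;

int64_t fake_now(void *arg) { return ((FakeClock *) arg)->now; }
void fake_sleep(void *arg, int64_t ms) { ((FakeClock *) arg)->now += ms; }
void fake_service(void *arg) { ((FakeClock *) arg)->serviced++; }

/* Run one delay on the simulated clock; returns how often we serviced. */
int
simulate_delay(int64_t delay_ms, int64_t slice_ms)
{
    FakeClock  clock = {0, 0};
    DelayHooks hooks = {fake_now, fake_sleep, fake_service, &clock};

    delay_until(&hooks, delay_ms, slice_ms);
    return clock.serviced;
}
```

With a 3000 ms delay and 1000 ms slices, the connection is serviced three times before the delay ends. Where the servicing happens (walsender or apply worker) is exactly the design question being discussed in this thread.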
To delay the transaction in the output plugin layer, the new LogicalOutputPlugin
API was added. For now, I chose the output plugin layer, but I can consider doing
it in the core if there is a better way.
Could you please share your opinion?
Note: thanks to Osumi-san for helping with the implementation.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Dear Horiguchi-san,
Thank you for responding! Before modifying patches, I want to confirm something
you said.
As Amit-K mentioned, we may need to change the name of the option in
this version, since the delay mechanism in this version causes a delay
in sending from the publisher rather than delaying apply on the subscriber side.
Right, will be changed.
I'm not sure why output plugin is involved in the delay mechanism. It
appears to me that it would be simpler if the delay occurred in
reorder buffer or logical decoder instead.
I'm planning to change, but..
Perhaps what I understand
correctly is that we could delay right before only sending commit
records in this case. If we delay at publisher end, all changes will
be sent at once if !streaming, and otherwise, all changes in a
transaction will be spooled at subscriber end. In any case, apply
worker won't be holding an active transaction unnecessarily.
What about the parallel case? The latest patch does not reject the combination of parallel
streaming mode and delay. If the delay is done at commit and the subscriber uses a parallel
apply worker, it may hold locks for a long time.
Of
course we need add the mechanism to process keep-alive and status
report messages.
Could you share a good way to handle keep-alive and status messages, if you have one?
If we move to the decoding layer, it would be strange to call a walsender function
directly.
Those setups work fine when no
apply-delay involved, but they won't work with the patches we're
talking about because the subscriber won't respond to the keep-alive
packets from the peer. So when I wrote "practically works" in the
last mail, this is what I meant.
I'm not sure about this part. I think in the latest patch, subscriber can respond
to the keepalive packets from the peer. Also, publisher can respond to the peer.
Could you please tell me if you know a case that publisher or subscriber cannot
respond to the opposite side? Note that if we apply the publisher-side patch, we
don't have to apply subscriber-side patch.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
At Thu, 16 Feb 2023 06:20:23 +0000, "Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com> wrote in
Dear Horiguchi-san,
Thank you for responding! Before modifying patches, I want to confirm something
you said.
As Amit-K mentioned, we may need to change the name of the option in
this version, since the delay mechanism in this version causes a delay
in sending from the publisher rather than delaying apply on the subscriber side.
Right, will be changed.
I'm not sure why output plugin is involved in the delay mechanism. It
appears to me that it would be simpler if the delay occurred in
reorder buffer or logical decoder instead.
I'm planning to change, but..
Yeah, I don't think we've made up our minds about which way to go yet,
so it's a bit too early to work on that.
Perhaps what I understand
correctly is that we could delay right before only sending commit
records in this case. If we delay at publisher end, all changes will
be sent at once if !streaming, and otherwise, all changes in a
transaction will be spooled at subscriber end. In any case, apply
worker won't be holding an active transaction unnecessarily.
What about the parallel case? The latest patch does not reject the combination of parallel
streaming mode and delay. If the delay is done at commit and the subscriber uses a parallel
apply worker, it may hold locks for a long time.
I didn't look too closely, but my guess is that transactions are
conveyed in spool files in parallel mode, with each file storing a
complete transaction.
Of
course we need add the mechanism to process keep-alive and status
report messages.
Could you share a good way to handle keep-alive and status messages, if you have one?
If we move to the decoding layer, it would be strange to call a walsender function
directly.
I'm sorry, but I don't have a concrete idea at the moment. When I read
through the last patch, I missed that WalSndDelay is actually a subset
of WalSndLoop. Although it can handle keep-alives correctly, I'm not
sure we can accept that structure..
Those setups work fine when no
apply-delay involved, but they won't work with the patches we're
talking about because the subscriber won't respond to the keep-alive
packets from the peer. So when I wrote "practically works" in the
last mail, this is what I meant.
I'm not sure about this part. I think in the latest patch, subscriber can respond
to the keepalive packets from the peer. Also, publisher can respond to the peer.
Could you please tell me if you know a case that publisher or subscriber cannot
respond to the opposite side? Note that if we apply the publisher-side patch, we
don't have to apply subscriber-side patch.
Sorry about that again, I missed that part in the last patch as
mentioned earlier..
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Thu, Feb 16, 2023 at 2:25 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Thu, 16 Feb 2023 06:20:23 +0000, "Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com> wrote in
Dear Horiguchi-san,
Thank you for responding! Before modifying patches, I want to confirm something
you said.
As Amit-K mentioned, we may need to change the name of the option in
this version, since the delay mechanism in this version causes a delay
in sending from the publisher rather than delaying apply on the subscriber side.
Right, will be changed.
I'm not sure why output plugin is involved in the delay mechanism. It
appears to me that it would be simpler if the delay occurred in
reorder buffer or logical decoder instead.
I'm planning to change, but..
Yeah, I don't think we've made up our minds about which way to go yet,
so it's a bit too early to work on that.
Perhaps what I understand
correctly is that we could delay right before only sending commit
records in this case. If we delay at publisher end, all changes will
be sent at once if !streaming, and otherwise, all changes in a
transaction will be spooled at subscriber end. In any case, apply
worker won't be holding an active transaction unnecessarily.
What about the parallel case? The latest patch does not reject the combination of parallel
streaming mode and delay. If the delay is done at commit and the subscriber uses a parallel
apply worker, it may hold locks for a long time.
I didn't look too closely, but my guess is that transactions are
conveyed in spool files in parallel mode, with each file storing a
complete transaction.
No, we don't try to collect all the data in files for parallel mode.
Having said that, it doesn't matter because we won't know the time of
the commit (which is used to compute delay) before we encounter the
commit record in WAL. So, I feel for this approach, we can follow what
you said.
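As an illustration of the point above (the delay is computed from the commit time, which is only known once the commit record is reached), here is a minimal sketch in C. The function name remaining_delay_ms and the plain millisecond timestamps are assumptions made for this example; they are not taken from the patch, which works with PostgreSQL's own timestamp types.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not from the patch: how many milliseconds are still
 * left before a transaction that committed at commit_time_ms may be sent,
 * given the current time and the configured minimum delay.
 */
int64_t
remaining_delay_ms(int64_t commit_time_ms, int64_t now_ms, int32_t min_delay_ms)
{
    int64_t due_ms;

    /* The regression tests in the patch reject negative delays. */
    assert(min_delay_ms >= 0);

    due_ms = commit_time_ms + min_delay_ms;

    /* If decoding is already late, no further wait is needed. */
    return (due_ms > now_ms) ? due_ms - now_ms : 0;
}
```

Because the result depends on the commit time, nothing can be computed while only the first changes of a still-running transaction have been seen, which is why the delay can only be applied once the commit record is decoded.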
Of
course we need add the mechanism to process keep-alive and status
report messages.
Could you share a good way to handle keep-alive and status messages, if you have one?
If we move to the decoding layer, it would be strange to call a walsender function
directly.
I'm sorry, but I don't have a concrete idea at the moment. When I read
through the last patch, I missed that WalSndDelay is actually a subset
of WalSndLoop. Although it can handle keep-alives correctly, I'm not
sure we can accept that structure..
I think we can use update_progress_txn() for this purpose but note
that we are discussing changing it in thread [1].
[1]: /messages/by-id/20230210210423.r26ndnfmuifie4f6@awork3.anarazel.de
--
With Regards,
Amit Kapila.
Hi,
On 2023-02-16 14:21:01 +0900, Kyotaro Horiguchi wrote:
I'm not sure why output plugin is involved in the delay mechanism.
+many
The output plugin absolutely never should be involved in something like
this. It was a grave mistake that OutputPluginUpdateProgress() calls were
added to the commit callback and then proliferated.
It appears to me that it would be simpler if the delay occurred in reorder
buffer or logical decoder instead.
This is a feature specific to walsender. So the triggering logic should either
directly live in the walsender, or in a callback set in
LogicalDecodingContext. That could be called from decode.c or such.
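A rough sketch in C of the second option (a callback stored in the decoding context and invoked from the commit path) might look like the following. All names here (DecodingContext, delay_send, decode_commit, walsender_delay_send) are hypothetical stand-ins chosen for this example; they are not the actual LogicalDecodingContext fields or decode.c functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct DecodingContext DecodingContext;

/* Callback type: the owner of the context decides how to delay. */
typedef void (*DelaySendCB) (DecodingContext *ctx, int64_t commit_time_ms);

struct DecodingContext
{
    DelaySendCB delay_send;     /* NULL when nobody asked for a delay */
    int         delay_calls;    /* bookkeeping for this demo only */
    int64_t     last_commit_ms; /* bookkeeping for this demo only */
};

/* Stand-in for commit handling in the decoder: it only triggers the hook. */
void
decode_commit(DecodingContext *ctx, int64_t commit_time_ms)
{
    if (ctx->delay_send != NULL)
        ctx->delay_send(ctx, commit_time_ms);
    /* ... the transaction would be handed on for sending here ... */
}

/* Walsender-owned callback; a real one would wait, this one just records. */
void
walsender_delay_send(DecodingContext *ctx, int64_t commit_time_ms)
{
    ctx->delay_calls++;
    ctx->last_commit_ms = commit_time_ms;
}

/* Decode two commits with or without the callback installed. */
int
count_delay_calls(int with_walsender)
{
    DecodingContext ctx = {NULL, 0, 0};

    if (with_walsender)
        ctx.delay_send = walsender_delay_send;

    decode_commit(&ctx, 1000);
    decode_commit(&ctx, 2000);
    return ctx.delay_calls;
}
```

The point of the indirection is that the decoder knows nothing about how the waiting is done; callers that are not a walsender simply leave the callback unset and never delay.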
Greetings,
Andres Freund
Dear Horiguchi-san,
Thank you for replying! This direction seems OK, so I started to revise the patch.
PSA new version.
As Amit-K mentioned, we may need to change the name of the option in
this version, since the delay mechanism in this version causes a delay
in sending from the publisher rather than delaying apply on the subscriber side.
Right, will be changed.
I'm not sure why output plugin is involved in the delay mechanism. It
appears to me that it would be simpler if the delay occurred in
reorder buffer or logical decoder instead.
I'm planning to change, but..
Yeah, I don't think we've made up our minds about which way to go yet,
so it's a bit too early to work on that.
The parameter name is changed to min_send_delay.
And the delaying spot is changed to logical decoder.
Perhaps what I understand
correctly is that we could delay right before only sending commit
records in this case. If we delay at publisher end, all changes will
be sent at once if !streaming, and otherwise, all changes in a
transaction will be spooled at subscriber end. In any case, apply
worker won't be holding an active transaction unnecessarily.
What about the parallel case? The latest patch does not reject the combination of parallel
streaming mode and delay. If the delay is done at commit and the subscriber uses a parallel
apply worker, it may hold locks for a long time.
I didn't look too closely, but my guess is that transactions are
conveyed in spool files in parallel mode, with each file storing a
complete transaction.
Based on the advice, I moved the delaying to DecodeCommit().
And the combination of parallel streaming mode and min_send_delay is
rejected again.
Of
course we need add the mechanism to process keep-alive and status
report messages.
Could you share a good way to handle keep-alive and status messages, if you have one?
If we move to the decoding layer, it would be strange to call a walsender function
directly.
I'm sorry, but I don't have a concrete idea at the moment. When I read
through the last patch, I missed that WalSndDelay is actually a subset
of WalSndLoop. Although it can handle keep-alives correctly, I'm not
sure we can accept that structure..
No issues. I have kept the current implementation.
Some bugs I found are also fixed.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Attachments:
v2-0001-Time-delayed-logical-replication-on-publisher-sid.patch (application/octet-stream)
From 2d119a5d5210ad3665e77646cff96e51e4c1e956 Mon Sep 17 00:00:00 2001
From: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Date: Thu, 16 Feb 2023 07:52:23 +0000
Subject: [PATCH v2] Time-delayed logical replication on publisher side
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_send_delay'.
If the subscription sets min_send_delay parameter, an apply worker passes the
value to the publisher as an output plugin option. And then, the walsender will
delay the transaction sending for given milliseconds.
The delay does not take into account the overhead of time spent in transferring
the transaction, which means that the arrival time at the subscriber may be
delayed more than the given time.
The combination of parallel streaming mode and min_send_delay is not allowed.
This is because in parallel streaming mode, we start applying the transaction
stream as soon as the first change arrives without knowing the transaction's
prepare/commit time. Always waiting for the full 'min_send_delay' period might
include unnecessary delay.
The other possibility was to apply the delay at the end of the parallel apply
transaction but that would cause issues related to resource bloat and locks being
held for a long time.
The delay occurs before we start to send the transaction on the publisher.
Regular and prepared transactions are covered. Streamed transactions are also
covered.
Author: Euler Taveira, Takamichi Osumi, Kuroda Hayato
Reviewed-by: Amit Kapila, Peter Smith, Vignesh C, Shveta Malik,
Kyotaro Horiguchi, Shi Yu, Wang Wei, Dilip Kumar, Melih Mutlu,
Andres Freund
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/glossary.sgml | 15 ++
doc/src/sgml/logical-replication.sgml | 6 +
doc/src/sgml/monitoring.sgml | 5 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 39 +++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 122 +++++++++++-
.../libpqwalreceiver/libpqwalreceiver.c | 5 +
src/backend/replication/logical/decode.c | 4 +
src/backend/replication/logical/logical.c | 18 +-
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/worker.c | 12 +-
src/backend/replication/pgoutput/pgoutput.c | 36 ++++
src/backend/replication/slotfuncs.c | 4 +-
src/backend/replication/walsender.c | 77 +++++++-
src/backend/utils/activity/wait_event.c | 3 +
src/bin/pg_dump/pg_dump.c | 15 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/replication/logical.h | 13 +-
src/include/replication/pgoutput.h | 1 +
src/include/replication/walreceiver.h | 1 +
src/include/utils/wait_event.h | 3 +-
src/test/regress/expected/subscription.out | 174 ++++++++++--------
src/test/regress/sql/subscription.sql | 18 ++
src/test/subscription/t/001_rep_changes.pl | 28 +++
30 files changed, 536 insertions(+), 104 deletions(-)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..3c013f976a 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7873,6 +7873,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminsenddelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum delay for publisher sends data, in milliseconds
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 7c01a541fe..9ede9d05f6 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1729,6 +1729,21 @@
</glossdef>
</glossentry>
+ <glossentry id="glossary-time-delayed-replication">
+ <glossterm>Time-delayed replication</glossterm>
+ <glossdef>
+ <para>
+ Replication setup that delays the application of changes by a specified
+ minimum time-delay period.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="guc-recovery-min-apply-delay"/> for physical replication
+ and <xref linkend="sql-createsubscription"/> for logical replication.
+ </para>
+ </glossdef>
+ </glossentry>
+
<glossentry id="glossary-toast">
<glossterm>TOAST</glossterm>
<glossdef>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 1bd5660c87..8bca0b3800 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,12 @@
target table.
</para>
+ <para>
+ The publisher can be made to delay sending changes to a subscription by
+ specifying the <literal>min_send_delay</literal> subscription parameter. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index b0b997f092..e75525c5b8 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -2349,6 +2349,11 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
<entry>Waiting to acquire an exclusive lock to truncate off any
empty pages at the end of a table vacuumed.</entry>
</row>
+ <row>
+ <entry><literal>WalSenderSendDelay</literal></entry>
+ <entry>Waiting in a WAL sender process for the send delay to elapse
+ before sending changes to the subscriber.</entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 964fcbb8ff..3f238b958b 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_send_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 51c45f17c7..4a665c8d07 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,39 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_send_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the publisher sends changes as soon as possible. This
+ parameter lets the user make the publisher delay sending changes by
+ a given time period. If the value is specified without units, it is
+ taken as milliseconds. The default is zero (no delay). See
+ <xref linkend="config-setting-names-values"/> for details on the
+ valid time units.
+ </para>
+ <para>
+ Any delay becomes effective only after all initial table
+ synchronization has finished and occurs before each transaction
+ starts to get applied on the subscriber. The delay does not take into
+ account the overhead of time spent in transferring the transaction,
+ which means that the arrival time at the subscriber may be delayed
+ more than the given time.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication means there is a much longer time between
+ making a change on the publisher, and that change being committed
+ on the subscriber. This can impact the performance of synchronous
+ replication. See the <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -420,6 +452,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_send_delay</literal> parameter is not allowed when
+ streaming in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..63a10b06d1 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -66,6 +66,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->skiplsn = subform->subskiplsn;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
+ sub->minsenddelay = subform->subminsenddelay;
sub->enabled = subform->subenabled;
sub->binary = subform->subbinary;
sub->stream = subform->substream;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 34ca0e739f..6b7b741a1e 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1314,9 +1314,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subminsenddelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 464db6d247..fca23ae7e1 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_SEND_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int32 min_send_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -100,7 +102,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int32 defGetMinSendDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_SEND_DELAY))
+ opts->min_send_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_SEND_DELAY) &&
+ strcmp(defel->defname, "min_send_delay") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_SEND_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_SEND_DELAY;
+ opts->min_send_delay = defGetMinSendDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +417,32 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_send_delay is not
+ * allowed. This is because in parallel streaming mode, we start applying
+ * the transaction stream as soon as the first change arrives without
+ * knowing the transaction's prepare/commit time. Always waiting for the
+ * full 'min_send_delay' period might include unnecessary delay.
+ *
+ * The other possibility was to apply the delay at the end of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_SEND_DELAY) &&
+ opts->min_send_delay > 0 &&
+ opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+
+ /*
+ * translator: the first %s is a string of the form "parameter > 0"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_send_delay > 0", "streaming = parallel"));
+
+
}
/*
@@ -560,7 +599,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_SEND_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -625,6 +665,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminsenddelay - 1] = Int32GetDatum(opts.min_send_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1054,7 +1095,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_SEND_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1139,19 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_send_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
+ !IsSet(opts.specified_opts, SUBOPT_MIN_SEND_DELAY) &&
+ sub->minsenddelay > 0)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set parallel streaming mode for subscription with %s",
+ "min_send_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1165,26 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_SEND_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_send_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.min_send_delay > 0 &&
+ !IsSet(opts.specified_opts, SUBOPT_STREAMING) &&
+ sub->stream == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set %s for subscription in parallel streaming mode",
+ "min_send_delay"));
+
+ values[Anum_pg_subscription_subminsenddelay - 1] =
+ Int32GetDatum(opts.min_send_delay);
+ replaces[Anum_pg_subscription_subminsenddelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2195,3 +2269,45 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+/*
+ * Extract the min_send_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_send_delay
+ * accepts the same parameter format as recovery_min_apply_delay.
+ */
+static int32
+defGetMinSendDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse the given string as a parameter whose base unit is milliseconds.
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_send_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check the lower boundary of the valid min_send_delay range and, as a
+ * safeguard for platforms where INT_MAX exceeds PG_INT32_MAX, the upper
+ * boundary as well. Although parse_int() has confirmed that the result
+ * is less than or equal to INT_MAX, the value will be stored in a
+ * catalog column of type int32.
+ */
+ if (result < 0 || result > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_send_delay",
+ 0, PG_INT32_MAX)));
+
+ return result;
+}
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 560ec974fa..89a72c1abe 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -443,6 +443,11 @@ libpqrcv_startstreaming(WalReceiverConn *conn,
PQserverVersion(conn->streamConn) >= 140000)
appendStringInfoString(&cmd, ", binary 'true'");
+ if (options->proto.logical.min_send_delay > 0 &&
+ PQserverVersion(conn->streamConn) >= 160000)
+ appendStringInfo(&cmd, ", min_send_delay '%d'",
+ options->proto.logical.min_send_delay);
+
appendStringInfoChar(&cmd, ')');
}
else
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index a53e23c679..80415baec4 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -666,6 +666,10 @@ DecodeCommit(LogicalDecodingContext *ctx, XLogRecordBuffer *buf,
buf->origptr, buf->endptr);
}
+ /* Delay for the given time if the context has a 'delay' callback */
+ if (ctx->delay)
+ ctx->delay(ctx, commit_time);
+
/*
* Send the final commit record if the transaction data is already
* decoded, otherwise, process the entire transaction.
diff --git a/src/backend/replication/logical/logical.c b/src/backend/replication/logical/logical.c
index c3ec97a0a6..ac1f9f92f7 100644
--- a/src/backend/replication/logical/logical.c
+++ b/src/backend/replication/logical/logical.c
@@ -156,7 +156,8 @@ StartupDecodingContext(List *output_plugin_options,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress)
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay)
{
ReplicationSlot *slot;
MemoryContext context,
@@ -293,6 +294,7 @@ StartupDecodingContext(List *output_plugin_options,
ctx->prepare_write = prepare_write;
ctx->write = do_write;
ctx->update_progress = update_progress;
+ ctx->delay = delay;
ctx->output_plugin_options = output_plugin_options;
@@ -316,7 +318,7 @@ StartupDecodingContext(List *output_plugin_options,
* marking WAL reserved beforehand. In that scenario, it's up to the
* caller to guarantee that WAL remains available.
* xl_routine -- XLogReaderRoutine for underlying XLogReader
- * prepare_write, do_write, update_progress --
+ * prepare_write, do_write, update_progress, delay --
* callbacks that perform the use-case dependent, actual, work.
*
* Needs to be called while in a memory context that's at least as long lived
@@ -334,7 +336,8 @@ CreateInitDecodingContext(const char *plugin,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress)
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay)
{
TransactionId xmin_horizon = InvalidTransactionId;
ReplicationSlot *slot;
@@ -435,7 +438,7 @@ CreateInitDecodingContext(const char *plugin,
ctx = StartupDecodingContext(NIL, restart_lsn, xmin_horizon,
need_full_snapshot, false,
xl_routine, prepare_write, do_write,
- update_progress);
+ update_progress, delay);
/* call output plugin initialization callback */
old_context = MemoryContextSwitchTo(ctx->context);
@@ -475,7 +478,7 @@ CreateInitDecodingContext(const char *plugin,
* xl_routine
* XLogReaderRoutine used by underlying xlogreader
*
- * prepare_write, do_write, update_progress
+ * prepare_write, do_write, update_progress, delay
* callbacks that have to be filled to perform the use-case dependent,
* actual work.
*
@@ -493,7 +496,8 @@ CreateDecodingContext(XLogRecPtr start_lsn,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress)
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay)
{
LogicalDecodingContext *ctx;
ReplicationSlot *slot;
@@ -547,7 +551,7 @@ CreateDecodingContext(XLogRecPtr start_lsn,
ctx = StartupDecodingContext(output_plugin_options,
start_lsn, InvalidTransactionId, false,
fast_forward, xl_routine, prepare_write,
- do_write, update_progress);
+ do_write, update_progress, delay);
/* call output plugin initialization callback */
old_context = MemoryContextSwitchTo(ctx->context);
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index fa1b641a2b..960025197f 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -212,7 +212,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
.segment_open = wal_segment_open,
.segment_close = wal_segment_close),
LogicalOutputPrepareWrite,
- LogicalOutputWrite, NULL);
+ LogicalOutputWrite, NULL, NULL);
/*
* After the sanity checks in CreateDecodingContext, make sure the
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index cfb2ab6248..afbac3d80e 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3898,7 +3898,8 @@ maybe_reread_subscription(void)
newsub->stream != MySubscription->stream ||
strcmp(newsub->origin, MySubscription->origin) != 0 ||
newsub->owner != MySubscription->owner ||
- !equal(newsub->publications, MySubscription->publications))
+ !equal(newsub->publications, MySubscription->publications) ||
+ newsub->minsenddelay != MySubscription->minsenddelay)
{
if (am_parallel_apply_worker())
ereport(LOG,
@@ -4617,9 +4618,18 @@ ApplyWorkerMain(Datum main_arg)
options.proto.logical.twophase = false;
options.proto.logical.origin = pstrdup(MySubscription->origin);
+ options.proto.logical.min_send_delay = 0;
if (!am_tablesync_worker())
{
+ /*
+ * Time-delayed logical replication does not support tablesync
+ * workers, so only the leader apply worker can request walsenders to
+ * apply delay on the publisher side.
+ */
+ if (server_version >= 160000 && MySubscription->minsenddelay > 0)
+ options.proto.logical.min_send_delay = MySubscription->minsenddelay;
+
/*
* Even when the two_phase mode is requested by the user, it remains
* as the tri-state PENDING until all tablesyncs have reached READY
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 98377c094b..be0095cf52 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -285,6 +285,7 @@ parse_output_parameters(List *options, PGOutputData *data)
bool streaming_given = false;
bool two_phase_option_given = false;
bool origin_option_given = false;
+ bool min_send_delay_option_given = false;
data->binary = false;
data->streaming = LOGICALREP_STREAM_OFF;
@@ -396,6 +397,32 @@ parse_output_parameters(List *options, PGOutputData *data)
errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("unrecognized origin value: \"%s\"", data->origin));
}
+ else if (strcmp(defel->defname, "min_send_delay") == 0)
+ {
+ long parsed;
+ char *endptr;
+
+ if (min_send_delay_option_given)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("conflicting or redundant options"));
+ min_send_delay_option_given = true;
+
+ errno = 0;
+ parsed = strtol(strVal(defel->arg), &endptr, 10);
+ if (errno != 0 || *endptr != '\0')
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid min_send_delay")));
+
+ if (parsed < 0 || parsed > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("min_send_delay \"%s\" out of range",
+ strVal(defel->arg))));
+
+ data->min_send_delay = (int32) parsed;
+ }
else
elog(ERROR, "unrecognized pgoutput option: %s", defel->defname);
}
@@ -501,6 +528,15 @@ pgoutput_startup(LogicalDecodingContext *ctx, OutputPluginOptions *opt,
else
ctx->twophase_opt_given = true;
+ if (data->min_send_delay &&
+ data->protocol_version < LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("requested proto_version=%d does not support delayed sending of data, need %d or higher",
+ data->protocol_version, LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM)));
+ else
+ ctx->min_send_delay = data->min_send_delay;
+
/* Init publication state. */
data->publications = NIL;
publications_valid = false;
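
Review note, for illustration only: the delay reaches pgoutput as a plain decimal string of milliseconds (libpqrcv_startstreaming formats it with %d). A minimal standalone sketch of a strict parse follows. The helper name parse_delay_ms is hypothetical and not part of the patch, and the sketch assumes signed parsing via strtol so that negative, empty, non-numeric, and out-of-range input are all rejected explicitly.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not in the patch): parse a decimal millisecond
 * count into *result.  Returns false on empty, non-numeric, negative,
 * or out-of-range input; *result is left untouched in that case.
 */
bool
parse_delay_ms(const char *str, int32_t *result)
{
	char	   *endptr;
	long		parsed;

	assert(str != NULL && result != NULL);

	errno = 0;
	parsed = strtol(str, &endptr, 10);

	/* Reject overflow (ERANGE), empty input, and trailing junk. */
	if (errno != 0 || endptr == str || *endptr != '\0')
		return false;

	/* The value must fit a non-negative int32, like the catalog column. */
	if (parsed < 0 || parsed > INT32_MAX)
		return false;

	*result = (int32_t) parsed;
	return true;
}
```

Parsing into a signed type keeps the lower-bound check meaningful; with an unsigned parse, a leading minus sign would wrap instead of being caught.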
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 2f3c964824..522f7600a1 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -148,7 +148,7 @@ create_logical_replication_slot(char *name, char *plugin,
XL_ROUTINE(.page_read = read_local_xlog_page,
.segment_open = wal_segment_open,
.segment_close = wal_segment_close),
- NULL, NULL, NULL);
+ NULL, NULL, NULL, NULL);
/*
* If caller needs us to determine the decoding start point, do so now.
@@ -481,7 +481,7 @@ pg_logical_replication_slot_advance(XLogRecPtr moveto)
XL_ROUTINE(.page_read = read_local_xlog_page,
.segment_open = wal_segment_open,
.segment_close = wal_segment_close),
- NULL, NULL, NULL);
+ NULL, NULL, NULL, NULL);
/*
* Start reading at the slot's restart_lsn, which we know to point to
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 75e8363e24..a4f03ddba1 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -252,6 +252,7 @@ static void WalSndPrepareWrite(LogicalDecodingContext *ctx, XLogRecPtr lsn, Tran
static void WalSndWriteData(LogicalDecodingContext *ctx, XLogRecPtr lsn, TransactionId xid, bool last_write);
static void WalSndUpdateProgress(LogicalDecodingContext *ctx, XLogRecPtr lsn, TransactionId xid,
bool skipped_xact);
+static void WalSndDelay(LogicalDecodingContext *ctx, TimestampTz delay_start);
static XLogRecPtr WalSndWaitForWal(XLogRecPtr loc);
static void LagTrackerWrite(XLogRecPtr lsn, TimestampTz local_flush_time);
static TimeOffset LagTrackerRead(int head, XLogRecPtr lsn, TimestampTz now);
@@ -1126,7 +1127,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
.segment_open = WalSndSegmentOpen,
.segment_close = wal_segment_close),
WalSndPrepareWrite, WalSndWriteData,
- WalSndUpdateProgress);
+ WalSndUpdateProgress, WalSndDelay);
/*
* Signal that we don't need the timeout mechanism. We're just
@@ -1285,7 +1286,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
.segment_open = WalSndSegmentOpen,
.segment_close = wal_segment_close),
WalSndPrepareWrite, WalSndWriteData,
- WalSndUpdateProgress);
+ WalSndUpdateProgress, WalSndDelay);
xlogreader = logical_decoding_ctx->reader;
WalSndSetState(WALSNDSTATE_CATCHUP);
@@ -3849,3 +3850,75 @@ LagTrackerRead(int head, XLogRecPtr lsn, TimestampTz now)
Assert(time != 0);
return now - time;
}
+
+/*
+ * LogicalDecodingContext 'delay' callback.
+ *
+ * Wait long enough to make sure a transaction is sent at least
+ * ctx->min_send_delay milliseconds after delay_start.
+ */
+static void
+WalSndDelay(LogicalDecodingContext *ctx, TimestampTz delay_start)
+{
+ /* Wait till delayUntil by the latch mechanism */
+ while (true)
+ {
+ TimestampTz delayUntil;
+ long diffms;
+ long timeout_interval_ms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* This might change wal_sender_timeout */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /* Check for input from the client */
+ ProcessRepliesIfAny();
+
+ /* die if timeout was reached */
+ WalSndCheckTimeOut();
+
+ /* Send keepalive if the time has come */
+ WalSndKeepaliveIfNecessary();
+
+ /* Try to flush pending output to the client */
+ if (pq_flush_if_writable() != 0)
+ WalSndShutdown();
+
+ /*
+ * If we've requested to shut down, exit the process.
+ *
+ * Note that WalSndDone() cannot be used here because it would send the
+ * changes that are being delayed.
+ */
+ if (got_STOPPING)
+ WalSndShutdown();
+
+ delayUntil = TimestampTzPlusMilliseconds(delay_start, ctx->min_send_delay);
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delayUntil);
+
+ /*
+ * Exit without arming the latch if it's already past time to send
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ /* Sleep until appropriate time. */
+ timeout_interval_ms = WalSndComputeSleeptime(GetCurrentTimestamp());
+
+ elog(DEBUG2, "time-delayed replication for txid %u, delay_time = %d ms, remaining wait time: %ld ms",
+ ctx->write_xid, (int) ctx->min_send_delay, diffms);
+
+ /* Sleep until we get reply from worker or we time out */
+ WalSndWait(WL_SOCKET_READABLE,
+ Min(timeout_interval_ms, diffms),
+ WAIT_EVENT_WALSENDER_SEND_DELAY);
+ }
+}
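
Review note, for illustration only: the timing arithmetic inside the WalSndDelay() loop can be read in isolation as below. This is a simplified sketch, not the patch's code. The helper names are hypothetical, and plain millisecond counters stand in for TimestampTz, which is an assumption made to keep the example standalone.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: milliseconds still to wait before a transaction
 * whose delay started at start_ms may be sent, given the current time
 * now_ms.  Returns 0 once the deadline has passed.
 */
int64_t
remaining_delay_ms(int64_t start_ms, int64_t now_ms, int32_t min_send_delay_ms)
{
	int64_t		delay_until;

	assert(min_send_delay_ms >= 0);

	delay_until = start_ms + min_send_delay_ms;
	return (delay_until > now_ms) ? delay_until - now_ms : 0;
}

/*
 * Hypothetical helper: length of one sleep.  It is capped by the
 * keepalive/timeout interval so that timeout checks and keepalives
 * still run while a long delay is pending.
 */
int64_t
next_sleep_ms(int64_t remaining_ms, int64_t timeout_interval_ms)
{
	return (remaining_ms < timeout_interval_ms) ? remaining_ms : timeout_interval_ms;
}
```

Capping each sleep and recomputing the remaining time on every iteration is what lets an 8-hour delay coexist with a much shorter wal_sender_timeout.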
diff --git a/src/backend/utils/activity/wait_event.c b/src/backend/utils/activity/wait_event.c
index cb99cc6339..76c19fe11d 100644
--- a/src/backend/utils/activity/wait_event.c
+++ b/src/backend/utils/activity/wait_event.c
@@ -515,6 +515,9 @@ pgstat_get_wait_timeout(WaitEventTimeout w)
case WAIT_EVENT_VACUUM_TRUNCATE:
event_name = "VacuumTruncate";
break;
+ case WAIT_EVENT_WALSENDER_SEND_DELAY:
+ event_name = "WalSenderSendDelay";
+ break;
/* no default case, so that compiler will warn */
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..bd95747840 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminsenddelay;
int i,
ntups;
@@ -4546,9 +4547,13 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminsenddelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n"
+ " 0 AS subminsenddelay\n",
+ LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4581,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminsenddelay = PQfnumber(res, "subminsenddelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4612,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminsenddelay =
+ atoi(PQgetvalue(res, i, i_subminsenddelay));
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4687,6 +4695,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminsenddelay > 0)
+ appendPQExpBuffer(query, ", min_send_delay = '%d ms'", subinfo->subminsenddelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e7cbd8d7ed..24e0f6737f 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int subminsenddelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..c7d303a168 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_send_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminsenddelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min send delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e1882eaea..6643db6f55 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_send_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_send_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..69ae4314b4 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -74,6 +74,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
+ int32 subminsenddelay; /* Replication send delay (ms) */
+
bool subenabled; /* True if the subscription is enabled (the
* worker should be running) */
@@ -122,6 +124,7 @@ typedef struct Subscription
* skipped */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
+ int32 minsenddelay; /* Replication send delay (ms) */
bool enabled; /* Indicates if the subscription is enabled */
bool binary; /* Indicates if the subscription wants data in
* binary format */
diff --git a/src/include/replication/logical.h b/src/include/replication/logical.h
index 5f49554ea0..603c37b6ce 100644
--- a/src/include/replication/logical.h
+++ b/src/include/replication/logical.h
@@ -30,6 +30,10 @@ typedef void (*LogicalOutputPluginWriterUpdateProgress) (struct LogicalDecodingC
bool skipped_xact
);
+typedef void (*LogicalOutputPluginWriterDelay) (struct LogicalDecodingContext *lr,
+ TimestampTz start_time
+);
+
typedef struct LogicalDecodingContext
{
/* memory context this is all allocated in */
@@ -64,6 +68,7 @@ typedef struct LogicalDecodingContext
LogicalOutputPluginWriterPrepareWrite prepare_write;
LogicalOutputPluginWriterWrite write;
LogicalOutputPluginWriterUpdateProgress update_progress;
+ LogicalOutputPluginWriterDelay delay;
/*
* Output buffer.
@@ -100,6 +105,8 @@ typedef struct LogicalDecodingContext
*/
bool twophase_opt_given;
+ int32 min_send_delay;
+
/*
* State for writing output.
*/
@@ -121,14 +128,16 @@ extern LogicalDecodingContext *CreateInitDecodingContext(const char *plugin,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress);
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay);
extern LogicalDecodingContext *CreateDecodingContext(XLogRecPtr start_lsn,
List *output_plugin_options,
bool fast_forward,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress);
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay);
extern void DecodingContextFindStartpoint(LogicalDecodingContext *ctx);
extern bool DecodingContextReady(LogicalDecodingContext *ctx);
extern void FreeDecodingContext(LogicalDecodingContext *ctx);
diff --git a/src/include/replication/pgoutput.h b/src/include/replication/pgoutput.h
index b4a8015403..d2fde09e00 100644
--- a/src/include/replication/pgoutput.h
+++ b/src/include/replication/pgoutput.h
@@ -30,6 +30,7 @@ typedef struct PGOutputData
bool messages;
bool two_phase;
char *origin;
+ int32 min_send_delay;
} PGOutputData;
#endif /* PGOUTPUT_H */
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index decffe352d..c20969aed7 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -187,6 +187,7 @@ typedef struct
* prepare time */
char *origin; /* Only publish data originating from the
* specified origin */
+ int32 min_send_delay; /* The minimum send delay */
} logical;
} proto;
} WalRcvStreamOptions;
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index 9ab23e1c4a..cc3a234eba 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -150,7 +150,8 @@ typedef enum
WAIT_EVENT_REGISTER_SYNC_REQUEST,
WAIT_EVENT_SPIN_DELAY,
WAIT_EVENT_VACUUM_DELAY,
- WAIT_EVENT_VACUUM_TRUNCATE
+ WAIT_EVENT_VACUUM_TRUNCATE,
+ WAIT_EVENT_WALSENDER_SEND_DELAY
} WaitEventTimeout;
/* ----------
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 3f99b14394..b14384e8e7 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -143,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -163,10 +163,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -175,10 +175,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -210,10 +210,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -247,19 +247,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -271,27 +271,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -306,10 +306,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -324,10 +324,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -363,10 +363,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -375,10 +375,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,10 +388,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -404,18 +404,48 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_send_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = foo);
+ERROR: invalid value for parameter "min_send_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_send_delay" (0 .. 2147483647)
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_send_delay = 123);
+ERROR: min_send_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_send_delay value without unit is take as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexit | 0/0
+(1 row)
+
+-- success -- min_send_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexit | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7281f5fee2..2fae3b06c7 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -286,6 +286,24 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_send_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = -1);
+
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_send_delay = 123);
+
+-- success -- min_send_delay value without unit is take as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = 123);
+\dRs+
+
+-- success -- min_send_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = '1 d');
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index 91aa068c95..8984e14d74 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -515,6 +515,34 @@ $node_publisher->poll_query_until('postgres',
or die
"Timed out while waiting for apply to restart after renaming SUBSCRIPTION";
+# Test time-delayed logical replication
+#
+# If the subscription sets the min_send_delay parameter, the publisher's
+# walsender delays sending each transaction by min_send_delay milliseconds. We
+# verify this by looking at the time difference between a) when tuples are
+# inserted on the publisher, and b) when those changes are replicated on the
+# subscriber. Even on slow machines, this strategy will give predictable behavior.
+
+# Set min_send_delay parameter to 3 seconds
+my $delay = 3;
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_renamed SET (min_send_delay = '${delay}s')");
+
+# Before doing the insertion, get the current timestamp that will be
+# used as a comparison base.
+my $publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_ins VALUES (generate_series(1101, 1120))");
+
+# The publisher waits for the replication to complete
+$node_publisher->wait_for_catchup('tap_sub_renamed');
+
+# This test is successful if and only if the changes have been replicated
+# after at least the configured send delay.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for non-streaming transaction"
+);
+
# check all the cleanup
$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_renamed");
--
2.27.0
Dear Amit,
Perhaps what I understand
correctly is that we could delay right before only sending commit
records in this case. If we delay at publisher end, all changes will
be sent at once if !streaming, and otherwise, all changes in a
transaction will be spooled at subscriber end. In any case, apply
worker won't be holding an active transaction unnecessarily.
What about the parallel case? The latest patch does not reject the combination of
parallel streaming mode and delay. If the delay is done at commit and the subscriber
uses a parallel apply worker, it may hold locks for a long time.
I didn't look too closely, but my guess is that transactions are
conveyed in spool files in parallel mode, with each file storing a
complete transaction.
No, we don't try to collect all the data in files for parallel mode.
Having said that, it doesn't matter because we won't know the time of
the commit (which is used to compute delay) before we encounter the
commit record in WAL. So, I feel for this approach, we can follow what
you said.
Right. And new patch follows the opinion.
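To illustrate the point above (the delay is measured from the commit time, so nothing can be computed before the commit record is decoded), here is a minimal standalone sketch in C. The function name remaining_delay_ms and the choice of microsecond timestamps are assumptions made for illustration only; this is not code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: given the commit time of a
 * transaction and the current time (both in microseconds since the same
 * arbitrary epoch), return how many milliseconds the sender still has to
 * wait before the transaction may be sent. Returns 0 once the delay has
 * already elapsed.
 */
int64_t
remaining_delay_ms(int64_t commit_time_us, int64_t now_us,
                   int32_t min_send_delay_ms)
{
    int64_t send_at_us;
    int64_t diff_us;

    assert(min_send_delay_ms >= 0);

    /* Earliest moment at which the transaction may be sent. */
    send_at_us = commit_time_us + (int64_t) min_send_delay_ms * 1000;
    diff_us = send_at_us - now_us;

    if (diff_us <= 0)
        return 0;

    /* Round up so that the caller never wakes before the deadline. */
    return (diff_us + 999) / 1000;
}
```

Since the deadline depends only on the commit timestamp, time already spent decoding the transaction counts towards the delay, which is why a long-running transaction might not have to wait at all.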
Of course we need to add the mechanism to process keep-alive and status
report messages.
Could you share a good way to handle keep-alive and status messages, if
you have one?
If we changed it to the decoding layer, it would be strange to call a walsender function
directly.
I'm sorry, but I don't have a concrete idea at the moment. When I read
through the last patch, I missed that WalSndDelay is actually a subset
of WalSndLoop. Although it can handle keep-alives correctly, I'm not
sure we can accept that structure.
I think we can use update_progress_txn() for this purpose but note
that we are discussing to change the same in thread [1].
[1] - /messages/by-id/20230210210423.r26ndnfmuifie4f6@awork3.anarazel.de
I did not reuse update_progress_txn() because we cannot use it in a straightforward way,
but I can change that if there is a better idea than the present one.
A new patch was posted in [1].
[1]: /messages/by-id/TYAPR01MB5866F00191375D0193320A4DF5A19@TYAPR01MB5866.jpnprd01.prod.outlook.com
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Dear Andres,
Thank you for the comments! I understand that you agree with the approach
in which the publisher delays sending data.
I'm not sure why output plugin is involved in the delay mechanism.
+many
The output plugin absolutely never should be involved in something like
this. It was a grave mistake that OutputPluginUpdateProgress() calls were
added to the commit callback and then proliferated.
It appears to me that it would be simpler if the delay occurred in reorder
buffer or logical decoder instead.
This is a feature specific to walsender. So the triggering logic should either
directly live in the walsender, or in a callback set in
LogicalDecodingContext. That could be called from decode.c or such.
OK, I can follow that opinion.
I think the walsender function should not be called directly from decode.c,
so I implemented it as a callback in LogicalDecodingContext, and it is called
from decode.c if set.
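The structure described above can be sketched as follows. This is a minimal standalone illustration, not the patch itself: DecodingCtx stands in for LogicalDecodingContext, and walsender_delay, decode_commit and the delay_calls counter are hypothetical names introduced only so that the example compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for LogicalDecodingContext (assumption for illustration). */
typedef struct DecodingCtx DecodingCtx;
typedef void (*DelayCallback) (DecodingCtx *ctx, int64_t commit_time);

struct DecodingCtx
{
    DelayCallback delay;        /* NULL when the caller is not a walsender */
    int           delay_calls;  /* bookkeeping for this sketch only */
};

/* Hypothetical walsender-side callback; real code would wait here. */
void
walsender_delay(DecodingCtx *ctx, int64_t commit_time)
{
    (void) commit_time;
    ctx->delay_calls++;
}

/* Decoding-side code only knows about the callback, not the walsender. */
void
decode_commit(DecodingCtx *ctx, int64_t commit_time)
{
    assert(ctx != NULL);

    if (ctx->delay)
        ctx->delay(ctx, commit_time);

    /* ... the transaction would be handed to the reorder buffer here ... */
}
```

With this shape, callers that are not walsenders (for example the SQL interface, which passes NULL for the new argument in the patch) never run the delay logic, and decode.c does not need to reference any walsender function by name.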
A new patch was posted in [1].
[1]: /messages/by-id/TYAPR01MB5866F00191375D0193320A4DF5A19@TYAPR01MB5866.jpnprd01.prod.outlook.com
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
On Fri, Feb 17, 2023 at 12:14 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Thank you for replying! This direction seems OK, so I started to revise the patch.
PSA new version.
Few comments:
=============
1.
+ <para>
+ The minimum delay for publisher sends data, in milliseconds
+ </para></entry>
+ </row>
It would probably be better to write it as "The minimum delay, in
milliseconds, by the publisher to send changes"
2. The subminsenddelay is placed inconsistently in the patch. In the
docs (catalogs.sgml), system_views.sql, and in some places in the
code, it is after subskiplsn, but in the catalog table and
corresponding structure, it is placed after subowner. It should be
consistently placed after the subscription owner.
3.
+ <row>
+ <entry><literal>WalSenderSendDelay</literal></entry>
+ <entry>Waiting for sending changes to subscriber in WAL sender
+ process.</entry>
How about writing it as follows: "Waiting while sending changes for
time-delayed logical replication in the WAL sender process."?
4.
+ <para>
+ Any delay becomes effective only after all initial table
+ synchronization has finished and occurs before each transaction
+ starts to get applied on the subscriber. The delay does not take into
+ account the overhead of time spent in transferring the transaction,
+ which means that the arrival time at the subscriber may be delayed
+ more than the given time.
+ </para>
This needs to change based on a new approach. It should be something
like: "The delay is effective only when the publisher decides to send
a particular transaction downstream."
5.
+ * allowed. This is because in parallel streaming mode, we start applying
+ * the transaction stream as soon as the first change arrives without
+ * knowing the transaction's prepare/commit time. Always waiting for the
+ * full 'min_send_delay' period might include unnecessary delay.
+ *
+ * The other possibility was to apply the delay at the end of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
This part of the comments seems to imply more of a subscriber-side
delay approach. I think we should try to adjust these as per the
changed approach.
6.
@@ -666,6 +666,10 @@ DecodeCommit(LogicalDecodingContext *ctx,
XLogRecordBuffer *buf,
buf->origptr, buf->endptr);
}
+ /* Delay given time if the context has 'delay' callback */
+ if (ctx->delay)
+ ctx->delay(ctx, commit_time);
+
I think we should invoke delay functionality only when
ctx->min_send_delay > 0. Otherwise, there will be some unnecessary
overhead. We can change the comment along the lines of: "Delay sending
the changes if required. For streaming transactions, this means a
delay in sending the last stream but that is okay because on the
downstream the changes will be applied only after receiving the last
stream."
7. For 2PC transactions, I think we should add the delay in
DecodePrepare. Because after receiving the PREPARE, the downstream
will apply the xact. In this case, we shouldn't add a delay for the
commit_prepared.
8.
+#
+# If the subscription sets min_send_delay parameter, the logical replication
+# worker will delay the transaction apply for min_send_delay milliseconds.
I think here also comments should be updated as per the changed
approach for applying the delay on the publisher side.
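Comments 6 and 7 together amount to a small decision rule: delay only when a non-zero delay is configured, and for two-phase transactions delay at PREPARE rather than at COMMIT PREPARED. A minimal standalone sketch of that rule in C follows; the RecordKind enum and the should_delay function are hypothetical names used only for illustration and do not appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical classification of the record being decoded. */
typedef enum
{
    REC_COMMIT,             /* commit of a regular transaction */
    REC_PREPARE,            /* PREPARE of a two-phase transaction */
    REC_COMMIT_PREPARED     /* COMMIT PREPARED of a two-phase transaction */
} RecordKind;

/*
 * Decide whether the sender should wait before sending this record.
 * A zero delay skips the mechanism entirely, avoiding needless overhead.
 * COMMIT PREPARED is never delayed because the downstream has already
 * applied the transaction's changes when it received the PREPARE.
 */
bool
should_delay(RecordKind kind, int32_t min_send_delay_ms)
{
    assert(min_send_delay_ms >= 0);

    if (min_send_delay_ms == 0)
        return false;

    return kind == REC_COMMIT || kind == REC_PREPARE;
}
```

For example, should_delay(REC_PREPARE, 3000) is true while should_delay(REC_COMMIT_PREPARED, 3000) is false, so a prepared transaction is delayed once rather than twice.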
--
With Regards,
Amit Kapila.
Dear Amit,
Thank you for reviewing! PSA new version.
1.
+ <para>
+ The minimum delay for publisher sends data, in milliseconds
+ </para></entry>
+ </row>
It would probably be better to write it as "The minimum delay, in
milliseconds, by the publisher to send changes"
Fixed.
2. The subminsenddelay is placed inconsistently in the patch. In the
docs (catalogs.sgml), system_views.sql, and in some places in the
code, it is after subskiplsn, but in the catalog table and
corresponding structure, it is placed after subowner. It should be
consistently placed after the subscription owner.
Mostly moved. Note that some parts, such as maybe_reread_subscription(), were
not changed because the ordering there was already broken.
3.
+ <row>
+ <entry><literal>WalSenderSendDelay</literal></entry>
+ <entry>Waiting for sending changes to subscriber in WAL sender
+ process.</entry>
How about writing it as follows: "Waiting while sending changes for
time-delayed logical replication in the WAL sender process."?
Fixed.
4.
+ <para>
+ Any delay becomes effective only after all initial table
+ synchronization has finished and occurs before each transaction
+ starts to get applied on the subscriber. The delay does not take into
+ account the overhead of time spent in transferring the transaction,
+ which means that the arrival time at the subscriber may be delayed
+ more than the given time.
+ </para>
This needs to change based on a new approach. It should be something
like: "The delay is effective only when the publisher decides to send
a particular transaction downstream."
Right, the first sentence is partially changed as you said.
5.
+ * allowed. This is because in parallel streaming mode, we start applying
+ * the transaction stream as soon as the first change arrives without
+ * knowing the transaction's prepare/commit time. Always waiting for the
+ * full 'min_send_delay' period might include unnecessary delay.
+ *
+ * The other possibility was to apply the delay at the end of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
This part of the comments seems to imply more of a subscriber-side
delay approach. I think we should try to adjust these as per the
changed approach.
Adjusted.
6.
@@ -666,6 +666,10 @@ DecodeCommit(LogicalDecodingContext *ctx,
XLogRecordBuffer *buf,
buf->origptr, buf->endptr);
}
+ /* Delay given time if the context has 'delay' callback */
+ if (ctx->delay)
+ ctx->delay(ctx, commit_time);
+
I think we should invoke delay functionality only when
ctx->min_send_delay > 0. Otherwise, there will be some unnecessary
overhead. We can change the comment along the lines of: "Delay sending
the changes if required. For streaming transactions, this means a
delay in sending the last stream but that is okay because on the
downstream the changes will be applied only after receiving the last
stream."
Changed accordingly.
7. For 2PC transactions, I think we should add the delay in
DecodePrepare. Because after receiving the PREPARE, the downstream
will apply the xact. In this case, we shouldn't add a delay for the
commit_prepared.
Right, the transaction will end when the downstream receives PREPARE. Fixed.
I've tested locally, and the delay seemed to occur at the PREPARE phase.
8.
+#
+# If the subscription sets min_send_delay parameter, the logical replication
+# worker will delay the transaction apply for min_send_delay milliseconds.
I think here also comments should be updated as per the changed
approach for applying the delay on the publisher side.
Fixed.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Attachments:
v3-0001-Time-delayed-logical-replication-on-publisher-sid.patch (application/octet-stream)
From 339413ed077c173eb61d12d5a4e4bcaab60e3178 Mon Sep 17 00:00:00 2001
From: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Date: Thu, 16 Feb 2023 07:52:23 +0000
Subject: [PATCH v3] Time-delayed logical replication on publisher side
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_send_delay'.
If the subscription sets min_send_delay parameter, an apply worker passes the
value to the publisher as an output plugin option. And then, the walsender will
delay the transaction sending for given milliseconds.
The delay does not take into account the overhead of time spent in transferring
the transaction, which means that the arrival time at the subscriber may be
delayed more than the given time.
The combination of parallel streaming mode and min_send_delay is not allowed.
This is because in parallel streaming mode, we start applying the transaction
stream as soon as the first change arrives without knowing the transaction's
prepare/commit time. Always waiting for the full 'min_send_delay' period might
include unnecessary delay.
The other possibility was to apply the delay at the end of the parallel apply
transaction but that would cause issues related to resource bloat and locks being
held for a long time.
The delay occurs before we start to send the transaction on the publisher.
Regular and prepared transactions are covered. Streamed transactions are also
covered.
Author: Euler Taveira, Takamichi Osumi, Kuroda Hayato
Reviewed-by: Amit Kapila, Peter Smith, Vignesh C, Shveta Malik,
Kyotaro Horiguchi, Shi Yu, Wang Wei, Dilip Kumar, Melih Mutlu,
Andres Freund
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/glossary.sgml | 15 ++
doc/src/sgml/logical-replication.sgml | 6 +
doc/src/sgml/monitoring.sgml | 5 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 39 +++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 122 +++++++++++-
.../libpqwalreceiver/libpqwalreceiver.c | 5 +
src/backend/replication/logical/decode.c | 18 ++
src/backend/replication/logical/logical.c | 18 +-
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/worker.c | 12 +-
src/backend/replication/pgoutput/pgoutput.c | 36 ++++
src/backend/replication/slotfuncs.c | 4 +-
src/backend/replication/walsender.c | 77 +++++++-
src/backend/utils/activity/wait_event.c | 3 +
src/bin/pg_dump/pg_dump.c | 15 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/replication/logical.h | 13 +-
src/include/replication/pgoutput.h | 1 +
src/include/replication/walreceiver.h | 1 +
src/include/utils/wait_event.h | 3 +-
src/test/regress/expected/subscription.out | 174 ++++++++++--------
src/test/regress/sql/subscription.sql | 18 ++
src/test/subscription/t/001_rep_changes.pl | 28 +++
30 files changed, 550 insertions(+), 104 deletions(-)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..c41771d5ac 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7892,6 +7892,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminsenddelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum delay, in milliseconds, by the publisher to send changes
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subenabled</structfield> <type>bool</type>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 7c01a541fe..9ede9d05f6 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1729,6 +1729,21 @@
</glossdef>
</glossentry>
+ <glossentry id="glossary-time-delayed-replication">
+ <glossterm>Time-delayed replication</glossterm>
+ <glossdef>
+ <para>
+ Replication setup that delays the application of changes by a specified
+ minimum time-delay period.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="guc-recovery-min-apply-delay"/> for physical replication
+ and <xref linkend="sql-createsubscription"/> for logical replication.
+ </para>
+ </glossdef>
+ </glossentry>
+
<glossentry id="glossary-toast">
<glossterm>TOAST</glossterm>
<glossdef>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 1bd5660c87..8bca0b3800 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,12 @@
target table.
</para>
+ <para>
+ A publication can delay sending changes to the subscription by specifying
+ the <literal>min_send_delay</literal> subscription parameter. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index b0b997f092..6158587644 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -2349,6 +2349,11 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
<entry>Waiting to acquire an exclusive lock to truncate off any
empty pages at the end of a table vacuumed.</entry>
</row>
+ <row>
+ <entry><literal>WalSenderSendDelay</literal></entry>
+ <entry>Waiting while sending changes for time-delayed logical replication
+ in the WAL sender process.</entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 964fcbb8ff..3f238b958b 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_send_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 51c45f17c7..9fd2922d69 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,39 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_send_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the publisher sends changes as soon as possible. This
+ parameter allows the user to delay the publisher to send changes by
+ given time period. If the value is specified without units, it is
+ taken as milliseconds. The default is zero (no delay). See
+ <xref linkend="config-setting-names-values"/> for details on the
+ available valid time units.
+ </para>
+ <para>
+ The delay is effective only when the initial table synchronization
+ has been finished and the publisher decides to send a particular
+ transaction downstream. The delay does not take into account the
+ overhead of time spent in transferring the transaction, which means
+ that the arrival time at the subscriber may be delayed more than the
+ given time.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication means there is a much longer time between
+ making a change on the publisher, and that change being committed
+ on the subscriber. This can impact the performance of synchronous
+ replication. See <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -420,6 +452,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_send_delay</literal> parameter is not allowed when
+ streaming in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..63a10b06d1 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -66,6 +66,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->skiplsn = subform->subskiplsn;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
+ sub->minsenddelay = subform->subminsenddelay;
sub->enabled = subform->subenabled;
sub->binary = subform->subbinary;
sub->stream = subform->substream;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 34ca0e739f..54a705d71b 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1314,9 +1314,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subminsenddelay,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 464db6d247..951fa874e2 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_SEND_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int32 min_send_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -100,7 +102,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int32 defGetMinSendDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_SEND_DELAY))
+ opts->min_send_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_SEND_DELAY) &&
+ strcmp(defel->defname, "min_send_delay") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_SEND_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_SEND_DELAY;
+ opts->min_send_delay = defGetMinSendDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +417,32 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_send_delay is not
+ * allowed. This is because in parallel streaming mode, the walsender
+ * starts sending the transaction stream without knowing the prepare/commit
+ * time of the transaction. Always waiting for the full 'min_send_delay'
+ * time to send may introduce unnecessary delay.
+ *
+ * The other possibility was to wait sending COMMIT record of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_SEND_DELAY) &&
+ opts->min_send_delay > 0 &&
+ opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+
+ /*
+ * translator: the first %s is a string of the form "parameter > 0"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_send_delay > 0", "streaming = parallel"));
+
+
}
/*
@@ -560,7 +599,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_SEND_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -628,6 +668,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
+ values[Anum_pg_subscription_subminsenddelay - 1] = Int32GetDatum(opts.min_send_delay);
values[Anum_pg_subscription_subenabled - 1] = BoolGetDatum(opts.enabled);
values[Anum_pg_subscription_subbinary - 1] = BoolGetDatum(opts.binary);
values[Anum_pg_subscription_substream - 1] = CharGetDatum(opts.streaming);
@@ -1054,7 +1095,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_SEND_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1139,19 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_send_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
+ !IsSet(opts.specified_opts, SUBOPT_MIN_SEND_DELAY) &&
+ sub->minsenddelay > 0)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set parallel streaming mode for subscription with %s",
+ "min_send_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1165,26 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_SEND_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_send_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.min_send_delay > 0 &&
+ !IsSet(opts.specified_opts, SUBOPT_STREAMING) &&
+ sub->stream == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set %s for subscription in parallel streaming mode",
+ "min_send_delay"));
+
+ values[Anum_pg_subscription_subminsenddelay - 1] =
+ Int32GetDatum(opts.min_send_delay);
+ replaces[Anum_pg_subscription_subminsenddelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2195,3 +2269,45 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+/*
+ * Extract the min_send_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_send_delay
+ * accepts the same parameter format as recovery_min_apply_delay.
+ */
+static int32
+defGetMinSendDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse given string as parameter which has millisecond unit
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_send_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check both the lower boundary for the valid min_send_delay range and
+ * the upper boundary as the safeguard for some platforms where INT_MAX is
+ * wider than int32 respectively. Although parse_int() has confirmed that
+ * the result is less than or equal to INT_MAX, the value will be stored
+ * in a catalog column of int32.
+ */
+ if (result < 0 || result > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_send_delay",
+ 0, PG_INT32_MAX)));
+
+ return result;
+}
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 560ec974fa..89a72c1abe 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -443,6 +443,11 @@ libpqrcv_startstreaming(WalReceiverConn *conn,
PQserverVersion(conn->streamConn) >= 140000)
appendStringInfoString(&cmd, ", binary 'true'");
+ if (options->proto.logical.min_send_delay > 0 &&
+ PQserverVersion(conn->streamConn) >= 160000)
+ appendStringInfo(&cmd, ", min_send_delay '%d'",
+ options->proto.logical.min_send_delay);
+
appendStringInfoChar(&cmd, ')');
}
else
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index a53e23c679..9816a536ab 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -679,6 +679,15 @@ DecodeCommit(LogicalDecodingContext *ctx, XLogRecordBuffer *buf,
}
else
{
+ /*
+ * Delay sending the changes if required. For streaming transactions,
+ * this means a delay in sending the last stream but that is OK because
+ * on the downstream the changes will be applied only after receiving
+ * the last stream.
+ */
+ if (ctx->min_send_delay > 0 && ctx->delay)
+ ctx->delay(ctx, commit_time);
+
ReorderBufferCommit(ctx->reorder, xid, buf->origptr, buf->endptr,
commit_time, origin_id, origin_lsn);
}
@@ -763,6 +772,15 @@ DecodePrepare(LogicalDecodingContext *ctx, XLogRecordBuffer *buf,
buf->origptr, buf->endptr);
}
+ /*
+ * Delay sending the changes if required. For streaming transactions,
+ * this means a delay in sending the last stream but that is OK because
+ * on the downstream the changes will be applied only after receiving
+ * the last stream.
+ */
+ if (ctx->min_send_delay > 0 && ctx->delay)
+ ctx->delay(ctx, prepare_time);
+
/* replay actions of all transaction + subtransactions in order */
ReorderBufferPrepare(ctx->reorder, xid, parsed->twophase_gid);
diff --git a/src/backend/replication/logical/logical.c b/src/backend/replication/logical/logical.c
index c3ec97a0a6..ac1f9f92f7 100644
--- a/src/backend/replication/logical/logical.c
+++ b/src/backend/replication/logical/logical.c
@@ -156,7 +156,8 @@ StartupDecodingContext(List *output_plugin_options,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress)
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay)
{
ReplicationSlot *slot;
MemoryContext context,
@@ -293,6 +294,7 @@ StartupDecodingContext(List *output_plugin_options,
ctx->prepare_write = prepare_write;
ctx->write = do_write;
ctx->update_progress = update_progress;
+ ctx->delay = delay;
ctx->output_plugin_options = output_plugin_options;
@@ -316,7 +318,7 @@ StartupDecodingContext(List *output_plugin_options,
* marking WAL reserved beforehand. In that scenario, it's up to the
* caller to guarantee that WAL remains available.
* xl_routine -- XLogReaderRoutine for underlying XLogReader
- * prepare_write, do_write, update_progress --
+ * prepare_write, do_write, update_progress, delay --
* callbacks that perform the use-case dependent, actual, work.
*
* Needs to be called while in a memory context that's at least as long lived
@@ -334,7 +336,8 @@ CreateInitDecodingContext(const char *plugin,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress)
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay)
{
TransactionId xmin_horizon = InvalidTransactionId;
ReplicationSlot *slot;
@@ -435,7 +438,7 @@ CreateInitDecodingContext(const char *plugin,
ctx = StartupDecodingContext(NIL, restart_lsn, xmin_horizon,
need_full_snapshot, false,
xl_routine, prepare_write, do_write,
- update_progress);
+ update_progress, delay);
/* call output plugin initialization callback */
old_context = MemoryContextSwitchTo(ctx->context);
@@ -475,7 +478,7 @@ CreateInitDecodingContext(const char *plugin,
* xl_routine
* XLogReaderRoutine used by underlying xlogreader
*
- * prepare_write, do_write, update_progress
+ * prepare_write, do_write, update_progress, delay
* callbacks that have to be filled to perform the use-case dependent,
* actual work.
*
@@ -493,7 +496,8 @@ CreateDecodingContext(XLogRecPtr start_lsn,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress)
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay)
{
LogicalDecodingContext *ctx;
ReplicationSlot *slot;
@@ -547,7 +551,7 @@ CreateDecodingContext(XLogRecPtr start_lsn,
ctx = StartupDecodingContext(output_plugin_options,
start_lsn, InvalidTransactionId, false,
fast_forward, xl_routine, prepare_write,
- do_write, update_progress);
+ do_write, update_progress, delay);
/* call output plugin initialization callback */
old_context = MemoryContextSwitchTo(ctx->context);
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index fa1b641a2b..960025197f 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -212,7 +212,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
.segment_open = wal_segment_open,
.segment_close = wal_segment_close),
LogicalOutputPrepareWrite,
- LogicalOutputWrite, NULL);
+ LogicalOutputWrite, NULL, NULL);
/*
* After the sanity checks in CreateDecodingContext, make sure the
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index cfb2ab6248..afbac3d80e 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3898,7 +3898,8 @@ maybe_reread_subscription(void)
newsub->stream != MySubscription->stream ||
strcmp(newsub->origin, MySubscription->origin) != 0 ||
newsub->owner != MySubscription->owner ||
- !equal(newsub->publications, MySubscription->publications))
+ !equal(newsub->publications, MySubscription->publications) ||
+ newsub->minsenddelay != MySubscription->minsenddelay)
{
if (am_parallel_apply_worker())
ereport(LOG,
@@ -4617,9 +4618,18 @@ ApplyWorkerMain(Datum main_arg)
options.proto.logical.twophase = false;
options.proto.logical.origin = pstrdup(MySubscription->origin);
+ options.proto.logical.min_send_delay = 0;
if (!am_tablesync_worker())
{
+ /*
+ * Time-delayed logical replication does not support tablesync
+ * workers, so only the leader apply worker can request walsenders to
+ * apply delay on the publisher side.
+ */
+ if (server_version >= 160000 && MySubscription->minsenddelay > 0)
+ options.proto.logical.min_send_delay = MySubscription->minsenddelay;
+
/*
* Even when the two_phase mode is requested by the user, it remains
* as the tri-state PENDING until all tablesyncs have reached READY
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 98377c094b..be0095cf52 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -285,6 +285,7 @@ parse_output_parameters(List *options, PGOutputData *data)
bool streaming_given = false;
bool two_phase_option_given = false;
bool origin_option_given = false;
+ bool min_send_delay_option_given = false;
data->binary = false;
data->streaming = LOGICALREP_STREAM_OFF;
@@ -396,6 +397,32 @@ parse_output_parameters(List *options, PGOutputData *data)
errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("unrecognized origin value: \"%s\"", data->origin));
}
+ else if (strcmp(defel->defname, "min_send_delay") == 0)
+ {
+ unsigned long parsed;
+ char *endptr;
+
+ if (min_send_delay_option_given)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("conflicting or redundant options"));
+ min_send_delay_option_given = true;
+
+ errno = 0;
+ parsed = strtoul(strVal(defel->arg), &endptr, 10);
+ if (errno != 0 || *endptr != '\0')
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid min_send_delay")));
+
+ if (parsed > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("min_send_delay \"%s\" out of range",
+ strVal(defel->arg))));
+
+ data->min_send_delay = (int32) parsed;
+ }
else
elog(ERROR, "unrecognized pgoutput option: %s", defel->defname);
}
@@ -501,6 +528,15 @@ pgoutput_startup(LogicalDecodingContext *ctx, OutputPluginOptions *opt,
else
ctx->twophase_opt_given = true;
+ if (data->min_send_delay &&
+ data->protocol_version < LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("requested proto_version=%d does not support delay sending data, need %d or higher",
+ data->protocol_version, LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM)));
+ else
+ ctx->min_send_delay = data->min_send_delay;
+
/* Init publication state. */
data->publications = NIL;
publications_valid = false;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 2f3c964824..522f7600a1 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -148,7 +148,7 @@ create_logical_replication_slot(char *name, char *plugin,
XL_ROUTINE(.page_read = read_local_xlog_page,
.segment_open = wal_segment_open,
.segment_close = wal_segment_close),
- NULL, NULL, NULL);
+ NULL, NULL, NULL, NULL);
/*
* If caller needs us to determine the decoding start point, do so now.
@@ -481,7 +481,7 @@ pg_logical_replication_slot_advance(XLogRecPtr moveto)
XL_ROUTINE(.page_read = read_local_xlog_page,
.segment_open = wal_segment_open,
.segment_close = wal_segment_close),
- NULL, NULL, NULL);
+ NULL, NULL, NULL, NULL);
/*
* Start reading at the slot's restart_lsn, which we know to point to
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 75e8363e24..a4f03ddba1 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -252,6 +252,7 @@ static void WalSndPrepareWrite(LogicalDecodingContext *ctx, XLogRecPtr lsn, Tran
static void WalSndWriteData(LogicalDecodingContext *ctx, XLogRecPtr lsn, TransactionId xid, bool last_write);
static void WalSndUpdateProgress(LogicalDecodingContext *ctx, XLogRecPtr lsn, TransactionId xid,
bool skipped_xact);
+static void WalSndDelay(LogicalDecodingContext *ctx, TimestampTz delay_start);
static XLogRecPtr WalSndWaitForWal(XLogRecPtr loc);
static void LagTrackerWrite(XLogRecPtr lsn, TimestampTz local_flush_time);
static TimeOffset LagTrackerRead(int head, XLogRecPtr lsn, TimestampTz now);
@@ -1126,7 +1127,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
.segment_open = WalSndSegmentOpen,
.segment_close = wal_segment_close),
WalSndPrepareWrite, WalSndWriteData,
- WalSndUpdateProgress);
+ WalSndUpdateProgress, WalSndDelay);
/*
* Signal that we don't need the timeout mechanism. We're just
@@ -1285,7 +1286,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
.segment_open = WalSndSegmentOpen,
.segment_close = wal_segment_close),
WalSndPrepareWrite, WalSndWriteData,
- WalSndUpdateProgress);
+ WalSndUpdateProgress, WalSndDelay);
xlogreader = logical_decoding_ctx->reader;
WalSndSetState(WALSNDSTATE_CATCHUP);
@@ -3849,3 +3850,75 @@ LagTrackerRead(int head, XLogRecPtr lsn, TimestampTz now)
Assert(time != 0);
return now - time;
}
+
+/*
+ * LogicalDecodingContext 'delay' callback.
+ *
+ * Wait long enough to make sure a transaction is applied at least that
+ * period behind the publisher.
+ */
+static void
+WalSndDelay(LogicalDecodingContext *ctx, TimestampTz delay_start)
+{
+ /* Wait till delayUntil by the latch mechanism */
+ while (true)
+ {
+ TimestampTz delayUntil;
+ long diffms;
+ long timeout_interval_ms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* This might change wal_sender_timeout */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /* Check for input from the client */
+ ProcessRepliesIfAny();
+
+ /* die if timeout was reached */
+ WalSndCheckTimeOut();
+
+ /* Send keepalive if the time has come */
+ WalSndKeepaliveIfNecessary();
+
+ /* Try to flush pending output to the client */
+ if (pq_flush_if_writable() != 0)
+ WalSndShutdown();
+
+ /*
+ * If we've requested to shut down, exit the process.
+ *
+ * Note that WalSndDone() cannot be used here because the delaying
+ * changes will be sent in the function.
+ */
+ if (got_STOPPING)
+ WalSndShutdown();
+
+ delayUntil = TimestampTzPlusMilliseconds(delay_start, ctx->min_send_delay);
+ diffms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delayUntil);
+
+ /*
+ * Exit without arming the latch if it's already past time to send
+ * this transaction.
+ */
+ if (diffms <= 0)
+ break;
+
+ /* Sleep until appropriate time. */
+ timeout_interval_ms = WalSndComputeSleeptime(GetCurrentTimestamp());
+
+ elog(DEBUG2, "time-delayed replication for txid %u, delay_time = %d ms, remaining wait time: %ld ms",
+ ctx->write_xid, (int) ctx->min_send_delay, diffms);
+
+ /* Sleep until we get reply from worker or we time out */
+ WalSndWait(WL_SOCKET_READABLE,
+ Min(timeout_interval_ms, diffms),
+ WAIT_EVENT_WALSENDER_SEND_DELAY);
+ }
+}
diff --git a/src/backend/utils/activity/wait_event.c b/src/backend/utils/activity/wait_event.c
index cb99cc6339..76c19fe11d 100644
--- a/src/backend/utils/activity/wait_event.c
+++ b/src/backend/utils/activity/wait_event.c
@@ -515,6 +515,9 @@ pgstat_get_wait_timeout(WaitEventTimeout w)
case WAIT_EVENT_VACUUM_TRUNCATE:
event_name = "VacuumTruncate";
break;
+ case WAIT_EVENT_WALSENDER_SEND_DELAY:
+ event_name = "WalSenderSendDelay";
+ break;
/* no default case, so that compiler will warn */
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..bd95747840 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminsenddelay;
int i,
ntups;
@@ -4546,9 +4547,13 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminsenddelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n"
+ " 0 AS subminsenddelay\n",
+ LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4581,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminsenddelay = PQfnumber(res, "subminsenddelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4612,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminsenddelay =
+ atoi(PQgetvalue(res, i, i_subminsenddelay));
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4687,6 +4695,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminsenddelay > 0)
+ appendPQExpBuffer(query, ", min_send_delay = '%d ms'", subinfo->subminsenddelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e7cbd8d7ed..24e0f6737f 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int subminsenddelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..c7d303a168 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_send_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminsenddelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min send delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e1882eaea..6643db6f55 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_send_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_send_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..69ae4314b4 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -74,6 +74,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
+ int32 subminsenddelay; /* Replication send delay (ms) */
+
bool subenabled; /* True if the subscription is enabled (the
* worker should be running) */
@@ -122,6 +124,7 @@ typedef struct Subscription
* skipped */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
+ int32 minsenddelay; /* Replication send delay (ms) */
bool enabled; /* Indicates if the subscription is enabled */
bool binary; /* Indicates if the subscription wants data in
* binary format */
diff --git a/src/include/replication/logical.h b/src/include/replication/logical.h
index 5f49554ea0..603c37b6ce 100644
--- a/src/include/replication/logical.h
+++ b/src/include/replication/logical.h
@@ -30,6 +30,10 @@ typedef void (*LogicalOutputPluginWriterUpdateProgress) (struct LogicalDecodingC
bool skipped_xact
);
+typedef void (*LogicalOutputPluginWriterDelay) (struct LogicalDecodingContext *lr,
+ TimestampTz start_time
+);
+
typedef struct LogicalDecodingContext
{
/* memory context this is all allocated in */
@@ -64,6 +68,7 @@ typedef struct LogicalDecodingContext
LogicalOutputPluginWriterPrepareWrite prepare_write;
LogicalOutputPluginWriterWrite write;
LogicalOutputPluginWriterUpdateProgress update_progress;
+ LogicalOutputPluginWriterDelay delay;
/*
* Output buffer.
@@ -100,6 +105,8 @@ typedef struct LogicalDecodingContext
*/
bool twophase_opt_given;
+ int32 min_send_delay;
+
/*
* State for writing output.
*/
@@ -121,14 +128,16 @@ extern LogicalDecodingContext *CreateInitDecodingContext(const char *plugin,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress);
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay);
extern LogicalDecodingContext *CreateDecodingContext(XLogRecPtr start_lsn,
List *output_plugin_options,
bool fast_forward,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress);
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay);
extern void DecodingContextFindStartpoint(LogicalDecodingContext *ctx);
extern bool DecodingContextReady(LogicalDecodingContext *ctx);
extern void FreeDecodingContext(LogicalDecodingContext *ctx);
diff --git a/src/include/replication/pgoutput.h b/src/include/replication/pgoutput.h
index b4a8015403..d2fde09e00 100644
--- a/src/include/replication/pgoutput.h
+++ b/src/include/replication/pgoutput.h
@@ -30,6 +30,7 @@ typedef struct PGOutputData
bool messages;
bool two_phase;
char *origin;
+ int32 min_send_delay;
} PGOutputData;
#endif /* PGOUTPUT_H */
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index decffe352d..c20969aed7 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -187,6 +187,7 @@ typedef struct
* prepare time */
char *origin; /* Only publish data originating from the
* specified origin */
+ int32 min_send_delay; /* The minimum send delay */
} logical;
} proto;
} WalRcvStreamOptions;
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index 9ab23e1c4a..cc3a234eba 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -150,7 +150,8 @@ typedef enum
WAIT_EVENT_REGISTER_SYNC_REQUEST,
WAIT_EVENT_SPIN_DELAY,
WAIT_EVENT_VACUUM_DELAY,
- WAIT_EVENT_VACUUM_TRUNCATE
+ WAIT_EVENT_VACUUM_TRUNCATE,
+ WAIT_EVENT_WALSENDER_SEND_DELAY
} WaitEventTimeout;
/* ----------
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 3f99b14394..b14384e8e7 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -143,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -163,10 +163,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -175,10 +175,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -210,10 +210,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -247,19 +247,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -271,27 +271,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -306,10 +306,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -324,10 +324,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -363,10 +363,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -375,10 +375,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,10 +388,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -404,18 +404,48 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_send_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = foo);
+ERROR: invalid value for parameter "min_send_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_send_delay" (0 .. 2147483647)
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_send_delay = 123);
+ERROR: min_send_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_send_delay value without unit is take as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexit | 0/0
+(1 row)
+
+-- success -- min_send_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexit | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7281f5fee2..2fae3b06c7 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -286,6 +286,24 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_send_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = -1);
+
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_send_delay = 123);
+
+-- success -- min_send_delay value without unit is take as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = 123);
+\dRs+
+
+-- success -- min_send_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = '1 d');
+\dRs+
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index 91aa068c95..34807ffba2 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -515,6 +515,34 @@ $node_publisher->poll_query_until('postgres',
or die
"Timed out while waiting for apply to restart after renaming SUBSCRIPTION";
+# Test time-delayed logical replication
+#
+# If the subscription sets min_send_delay parameter, the walsender will delay
+# the transaction send for min_send_delay milliseconds. We verify this by
+# looking at the time difference between a) when tuples are inserted on the
+# publisher, and b) when those changes are replicated on the subscriber. Even
+# on slow machines, this strategy will give predictable behavior.
+
+# Set min_send_delay parameter to 3 seconds
+my $delay = 3;
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_renamed SET (min_send_delay = '${delay}s')");
+
+# Before doing the insertion, get the current timestamp that will be
+# used as a comparison base.
+my $publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_ins VALUES (generate_series(1101, 1120))");
+
+# The publisher waits for the replication to complete
+$node_publisher->wait_for_catchup('tap_sub_renamed');
+
+# This test is successful if and only if the LSN has been applied with at least
+# the configured apply delay.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for non-streaming transaction"
+);
+
# check all the cleanup
$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_renamed");
--
2.27.0
Here are some review comments for patch v3-0001.
(I haven't looked at the test code yet)
======
Commit Message
1.
If the subscription sets min_send_delay parameter, an apply worker passes the
value to the publisher as an output plugin option. And then, the walsender will
delay the transaction sending for given milliseconds.
~
1a.
"an apply worker" --> "the apply worker (via walrcv_startstreaming)".
~
1b.
"And then, the walsender" --> "The walsender"
~~~
2.
The combination of parallel streaming mode and min_send_delay is not allowed.
This is because in parallel streaming mode, we start applying the transaction
stream as soon as the first change arrives without knowing the transaction's
prepare/commit time. Always waiting for the full 'min_send_delay' period might
include unnecessary delay.
~
Is there another reason not to support this?
Even if streaming + min_send_delay incurs some extra delay, is that a
reason to reject outright the combination? What difference will the
potential of a few extra seconds overhead make when min_send_delay is
more likely to be far greater (e.g. minutes or hours)?
~~~
3.
The other possibility was to apply the delay at the end of the parallel apply
transaction but that would cause issues related to resource bloat and locks being
held for a long time.
~
Is this explanation still relevant now you are doing pub-side delays?
======
doc/src/sgml/catalogs.sgml
4.
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminsenddelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum delay, in milliseconds, by the publisher to send changes
+ </para></entry>
+ </row>
"by the publisher to send changes" --> "by the publisher before sending changes"
======
doc/src/sgml/logical-replication.sgml
5.
+ <para>
+ A publication can delay sending changes to the subscription by specifying
+ the <literal>min_send_delay</literal> subscription parameter. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
~
This description seemed backwards because IIUC the PUBLICATION has
nothing to do with the delay really, the walsender is told what to do
by the SUBSCRIPTION. Anyway, this paragraph is in the "Subscriber"
section, so mentioning publications was a bit confusing.
SUGGESTION
A subscription can delay the receipt of changes by specifying the
min_send_delay subscription parameter. See ...
======
doc/src/sgml/monitoring.sgml
6.
+ <row>
+ <entry><literal>WalSenderSendDelay</literal></entry>
+ <entry>Waiting while sending changes for time-delayed logical replication
+ in the WAL sender process.</entry>
+ </row>
Should this say "Waiting before sending changes", instead of "Waiting
while sending changes"?
======
doc/src/sgml/ref/create_subscription.sgml
7.
+ <para>
+ By default, the publisher sends changes as soon as possible. This
+ parameter allows the user to delay the publisher to send changes by
+ given time period. If the value is specified without units, it is
+ taken as milliseconds. The default is zero (no delay). See
+ <xref linkend="config-setting-names-values"/> for details on the
+ available valid time units.
+ </para>
"to delay the publisher to send changes" --> "to delay changes"
~~~
8.
+ <para>
+ The delay is effective only when the initial table synchronization
+ has been finished and the publisher decides to send a particular
+ transaction downstream. The delay does not take into account the
+ overhead of time spent in transferring the transaction, which means
+ that the arrival time at the subscriber may be delayed more than the
+ given time.
+ </para>
I'm not sure about this mention about only "effective only when the
initial table synchronization has been finished"... Now that the delay
is pub-side I don't know that it is true anymore. The tablesync worker
will try to synchronize with the apply worker. IIUC during this
"synchronization" phase the apply worker might be getting delayed by
its own walsender, so therefore the tablesync might also be delayed
(due to syncing with the apply worker) won't it?
======
src/backend/commands/subscriptioncmds.c
9.
+ /*
+ * translator: the first %s is a string of the form "parameter > 0"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_send_delay > 0", "streaming = parallel"));
+
+
}
Excessive whitespace.
======
src/backend/replication/logical/worker.c
10. ApplyWorkerMain
+ /*
+ * Time-delayed logical replication does not support tablesync
+ * workers, so only the leader apply worker can request walsenders to
+ * apply delay on the publisher side.
+ */
+ if (server_version >= 160000 && MySubscription->minsenddelay > 0)
+ options.proto.logical.min_send_delay = MySubscription->minsenddelay;
"apply delay" --> "delay"
======
src/backend/replication/pgoutput/pgoutput.c
11.
+ errno = 0;
+ parsed = strtoul(strVal(defel->arg), &endptr, 10);
+ if (errno != 0 || *endptr != '\0')
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid min_send_delay")));
+
+ if (parsed > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("min_send_delay \"%s\" out of range",
+ strVal(defel->arg))));
Should the validation be also checking/asserting no negative numbers,
or actually should the min_send_delay be defined as a uint32 in the
first place?
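To make the concern about negative numbers concrete: in standard C, strtoul() accepts a leading minus sign and returns the negated value converted to unsigned long, without setting errno. Whether the later range check catches such an input therefore depends on the wrapped value. The following is only a sketch of an explicit check; parse_delay_ms is a hypothetical helper name and is not part of the patch.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not from the patch): parse a millisecond delay given
 * as a decimal string into *result.  Returns false for empty input, a
 * leading minus sign, trailing garbage, or a value above INT32_MAX.
 */
bool
parse_delay_ms(const char *str, int32_t *result)
{
    char       *endptr;
    unsigned long parsed;

    /* strtoul() skips leading whitespace, so skip it before the sign check */
    while (isspace((unsigned char) *str))
        str++;

    /* strtoul() would accept "-1" and return a wrapped value; reject it here */
    if (*str == '-' || *str == '\0')
        return false;

    errno = 0;
    parsed = strtoul(str, &endptr, 10);
    if (errno != 0 || endptr == str || *endptr != '\0')
        return false;

    if (parsed > INT32_MAX)
        return false;

    *result = (int32_t) parsed;
    return true;
}
```

With this shape, "123" is accepted, while "-1", "foo" and "2147483648" are all rejected before any value is stored.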
~~~
12. pgoutput_startup
@@ -501,6 +528,15 @@ pgoutput_startup(LogicalDecodingContext *ctx, OutputPluginOptions *opt,
else
ctx->twophase_opt_given = true;
+ if (data->min_send_delay &&
+ data->protocol_version < LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("requested proto_version=%d does not support delay sending data, need %d or higher",
+ data->protocol_version, LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM)));
+ else
+ ctx->min_send_delay = data->min_send_delay;
IMO it doesn't make sense to compare this new feature with the
unrelated LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM protocol
version. I think we should define a new constant
LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM (even if it has the same
value as the LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM).
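A minimal sketch of what such a dedicated constant could look like. The value 4 for the parallel-streaming protocol version is an assumption here, and check_min_send_delay is a hypothetical function name used only for illustration, not code from the patch.

```c
#include <stdbool.h>

/* Assumed value of the existing parallel-streaming protocol version. */
#define LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM 4

/*
 * Constant proposed in this review (hypothetical).  It shares the value
 * above, but names the feature that actually depends on it.
 */
#define LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM 4

/*
 * Hypothetical check: returns false when a delay was requested but the
 * negotiated protocol version is too old to support it.
 */
bool
check_min_send_delay(int protocol_version, int min_send_delay)
{
    if (min_send_delay > 0 &&
        protocol_version < LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM)
        return false;

    return true;
}
```

The point of the separate name is that the check keeps reading correctly even if the two features stop sharing a version number.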
======
src/backend/replication/walsender.c
13. WalSndDelay
+ long diffms;
+ long timeout_interval_ms;
IMO more informative names for these would make the code read better:
'diffms' --> 'remaining_wait_time_ms'
'timeout_interval_ms' --> 'timeout_sleeptime_ms'
~~~
14.
+ /* Sleep until we get reply from worker or we time out */
+ WalSndWait(WL_SOCKET_READABLE,
+ Min(timeout_interval_ms, diffms),
+ WAIT_EVENT_WALSENDER_SEND_DELAY);
Sorry, I didn't understand this comment "reply from worker"... AFAIK
here we are just sleeping, not waiting for replies from anywhere (???)
======
src/include/replication/logical.h
15.
@@ -64,6 +68,7 @@ typedef struct LogicalDecodingContext
LogicalOutputPluginWriterPrepareWrite prepare_write;
LogicalOutputPluginWriterWrite write;
LogicalOutputPluginWriterUpdateProgress update_progress;
+ LogicalOutputPluginWriterDelay delay;
~
15a.
Question: Is there some advantage to introducing another callback,
instead of just doing the delay inline?
~
15b.
Should this be a more informative member name like 'delay_send'?
~~~
16.
@@ -100,6 +105,8 @@ typedef struct LogicalDecodingContext
*/
bool twophase_opt_given;
+ int32 min_send_delay;
+
Missing comment for this new member.
------
Kind Regards,
Peter Smith.
Fujitsu Australia
Here are some review comments for the v3-0001 test code.
======
src/test/regress/sql/subscription.sql
1.
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_send_delay = 123);
"utilizing" --> "specifying"
~~~
2.
+-- success -- min_send_delay value without unit is take as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = 123);
+\dRs+
"without unit is take as" --> "without units is taken as"
~~~
3.
+-- success -- min_send_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = '1 d');
+\dRs+
"with unit is converted into ms" --> "with units other than ms is
converted to ms"
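For reference, the regression output in the patch shows '1 d' stored as 86400000. The sketch below reproduces that arithmetic; delay_to_ms is a hypothetical helper that handles only a subset of units, and the server's own parsing of time values is not shown here.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: convert a value with a time unit to milliseconds.
 * Returns -1 for an unknown unit, a negative value, or a result that does
 * not fit in the int32 range used for the stored delay.
 */
int64_t
delay_to_ms(int64_t value, const char *unit)
{
    int64_t     factor;

    if (strcmp(unit, "ms") == 0)
        factor = 1;
    else if (strcmp(unit, "s") == 0)
        factor = 1000;
    else if (strcmp(unit, "min") == 0)
        factor = 60 * 1000;
    else if (strcmp(unit, "h") == 0)
        factor = 60 * 60 * 1000;
    else if (strcmp(unit, "d") == 0)
        factor = 24 * 60 * 60 * 1000;
    else
        return -1;

    /* Reject values whose product would exceed INT32_MAX */
    if (value < 0 || value > INT32_MAX / factor)
        return -1;

    return value * factor;
}
```

Under these assumptions, 24 days is the largest whole number of days that still fits, which is consistent with the documented upper bound of 2147483647 ms in the error message.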
~~~
4. Missing tests?
Why have the previous ALTER SUBSCRIPTION tests been removed? AFAIK,
currently, there are no regression tests for error messages like:
test_sub=# ALTER SUBSCRIPTION sub1 SET (min_send_delay = 123);
ERROR: cannot set min_send_delay for subscription in parallel streaming mode
======
src/test/subscription/t/001_rep_changes.pl
5.
+# This test is successful if and only if the LSN has been applied with at least
+# the configured apply delay.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for non-streaming transaction"
+);
It's not strictly an "apply delay". Maybe this comment only needs to
say something like the following:
SUGGESTION
# This test is successful only if at least the configured delay has elapsed.
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Tue, Feb 21, 2023 at 3:31 AM Peter Smith <smithpb2250@gmail.com> wrote:
2.
The combination of parallel streaming mode and min_send_delay is not allowed.
This is because in parallel streaming mode, we start applying the transaction
stream as soon as the first change arrives without knowing the transaction's
prepare/commit time. Always waiting for the full 'min_send_delay' period might
include unnecessary delay.
~
Is there another reason not to support this?
Even if streaming + min_send_delay incurs some extra delay, is that a
reason to reject outright the combination? What difference will the
potential of a few extra seconds overhead make when min_send_delay is
more likely to be far greater (e.g. minutes or hours)?
I think the point is that we don't know the commit time at the start
of streaming, and the transaction can be quite long, in which case
adding the delay is not expected.
======
doc/src/sgml/catalogs.sgml
4.
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminsenddelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum delay, in milliseconds, by the publisher to send changes
+ </para></entry>
+ </row>
"by the publisher to send changes" --> "by the publisher before sending changes"
For the streaming (=on) case, we may end up sending changes before we
start applying the delay.
======
doc/src/sgml/monitoring.sgml
6.
+ <row>
+ <entry><literal>WalSenderSendDelay</literal></entry>
+ <entry>Waiting while sending changes for time-delayed logical replication
+ in the WAL sender process.</entry>
+ </row>
Should this say "Waiting before sending changes", instead of "Waiting
while sending changes"?
In the streaming (non-parallel) case, we may have sent some changes
before waiting, because we wait only at commit/prepare time. The
downstream won't apply such changes till commit. So, this description
makes sense, and it matches similar nearby descriptions.
8.
+ <para>
+ The delay is effective only when the initial table synchronization
+ has been finished and the publisher decides to send a particular
+ transaction downstream. The delay does not take into account the
+ overhead of time spent in transferring the transaction, which means
+ that the arrival time at the subscriber may be delayed more than the
+ given time.
+ </para>
I'm not sure about this mention about only "effective only when the
initial table synchronization has been finished"... Now that the delay
is pub-side I don't know that it is true anymore.
This will still be true because we don't wait during the initial copy
(sync). The delay happens only when the replication starts.
======
src/backend/commands/subscriptioncmds.c
======
src/backend/replication/pgoutput/pgoutput.c
11.
+ errno = 0;
+ parsed = strtoul(strVal(defel->arg), &endptr, 10);
+ if (errno != 0 || *endptr != '\0')
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid min_send_delay")));
+
+ if (parsed > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("min_send_delay \"%s\" out of range",
+ strVal(defel->arg))));
Should the validation be also checking/asserting no negative numbers,
or actually should the min_send_delay be defined as a uint32 in the
first place?
I don't see the need to change the datatype of min_send_delay
compared to what we have for min_apply_delay.
======
src/include/replication/logical.h
15.
@@ -64,6 +68,7 @@ typedef struct LogicalDecodingContext
LogicalOutputPluginWriterPrepareWrite prepare_write;
LogicalOutputPluginWriterWrite write;
LogicalOutputPluginWriterUpdateProgress update_progress;
+ LogicalOutputPluginWriterDelay delay;
~
15a.
Question: Is there some advantage to introducing another callback,
instead of just doing the delay inline?
This is required because we need to check the walsender's timeout and/or
process replies during the delay.
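As a rough illustration of that reasoning (and of the Min(timeout_interval_ms, diffms) wait quoted in item 14), the delay can be pictured as a loop that never sleeps longer than a bounded slice, so that housekeeping such as reply processing and timeout checks runs between slices. All names below (DelayHooks, delay_in_slices, FakeClock) are hypothetical, and the clock is simulated; this is a sketch of the idea, not the walsender code.

```c
/* Hypothetical hooks standing in for the walsender's clock, sleep and housekeeping. */
typedef struct DelayHooks
{
    long        (*now_ms) (void *arg);
    void        (*sleep_ms) (long ms, void *arg);
    void        (*housekeeping) (void *arg);    /* e.g. process replies, check timeouts */
    void       *arg;
} DelayHooks;

/*
 * Wait until delay_until_ms, never sleeping longer than max_slice_ms at a
 * time, and run housekeeping after every slice.  Returns the number of
 * slices slept.
 */
int
delay_in_slices(long delay_until_ms, long max_slice_ms, const DelayHooks *hooks)
{
    int         slices = 0;

    for (;;)
    {
        long        remaining = delay_until_ms - hooks->now_ms(hooks->arg);

        if (remaining <= 0)
            break;

        /* Sleep for the shorter of the remaining delay and the slice length */
        hooks->sleep_ms(remaining < max_slice_ms ? remaining : max_slice_ms,
                        hooks->arg);
        hooks->housekeeping(hooks->arg);
        slices++;
    }

    return slices;
}

/* Simulated clock for demonstration: sleeping simply advances the time. */
typedef struct FakeClock
{
    long        now;
    int         housekeeping_calls;
} FakeClock;

long
fake_now(void *arg)
{
    return ((FakeClock *) arg)->now;
}

void
fake_sleep(long ms, void *arg)
{
    ((FakeClock *) arg)->now += ms;
}

void
fake_housekeeping(void *arg)
{
    ((FakeClock *) arg)->housekeeping_calls++;
}
```

With the simulated clock, a 3000 ms delay and a 1000 ms slice give three sleeps with housekeeping after each one, which is the behavior an inline sleep for the full delay would not provide.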
--
With Regards,
Amit Kapila.
Dear Peter,
Thank you for reviewing! PSA new version.
1.
If the subscription sets min_send_delay parameter, an apply worker passes the
value to the publisher as an output plugin option. And then, the walsender will
delay the transaction sending for given milliseconds.
~
1a.
"an apply worker" --> "the apply worker (via walrcv_startstreaming)".
~
1b.
"And then, the walsender" --> "The walsender"
Fixed.
2.
The combination of parallel streaming mode and min_send_delay is not allowed.
This is because in parallel streaming mode, we start applying the transaction
stream as soon as the first change arrives without knowing the transaction's
prepare/commit time. Always waiting for the full 'min_send_delay' period might
include unnecessary delay.
~
Is there another reason not to support this?
Even if streaming + min_send_delay incurs some extra delay, is that a
reason to reject outright the combination? What difference will the
potential of a few extra seconds overhead make when min_send_delay is
more likely to be far greater (e.g. minutes or hours)?
Another case I came up with is when streamed transactions arrive continuously.
If there are many transactions to be streamed, the walsender must delay sending
each of them for the given period. This means that the arrival of transactions at
the subscriber may be delayed by approximately min_send_delay x (number of transactions).
3.
The other possibility was to apply the delay at the end of the parallel apply
transaction but that would cause issues related to resource bloat and locks being
held for a long time.
~
Is this explanation still relevant now you are doing pub-side delays?
Slightly reworded. I think the problem may occur if we delay sending the COMMIT
record for parallel-applied transactions.
doc/src/sgml/catalogs.sgml
4.
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminsenddelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum delay, in milliseconds, by the publisher to send changes
+ </para></entry>
+ </row>
"by the publisher to send changes" --> "by the publisher before sending changes"
As Amit said [1] (/messages/by-id/CAA4eK1+JwLAVAOphnZ1YTiEV_jOMAE6JgJmBE98oek2cg7XF0w@mail.gmail.com), some changes may already have been sent before the delay starts. So I changed it to
"before sending COMMIT record". What do you think?
doc/src/sgml/logical-replication.sgml
5. + <para> + A publication can delay sending changes to the subscription by specifying + the <literal>min_send_delay</literal> subscription parameter. See + <xref linkend="sql-createsubscription"/> for details. + </para>~
This description seemed backwards because IIUC the PUBLICATION has
nothing to do with the delay really, the walsender is told what to do
by the SUBSCRIPTION. Anyway, this paragraph is in the "Subscriber"
section, so mentioning publications was a bit confusing.
SUGGESTION
A subscription can delay the receipt of changes by specifying the
min_send_delay subscription parameter. See ...
Changed.
doc/src/sgml/monitoring.sgml
6.
+ <row>
+ <entry><literal>WalSenderSendDelay</literal></entry>
+ <entry>Waiting while sending changes for time-delayed logical replication
+ in the WAL sender process.</entry>
+ </row>
Should this say "Waiting before sending changes", instead of "Waiting
while sending changes"?
Per the discussion [1], I did not change this.
doc/src/sgml/ref/create_subscription.sgml
7.
+ <para>
+ By default, the publisher sends changes as soon as possible. This
+ parameter allows the user to delay the publisher to send changes by
+ given time period. If the value is specified without units, it is
+ taken as milliseconds. The default is zero (no delay). See
+ <xref linkend="config-setting-names-values"/> for details on the
+ available valid time units.
+ </para>
"to delay the publisher to send changes" --> "to delay changes"
Fixed.
8.
+ <para>
+ The delay is effective only when the initial table synchronization
+ has been finished and the publisher decides to send a particular
+ transaction downstream. The delay does not take into account the
+ overhead of time spent in transferring the transaction, which means
+ that the arrival time at the subscriber may be delayed more than the
+ given time.
+ </para>
I'm not sure about this mention about only "effective only when the
initial table synchronization has been finished"... Now that the delay
is pub-side I don't know that it is true anymore. The tablesync worker
will try to synchronize with the apply worker. IIUC during this
"synchronization" phase the apply worker might be getting delayed by
its own walsender, so therefore the tablesync might also be delayed
(due to syncing with the apply worker) won't it?
I tested this and checked the code. First of all, the tablesync worker requests WAL
without min_send_delay, so its changes are sent and applied with no delay. In that sense,
table synchronization is not affected by the feature. While checking, however, I found
that the table may be delayed in reaching the 'ready' state, because the status change
from SYNCDONE to READY is done by the apply worker.
This may in turn delay two-phase in getting to "enabled".
I added descriptions about it.
src/backend/commands/subscriptioncmds.c
9.
+ /*
+ * translator: the first %s is a string of the form "parameter > 0"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_send_delay > 0", "streaming = parallel"));
+
+ }
Excessive whitespace.
Adjusted.
src/backend/replication/logical/worker.c
10. ApplyWorkerMain
+ /*
+ * Time-delayed logical replication does not support tablesync
+ * workers, so only the leader apply worker can request walsenders to
+ * apply delay on the publisher side.
+ */
+ if (server_version >= 160000 && MySubscription->minsenddelay > 0)
+ options.proto.logical.min_send_delay = MySubscription->minsenddelay;
"apply delay" --> "delay"
Fixed.
src/backend/replication/pgoutput/pgoutput.c
11.
+ errno = 0;
+ parsed = strtoul(strVal(defel->arg), &endptr, 10);
+ if (errno != 0 || *endptr != '\0')
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid min_send_delay")));
+
+ if (parsed > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("min_send_delay \"%s\" out of range",
+ strVal(defel->arg))));
Should the validation be also checking/asserting no negative numbers,
or actually should the min_send_delay be defined as a uint32 in the
first place?
I think you are right; the min_apply_delay version did not have related code.
We must also consider the possibility that a user sends START_REPLICATION
by hand with a negative value.
Fixed.
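To illustrate the concern about negative input: strtoul() accepts a leading
minus sign and negates the converted value in the unsigned type, so an unsigned
parse cannot by itself tell "-1" from a very large number. The following is a
minimal sketch, not the code in the patch, of a signed parse that rejects
negative values explicitly. The helper name parse_delay_ms is hypothetical, and
plain C library calls with a boolean return stand in for ereport().

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of the patch): parse a min_send_delay
 * value given in milliseconds. Returns false for empty input, trailing
 * garbage, negative values, and values that do not fit in an int32.
 */
static bool
parse_delay_ms(const char *str, int32_t *result)
{
	char	   *endptr;
	long long	parsed;

	errno = 0;
	parsed = strtoll(str, &endptr, 10);

	/* No digits consumed, trailing characters, or long long overflow. */
	if (endptr == str || *endptr != '\0' || errno != 0)
		return false;

	/* A signed parse keeps the minus sign visible as a negative value. */
	if (parsed < 0 || parsed > INT32_MAX)
		return false;

	*result = (int32_t) parsed;
	return true;
}
```

With this shape, a hand-written START_REPLICATION option such as
min_send_delay '-1' is rejected by the range check instead of relying on how
the unsigned conversion happens to wrap.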
12. pgoutput_startup
@@ -501,6 +528,15 @@ pgoutput_startup(LogicalDecodingContext *ctx, OutputPluginOptions *opt,
else
ctx->twophase_opt_given = true;
+ if (data->min_send_delay &&
+ data->protocol_version < LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("requested proto_version=%d does not support delay sending data, need %d or higher",
+ data->protocol_version, LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM)));
+ else
+ ctx->min_send_delay = data->min_send_delay;
IMO it doesn't make sense to compare this new feature with the
unrelated LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM protocol
version. I think we should define a new constant
LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM (even if it has the same
value as the LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM).
Added.
src/backend/replication/walsender.c
13. WalSndDelay
+ long diffms;
+ long timeout_interval_ms;
IMO some more informative name for these would make the code read better:
'diffms' --> 'remaining_wait_time_ms'
'timeout_interval_ms' --> 'timeout_sleeptime_ms'
Changed.
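For readers following the renaming, the calculation these two variables take
part in can be sketched roughly as below. This is a simplified model under
stated assumptions, not the WalSndDelay code: timestamps are plain millisecond
counters, both function names are hypothetical, and the reason for capping each
sleep (letting the caller do periodic work between sleeps) is my assumption.

```c
#include <stdint.h>

/*
 * Hypothetical model: how long the sender still has to wait before a
 * transaction that committed at commit_time_ms may be sent. Returns 0
 * once the configured delay has already elapsed.
 */
static int64_t
remaining_wait_time_ms(int64_t now_ms, int64_t commit_time_ms,
					   int32_t min_send_delay_ms)
{
	int64_t		elapsed_ms = now_ms - commit_time_ms;
	int64_t		remaining_ms = (int64_t) min_send_delay_ms - elapsed_ms;

	return remaining_ms > 0 ? remaining_ms : 0;
}

/*
 * Hypothetical model: length of the next sleep, which is the smaller of
 * the remaining wait and the per-iteration cap (timeout_sleeptime_ms).
 */
static int64_t
next_sleep_ms(int64_t remaining_ms, int64_t timeout_sleeptime_ms)
{
	return remaining_ms < timeout_sleeptime_ms ? remaining_ms
		: timeout_sleeptime_ms;
}
```

A caller would loop, recomputing the remaining wait after every sleep, until
it reaches zero.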
14.
+ /* Sleep until we get reply from worker or we time out */
+ WalSndWait(WL_SOCKET_READABLE,
+ Min(timeout_interval_ms, diffms),
+ WAIT_EVENT_WALSENDER_SEND_DELAY);
Sorry, I didn't understand this comment "reply from worker"... AFAIK
here we are just sleeping, not waiting for replies from anywhere (???)
======
src/include/replication/logical.h
15.
@@ -64,6 +68,7 @@ typedef struct LogicalDecodingContext
LogicalOutputPluginWriterPrepareWrite prepare_write;
LogicalOutputPluginWriterWrite write;
LogicalOutputPluginWriterUpdateProgress update_progress;
+ LogicalOutputPluginWriterDelay delay;~
15a.
Question: Is there some advantage to introducing another callback,
instead of just doing the delay inline?
IIUC, walsender functions should not be called directly, because replication
slots may also be manipulated from a regular backend.
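The callback approach can be sketched as follows. This is a stripped-down
illustration, not the patch itself: the struct and function names are
hypothetical stand-ins for LogicalDecodingContext and the walsender delay
function, and the delay_calls counter exists only to make the effect
observable. The point is that a caller that is not a walsender passes NULL, so
the decoding code never reaches walsender-only logic.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct DecodingContext DecodingContext;

/* Callback type for delaying a send; set only by a walsender. */
typedef void (*DelaySendCallback) (DecodingContext *ctx, int64_t commit_time);

struct DecodingContext
{
	int32_t		min_send_delay;	/* requested delay in ms, 0 = none */
	DelaySendCallback delay_send;	/* NULL when not running in a walsender */
	int			delay_calls;	/* illustration only: counts invocations */
};

/* Stand-in for the walsender's delay function; it only records the call. */
static void
walsender_delay(DecodingContext *ctx, int64_t commit_time)
{
	(void) commit_time;
	ctx->delay_calls++;
}

/* Decoding side: delay only if requested and a callback was provided. */
static void
decode_commit(DecodingContext *ctx, int64_t commit_time)
{
	if (ctx->min_send_delay > 0 && ctx->delay_send != NULL)
		ctx->delay_send(ctx, commit_time);

	/* ... the transaction would be handed to the reorder buffer here ... */
}
```

A context created for SQL-level slot functions would leave delay_send as NULL,
which matches the extra NULL argument the patch adds for those callers.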
15b.
Should this be a more informative member name like 'delay_send'?
Changed.
16.
@@ -100,6 +105,8 @@ typedef struct LogicalDecodingContext
*/
bool twophase_opt_given;
+ int32 min_send_delay;
+
Missing comment for this new member.
Added.
[1]: /messages/by-id/CAA4eK1+JwLAVAOphnZ1YTiEV_jOMAE6JgJmBE98oek2cg7XF0w@mail.gmail.com
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Attachments:
v4-0001-Time-delayed-logical-replication-on-publisher-sid.patch (application/octet-stream)
From 0239a31307a9036236bd3342eb50a5717ae9349f Mon Sep 17 00:00:00 2001
From: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Date: Thu, 16 Feb 2023 07:52:23 +0000
Subject: [PATCH v4] Time-delayed logical replication on publisher side
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_send_delay'.
If the subscription sets min_send_delay parameter, the apply worker
(via walrcv_startstreaming) passes the value to the publisher as an output plugin
option. The walsender will delay the transaction sending for given milliseconds.
The delay does not take into account the overhead of time spent in transferring
the transaction, which means that the arrival time at the subscriber may be
delayed more than the given time.
The combination of parallel streaming mode and min_send_delay is not allowed.
This is because in parallel streaming mode, we start applying the transaction
stream as soon as the first change arrives without knowing the transaction's
prepare/commit time. Always waiting for the full 'min_send_delay' period might
include unnecessary delay.
The other possibility was to apply the delay at the end of the parallel apply
transaction but that would cause issues related to resource bloat and locks being
held for a long time.
The delay occurs before we start to send the transaction on the publisher.
Regular and prepared transactions are covered. Streamed transactions are also
covered.
Earlier versions were written by Euler Taveira, Takamichi Osumi, and Kuroda Hayato
Author: Kuroda Hayato, Takamichi Osumi, Kuroda Hayato
Reviewed-by: Amit Kapila, Peter Smith, Vignesh C, Shveta Malik,
Kyotaro Horiguchi, Shi Yu, Wang Wei, Dilip Kumar, Melih Mutlu,
Andres Freund
---
doc/src/sgml/catalogs.sgml | 10 +
doc/src/sgml/glossary.sgml | 15 ++
doc/src/sgml/logical-replication.sgml | 6 +
doc/src/sgml/monitoring.sgml | 5 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 43 +++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 119 ++++++++++-
.../libpqwalreceiver/libpqwalreceiver.c | 5 +
src/backend/replication/logical/decode.c | 18 ++
src/backend/replication/logical/logical.c | 18 +-
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/worker.c | 12 +-
src/backend/replication/pgoutput/pgoutput.c | 36 ++++
src/backend/replication/slotfuncs.c | 4 +-
src/backend/replication/walsender.c | 78 +++++++-
src/backend/utils/activity/wait_event.c | 3 +
src/bin/pg_dump/pg_dump.c | 15 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/replication/logical.h | 17 +-
src/include/replication/logicalproto.h | 6 +-
src/include/replication/pgoutput.h | 1 +
src/include/replication/walreceiver.h | 1 +
src/include/utils/wait_event.h | 3 +-
src/test/regress/expected/subscription.out | 185 +++++++++++-------
src/test/regress/sql/subscription.sql | 29 +++
src/test/subscription/t/001_rep_changes.pl | 27 +++
31 files changed, 583 insertions(+), 105 deletions(-)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..ee4c3c77e6 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7892,6 +7892,16 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminsenddelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum delay, in milliseconds, by the publisher before sending
+ COMMIT record
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subenabled</structfield> <type>bool</type>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 7c01a541fe..9ede9d05f6 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1729,6 +1729,21 @@
</glossdef>
</glossentry>
+ <glossentry id="glossary-time-delayed-replication">
+ <glossterm>Time-delayed replication</glossterm>
+ <glossdef>
+ <para>
+ Replication setup that delays the application of changes by a specified
+ minimum time-delay period.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="guc-recovery-min-apply-delay"/> for physical replication
+ and <xref linkend="sql-createsubscription"/> for logical replication.
+ </para>
+ </glossdef>
+ </glossentry>
+
<glossentry id="glossary-toast">
<glossterm>TOAST</glossterm>
<glossdef>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 1bd5660c87..fe9e7f7b26 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,12 @@
target table.
</para>
+ <para>
+ A subscription can delay the receipt of changes by specifying the
+ <literal>min_send_delay</literal> subscription parameter. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index b0b997f092..6158587644 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -2349,6 +2349,11 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
<entry>Waiting to acquire an exclusive lock to truncate off any
empty pages at the end of a table vacuumed.</entry>
</row>
+ <row>
+ <entry><literal>WalSenderSendDelay</literal></entry>
+ <entry>Waiting while sending changes for time-delayed logical replication
+ in the WAL sender process.</entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 964fcbb8ff..3f238b958b 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_send_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 51c45f17c7..3862388e87 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,43 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_send_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the publisher sends changes as soon as possible. This
+ parameter allows the user to delay changes by given time period. If
+ the value is specified without units, it is taken as milliseconds.
+ The default is zero (no delay). See <xref linkend="config-setting-names-values"/>
+ for details on the available valid time units.
+ </para>
+ <para>
+ The delay is effective only when the initial table synchronization
+ has been finished. However, there is a possibility that the table
+ status written in <link linkend="catalog-pg-subscription-rel"><structname>pg_subscription_rel</structname></link>
+ will be delayed in getting to "ready" state, and also two-phase
+ (if specified) will be delayed in getting to "enabled".
+ </para>
+ <para>
+ The delay does not take into account the overhead of time spent
+ transferring the transaction. Therefore, the arrival time at the
+ subscriber may be delayed more than the specified
+ <literal>min_send_delay</literal> time.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication means there is a much longer time between
+ making a change on the publisher, and that change being committed
+ on the subscriber. This can impact the performance of synchronous
+ replication. See <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -420,6 +456,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_send_delay</literal> parameter is not allowed when
+ streaming in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..63a10b06d1 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -66,6 +66,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->skiplsn = subform->subskiplsn;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
+ sub->minsenddelay = subform->subminsenddelay;
sub->enabled = subform->subenabled;
sub->binary = subform->subbinary;
sub->stream = subform->substream;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 34ca0e739f..54a705d71b 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1314,9 +1314,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subminsenddelay,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 464db6d247..4a8cd47171 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_SEND_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int32 min_send_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -100,7 +102,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int32 defGetMinSendDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_SEND_DELAY))
+ opts->min_send_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_SEND_DELAY) &&
+ strcmp(defel->defname, "min_send_delay") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_SEND_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_SEND_DELAY;
+ opts->min_send_delay = defGetMinSendDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +417,29 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_send_delay is not
+ * allowed. This is because in parallel streaming mode, the walsender
+ * starts sending the transaction stream without knowing the prepare/commit
+ * time of the transaction. Always waiting for the full 'min_send_delay'
+ * time to send may introduce unnecessary delay.
+ *
+ * The other possibility was to wait sending COMMIT record of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_SEND_DELAY) &&
+ opts->min_send_delay > 0 &&
+ opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ /*
+ * translator: the first %s is a string of the form "parameter > 0"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_send_delay > 0", "streaming = parallel"));
}
/*
@@ -560,7 +596,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_SEND_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -628,6 +665,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
+ values[Anum_pg_subscription_subminsenddelay - 1] = Int32GetDatum(opts.min_send_delay);
values[Anum_pg_subscription_subenabled - 1] = BoolGetDatum(opts.enabled);
values[Anum_pg_subscription_subbinary - 1] = BoolGetDatum(opts.binary);
values[Anum_pg_subscription_substream - 1] = CharGetDatum(opts.streaming);
@@ -1054,7 +1092,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_SEND_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1136,19 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_send_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
+ !IsSet(opts.specified_opts, SUBOPT_MIN_SEND_DELAY) &&
+ sub->minsenddelay > 0)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set parallel streaming mode for subscription with %s",
+ "min_send_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1162,26 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_SEND_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_send_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.min_send_delay > 0 &&
+ !IsSet(opts.specified_opts, SUBOPT_STREAMING) &&
+ sub->stream == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set %s for subscription in parallel streaming mode",
+ "min_send_delay"));
+
+ values[Anum_pg_subscription_subminsenddelay - 1] =
+ Int32GetDatum(opts.min_send_delay);
+ replaces[Anum_pg_subscription_subminsenddelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2195,3 +2266,45 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+/*
+ * Extract the min_send_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_send_delay
+ * accepts the same parameter format as recovery_min_apply_delay.
+ */
+static int32
+defGetMinSendDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse given string as parameter which has millisecond unit
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_send_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check both the lower boundary for the valid min_send_delay range and
+ * the upper boundary as the safeguard for some platforms where INT_MAX is
+ * wider than int32 respectively. Although parse_int() has confirmed that
+ * the result is less than or equal to INT_MAX, the value will be stored
+ * in a catalog column of int32.
+ */
+ if (result < 0 || result > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_send_delay",
+ 0, PG_INT32_MAX)));
+
+ return result;
+}
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 560ec974fa..89a72c1abe 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -443,6 +443,11 @@ libpqrcv_startstreaming(WalReceiverConn *conn,
PQserverVersion(conn->streamConn) >= 140000)
appendStringInfoString(&cmd, ", binary 'true'");
+ if (options->proto.logical.min_send_delay > 0 &&
+ PQserverVersion(conn->streamConn) >= 160000)
+ appendStringInfo(&cmd, ", min_send_delay '%d'",
+ options->proto.logical.min_send_delay);
+
appendStringInfoChar(&cmd, ')');
}
else
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index a53e23c679..2674444894 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -679,6 +679,15 @@ DecodeCommit(LogicalDecodingContext *ctx, XLogRecordBuffer *buf,
}
else
{
+ /*
+ * Delay sending the changes if required. For streaming transactions,
+ * this means a delay in sending the last stream but that is OK because
+ * on the downstream the changes will be applied only after receiving
+ * the last stream.
+ */
+ if (ctx->min_send_delay > 0 && ctx->delay_send)
+ ctx->delay_send(ctx, commit_time);
+
ReorderBufferCommit(ctx->reorder, xid, buf->origptr, buf->endptr,
commit_time, origin_id, origin_lsn);
}
@@ -763,6 +772,15 @@ DecodePrepare(LogicalDecodingContext *ctx, XLogRecordBuffer *buf,
buf->origptr, buf->endptr);
}
+ /*
+ * Delay sending the changes if required. For streaming transactions,
+ * this means a delay in sending the last stream but that is OK because
+ * on the downstream the changes will be applied only after receiving
+ * the last stream.
+ */
+ if (ctx->min_send_delay > 0 && ctx->delay_send)
+ ctx->delay_send(ctx, prepare_time);
+
/* replay actions of all transaction + subtransactions in order */
ReorderBufferPrepare(ctx->reorder, xid, parsed->twophase_gid);
diff --git a/src/backend/replication/logical/logical.c b/src/backend/replication/logical/logical.c
index c3ec97a0a6..e4dd822cdc 100644
--- a/src/backend/replication/logical/logical.c
+++ b/src/backend/replication/logical/logical.c
@@ -156,7 +156,8 @@ StartupDecodingContext(List *output_plugin_options,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress)
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send)
{
ReplicationSlot *slot;
MemoryContext context,
@@ -293,6 +294,7 @@ StartupDecodingContext(List *output_plugin_options,
ctx->prepare_write = prepare_write;
ctx->write = do_write;
ctx->update_progress = update_progress;
+ ctx->delay_send = delay_send;
ctx->output_plugin_options = output_plugin_options;
@@ -316,7 +318,7 @@ StartupDecodingContext(List *output_plugin_options,
* marking WAL reserved beforehand. In that scenario, it's up to the
* caller to guarantee that WAL remains available.
* xl_routine -- XLogReaderRoutine for underlying XLogReader
- * prepare_write, do_write, update_progress --
+ * prepare_write, do_write, update_progress, delay_send --
* callbacks that perform the use-case dependent, actual, work.
*
* Needs to be called while in a memory context that's at least as long lived
@@ -334,7 +336,8 @@ CreateInitDecodingContext(const char *plugin,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress)
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send)
{
TransactionId xmin_horizon = InvalidTransactionId;
ReplicationSlot *slot;
@@ -435,7 +438,7 @@ CreateInitDecodingContext(const char *plugin,
ctx = StartupDecodingContext(NIL, restart_lsn, xmin_horizon,
need_full_snapshot, false,
xl_routine, prepare_write, do_write,
- update_progress);
+ update_progress, delay_send);
/* call output plugin initialization callback */
old_context = MemoryContextSwitchTo(ctx->context);
@@ -475,7 +478,7 @@ CreateInitDecodingContext(const char *plugin,
* xl_routine
* XLogReaderRoutine used by underlying xlogreader
*
- * prepare_write, do_write, update_progress
+ * prepare_write, do_write, update_progress, delay_send
* callbacks that have to be filled to perform the use-case dependent,
* actual work.
*
@@ -493,7 +496,8 @@ CreateDecodingContext(XLogRecPtr start_lsn,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress)
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send)
{
LogicalDecodingContext *ctx;
ReplicationSlot *slot;
@@ -547,7 +551,7 @@ CreateDecodingContext(XLogRecPtr start_lsn,
ctx = StartupDecodingContext(output_plugin_options,
start_lsn, InvalidTransactionId, false,
fast_forward, xl_routine, prepare_write,
- do_write, update_progress);
+ do_write, update_progress, delay_send);
/* call output plugin initialization callback */
old_context = MemoryContextSwitchTo(ctx->context);
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index fa1b641a2b..960025197f 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -212,7 +212,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
.segment_open = wal_segment_open,
.segment_close = wal_segment_close),
LogicalOutputPrepareWrite,
- LogicalOutputWrite, NULL);
+ LogicalOutputWrite, NULL, NULL);
/*
* After the sanity checks in CreateDecodingContext, make sure the
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index cfb2ab6248..e68902ae34 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3898,7 +3898,8 @@ maybe_reread_subscription(void)
newsub->stream != MySubscription->stream ||
strcmp(newsub->origin, MySubscription->origin) != 0 ||
newsub->owner != MySubscription->owner ||
- !equal(newsub->publications, MySubscription->publications))
+ !equal(newsub->publications, MySubscription->publications) ||
+ newsub->minsenddelay != MySubscription->minsenddelay)
{
if (am_parallel_apply_worker())
ereport(LOG,
@@ -4617,9 +4618,18 @@ ApplyWorkerMain(Datum main_arg)
options.proto.logical.twophase = false;
options.proto.logical.origin = pstrdup(MySubscription->origin);
+ options.proto.logical.min_send_delay = 0;
if (!am_tablesync_worker())
{
+ /*
+ * Time-delayed logical replication does not support tablesync
+ * workers, so only the leader apply worker can request walsenders to
+ * delay on the publisher side.
+ */
+ if (server_version >= 160000 && MySubscription->minsenddelay > 0)
+ options.proto.logical.min_send_delay = MySubscription->minsenddelay;
+
/*
* Even when the two_phase mode is requested by the user, it remains
* as the tri-state PENDING until all tablesyncs have reached READY
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 98377c094b..b7271d0f8a 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -285,6 +285,7 @@ parse_output_parameters(List *options, PGOutputData *data)
bool streaming_given = false;
bool two_phase_option_given = false;
bool origin_option_given = false;
+ bool min_send_delay_option_given = false;
data->binary = false;
data->streaming = LOGICALREP_STREAM_OFF;
@@ -396,6 +397,32 @@ parse_output_parameters(List *options, PGOutputData *data)
errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("unrecognized origin value: \"%s\"", data->origin));
}
+ else if (strcmp(defel->defname, "min_send_delay") == 0)
+ {
+ unsigned long parsed;
+ char *endptr;
+
+ if (min_send_delay_option_given)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("conflicting or redundant options"));
+ min_send_delay_option_given = true;
+
+ errno = 0;
+ parsed = strtoul(strVal(defel->arg), &endptr, 10);
+ if (errno != 0 || endptr == strVal(defel->arg) || *endptr != '\0')
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid min_send_delay")));
+
+ if (parsed > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("min_send_delay \"%s\" out of range",
+ strVal(defel->arg))));
+
+ data->min_send_delay = (int32) parsed;
+ }
else
elog(ERROR, "unrecognized pgoutput option: %s", defel->defname);
}
@@ -501,6 +528,15 @@ pgoutput_startup(LogicalDecodingContext *ctx, OutputPluginOptions *opt,
else
ctx->twophase_opt_given = true;
+ if (data->min_send_delay &&
+ data->protocol_version < LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("requested proto_version=%d does not support delayed sending of data, need %d or higher",
+ data->protocol_version, LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM)));
+ else
+ ctx->min_send_delay = data->min_send_delay;
+
/* Init publication state. */
data->publications = NIL;
publications_valid = false;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 2f3c964824..522f7600a1 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -148,7 +148,7 @@ create_logical_replication_slot(char *name, char *plugin,
XL_ROUTINE(.page_read = read_local_xlog_page,
.segment_open = wal_segment_open,
.segment_close = wal_segment_close),
- NULL, NULL, NULL);
+ NULL, NULL, NULL, NULL);
/*
* If caller needs us to determine the decoding start point, do so now.
@@ -481,7 +481,7 @@ pg_logical_replication_slot_advance(XLogRecPtr moveto)
XL_ROUTINE(.page_read = read_local_xlog_page,
.segment_open = wal_segment_open,
.segment_close = wal_segment_close),
- NULL, NULL, NULL);
+ NULL, NULL, NULL, NULL);
/*
* Start reading at the slot's restart_lsn, which we know to point to
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 75e8363e24..9537fba7df 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -252,6 +252,7 @@ static void WalSndPrepareWrite(LogicalDecodingContext *ctx, XLogRecPtr lsn, Tran
static void WalSndWriteData(LogicalDecodingContext *ctx, XLogRecPtr lsn, TransactionId xid, bool last_write);
static void WalSndUpdateProgress(LogicalDecodingContext *ctx, XLogRecPtr lsn, TransactionId xid,
bool skipped_xact);
+static void WalSndDelay(LogicalDecodingContext *ctx, TimestampTz delay_start);
static XLogRecPtr WalSndWaitForWal(XLogRecPtr loc);
static void LagTrackerWrite(XLogRecPtr lsn, TimestampTz local_flush_time);
static TimeOffset LagTrackerRead(int head, XLogRecPtr lsn, TimestampTz now);
@@ -1126,7 +1127,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
.segment_open = WalSndSegmentOpen,
.segment_close = wal_segment_close),
WalSndPrepareWrite, WalSndWriteData,
- WalSndUpdateProgress);
+ WalSndUpdateProgress, WalSndDelay);
/*
* Signal that we don't need the timeout mechanism. We're just
@@ -1285,7 +1286,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
.segment_open = WalSndSegmentOpen,
.segment_close = wal_segment_close),
WalSndPrepareWrite, WalSndWriteData,
- WalSndUpdateProgress);
+ WalSndUpdateProgress, WalSndDelay);
xlogreader = logical_decoding_ctx->reader;
WalSndSetState(WALSNDSTATE_CATCHUP);
@@ -3849,3 +3850,76 @@ LagTrackerRead(int head, XLogRecPtr lsn, TimestampTz now)
Assert(time != 0);
return now - time;
}
+
+/*
+ * LogicalDecodingContext 'delay' callback.
+ *
+ * Wait long enough to ensure that a transaction is applied at least
+ * min_send_delay milliseconds behind the publisher.
+ */
+static void
+WalSndDelay(LogicalDecodingContext *ctx, TimestampTz delay_start)
+{
+ /* Wait until delayUntil using the latch mechanism */
+ while (true)
+ {
+ TimestampTz delayUntil;
+ long remaining_wait_time_ms;
+ long timeout_sleeptime_ms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* This might change wal_sender_timeout */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /* Check for input from the client */
+ ProcessRepliesIfAny();
+
+ /* die if timeout was reached */
+ WalSndCheckTimeOut();
+
+ /* Send keepalive if the time has come */
+ WalSndKeepaliveIfNecessary();
+
+ /* Try to flush pending output to the client */
+ if (pq_flush_if_writable() != 0)
+ WalSndShutdown();
+
+ /*
+ * If we've been requested to shut down, exit the process.
+ *
+ * Note that WalSndDone() cannot be used here because it would send the
+ * changes that are still being delayed.
+ */
+ if (got_STOPPING)
+ WalSndShutdown();
+
+ delayUntil = TimestampTzPlusMilliseconds(delay_start, ctx->min_send_delay);
+ remaining_wait_time_ms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delayUntil);
+
+ /*
+ * Exit without arming the latch if it's already past time to send
+ * this transaction.
+ */
+ if (remaining_wait_time_ms <= 0)
+ break;
+
+ /* Compute how long we may sleep before a keepalive or timeout check is due. */
+ timeout_sleeptime_ms = WalSndComputeSleeptime(GetCurrentTimestamp());
+
+ elog(DEBUG2, "time-delayed replication for txid %u, delay_time = %d ms, remaining wait time: %ld ms",
+ ctx->write_xid, (int) ctx->min_send_delay,
+ remaining_wait_time_ms);
+
+ /* Sleep until we get reply from worker or we time out */
+ WalSndWait(WL_SOCKET_READABLE,
+ Min(timeout_sleeptime_ms, remaining_wait_time_ms),
+ WAIT_EVENT_WALSENDER_SEND_DELAY);
+ }
+}
diff --git a/src/backend/utils/activity/wait_event.c b/src/backend/utils/activity/wait_event.c
index cb99cc6339..76c19fe11d 100644
--- a/src/backend/utils/activity/wait_event.c
+++ b/src/backend/utils/activity/wait_event.c
@@ -515,6 +515,9 @@ pgstat_get_wait_timeout(WaitEventTimeout w)
case WAIT_EVENT_VACUUM_TRUNCATE:
event_name = "VacuumTruncate";
break;
+ case WAIT_EVENT_WALSENDER_SEND_DELAY:
+ event_name = "WalSenderSendDelay";
+ break;
/* no default case, so that compiler will warn */
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..bd95747840 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminsenddelay;
int i,
ntups;
@@ -4546,9 +4547,13 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminsenddelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n"
+ " 0 AS subminsenddelay\n",
+ LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4581,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminsenddelay = PQfnumber(res, "subminsenddelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4612,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminsenddelay =
+ atoi(PQgetvalue(res, i, i_subminsenddelay));
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4687,6 +4695,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminsenddelay > 0)
+ appendPQExpBuffer(query, ", min_send_delay = '%d ms'", subinfo->subminsenddelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e7cbd8d7ed..24e0f6737f 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int subminsenddelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..c7d303a168 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_send_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminsenddelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min send delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e1882eaea..6643db6f55 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_send_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_send_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..69ae4314b4 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -74,6 +74,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
+ int32 subminsenddelay; /* Replication send delay (ms) */
+
bool subenabled; /* True if the subscription is enabled (the
* worker should be running) */
@@ -122,6 +124,7 @@ typedef struct Subscription
* skipped */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
+ int32 minsenddelay; /* Replication send delay (ms) */
bool enabled; /* Indicates if the subscription is enabled */
bool binary; /* Indicates if the subscription wants data in
* binary format */
diff --git a/src/include/replication/logical.h b/src/include/replication/logical.h
index 5f49554ea0..ae6d873a94 100644
--- a/src/include/replication/logical.h
+++ b/src/include/replication/logical.h
@@ -30,6 +30,10 @@ typedef void (*LogicalOutputPluginWriterUpdateProgress) (struct LogicalDecodingC
bool skipped_xact
);
+typedef void (*LogicalOutputPluginWriterDelay) (struct LogicalDecodingContext *lr,
+ TimestampTz start_time
+);
+
typedef struct LogicalDecodingContext
{
/* memory context this is all allocated in */
@@ -64,6 +68,7 @@ typedef struct LogicalDecodingContext
LogicalOutputPluginWriterPrepareWrite prepare_write;
LogicalOutputPluginWriterWrite write;
LogicalOutputPluginWriterUpdateProgress update_progress;
+ LogicalOutputPluginWriterDelay delay_send;
/*
* Output buffer.
@@ -100,6 +105,12 @@ typedef struct LogicalDecodingContext
*/
bool twophase_opt_given;
+ /*
+ * The minimum delay, in milliseconds, that the publisher waits before
+ * sending a COMMIT/PREPARE record
+ */
+ int32 min_send_delay;
+
/*
* State for writing output.
*/
@@ -121,14 +132,16 @@ extern LogicalDecodingContext *CreateInitDecodingContext(const char *plugin,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress);
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send);
extern LogicalDecodingContext *CreateDecodingContext(XLogRecPtr start_lsn,
List *output_plugin_options,
bool fast_forward,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress);
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send);
extern void DecodingContextFindStartpoint(LogicalDecodingContext *ctx);
extern bool DecodingContextReady(LogicalDecodingContext *ctx);
extern void FreeDecodingContext(LogicalDecodingContext *ctx);
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 0ea2df5088..46faadbd7a 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -36,13 +36,17 @@
* LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM is the minimum protocol version
* where we support applying large streaming transactions in parallel.
* Introduced in PG16.
+ *
+ * LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM is the minimum protocol version
+ * with support for delaying the sending of transactions. Introduced in PG16.
*/
#define LOGICALREP_PROTO_MIN_VERSION_NUM 1
#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
#define LOGICALREP_PROTO_TWOPHASE_VERSION_NUM 3
#define LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM 4
-#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM
+#define LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM 4
+#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM
/*
* Logical message types
diff --git a/src/include/replication/pgoutput.h b/src/include/replication/pgoutput.h
index b4a8015403..d2fde09e00 100644
--- a/src/include/replication/pgoutput.h
+++ b/src/include/replication/pgoutput.h
@@ -30,6 +30,7 @@ typedef struct PGOutputData
bool messages;
bool two_phase;
char *origin;
+ int32 min_send_delay;
} PGOutputData;
#endif /* PGOUTPUT_H */
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index decffe352d..c20969aed7 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -187,6 +187,7 @@ typedef struct
* prepare time */
char *origin; /* Only publish data originating from the
* specified origin */
+ int32 min_send_delay; /* The minimum send delay (ms) */
} logical;
} proto;
} WalRcvStreamOptions;
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index 9ab23e1c4a..cc3a234eba 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -150,7 +150,8 @@ typedef enum
WAIT_EVENT_REGISTER_SYNC_REQUEST,
WAIT_EVENT_SPIN_DELAY,
WAIT_EVENT_VACUUM_DELAY,
- WAIT_EVENT_VACUUM_TRUNCATE
+ WAIT_EVENT_VACUUM_TRUNCATE,
+ WAIT_EVENT_WALSENDER_SEND_DELAY
} WaitEventTimeout;
/* ----------
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 3f99b14394..2027316233 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -143,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -163,10 +163,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -175,10 +175,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -210,10 +210,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -247,19 +247,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -271,27 +271,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -306,10 +306,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -324,10 +324,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -363,10 +363,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -375,10 +375,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,10 +388,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -404,20 +404,61 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_send_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = foo);
+ERROR: invalid value for parameter "min_send_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_send_delay" (0 .. 2147483647)
+-- fail - specifying streaming = parallel with time-delayed replication is not
+-- supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_send_delay = 123);
+ERROR: min_send_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_send_delay value without units is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexit | 0/0
+(1 row)
+
+-- success -- min_send_delay value with units other than ms is converted to ms
+-- and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexit | 0/0
(1 row)
+-- fail - alter subscription with streaming = parallel should fail when
+-- time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot set parallel streaming mode for subscription with min_send_delay
+-- fail - alter subscription with min_send_delay should fail when
+-- streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = 123);
+ERROR: cannot set min_send_delay for subscription in parallel streaming mode
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7281f5fee2..46bf4a27d9 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -286,6 +286,35 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_send_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = -1);
+
+-- fail - specifying streaming = parallel with time-delayed replication is not
+-- supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_send_delay = 123);
+
+-- success -- min_send_delay value without units is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = 123);
+\dRs+
+
+-- success -- min_send_delay value with units other than ms is converted to ms
+-- and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = '1 d');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when
+-- time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+
+-- fail - alter subscription with min_send_delay should fail when
+-- streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = 123);
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index 91aa068c95..063a98fde9 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -515,6 +515,33 @@ $node_publisher->poll_query_until('postgres',
or die
"Timed out while waiting for apply to restart after renaming SUBSCRIPTION";
+# Test time-delayed logical replication
+#
+# If the subscription sets min_send_delay parameter, the walsender will delay
+# the transaction send for min_send_delay milliseconds. We verify this by
+# looking at the time difference between a) when tuples are inserted on the
+# publisher, and b) when those changes are replicated on the subscriber. Even
+# on slow machines, this strategy will give predictable behavior.
+
+# Set min_send_delay parameter to 3 seconds
+my $delay = 3;
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_renamed SET (min_send_delay = '${delay}s')");
+
+# Before doing the insertion, get the current timestamp that will be
+# used as a comparison base.
+my $publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_ins VALUES (generate_series(1101, 1120))");
+
+# The publisher waits for the replication to complete
+$node_publisher->wait_for_catchup('tap_sub_renamed');
+
+# This test is successful only if at least the configured delay has elapsed.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for non-streaming transaction"
+);
+
# check all the cleanup
$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_renamed");
--
2.27.0
Dear Peter,
1.
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_send_delay = 123);
"utilizing" --> "specifying"
Fixed.
2.
+-- success -- min_send_delay value without unit is take as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = 123);
+\dRs+
"without unit is take as" --> "without units is taken as"
Fixed.
3.
+-- success -- min_send_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = '1 d');
+\dRs+
"with unit is converted into ms" --> "with units other than ms is converted to ms"
Fixed.
4. Missing tests?
Why have the previous ALTER SUBSCRIPTION tests been removed? AFAIK,
currently, there are no regression tests for error messages like:
test_sub=# ALTER SUBSCRIPTION sub1 SET (min_send_delay = 123);
ERROR: cannot set min_send_delay for subscription in parallel streaming mode
These tests were missed while changing the basic design.
Added.
src/test/subscription/t/001_rep_changes.pl
5.
+# This test is successful if and only if the LSN has been applied with at least
+# the configured apply delay.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for non-streaming transaction"
+);
It's not strictly an "apply delay". Maybe this comment only needs to
say like below:
SUGGESTION
# This test is successful only if at least the configured delay has elapsed.
Changed.
New patch is available on [1].
[1]: /messages/by-id/TYAPR01MB5866C6BCA4D9386D9C486033F5A59@TYAPR01MB5866.jpnprd01.prod.outlook.com
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Dear Amit,
Thank you for commenting!
8.
+ <para>
+ The delay is effective only when the initial table synchronization
+ has been finished and the publisher decides to send a particular
+ transaction downstream. The delay does not take into account the
+ overhead of time spent in transferring the transaction, which means
+ that the arrival time at the subscriber may be delayed more than the
+ given time.
+ </para>
I'm not sure about this mention about only "effective only when the
initial table synchronization has been finished"... Now that the delay
is pub-side I don't know that it is true anymore.
This will still be true because we don't wait during the initial copy
(sync). The delay happens only when the replication starts.
Maybe this depends on the definition of initial copy and sync.
I checked and added descriptions in [1].
11.
+ errno = 0;
+ parsed = strtoul(strVal(defel->arg), &endptr, 10);
+ if (errno != 0 || *endptr != '\0')
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid min_send_delay")));
+
+ if (parsed > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("min_send_delay \"%s\" out of range",
+ strVal(defel->arg))));
Should the validation be also checking/asserting no negative numbers,
or actually should the min_send_delay be defined as a uint32 in the
first place?
I don't see the need to change the datatype of min_send_delay as
compared to what we have min_apply_delay.
I think it is OK to change "long" to "unsigned long", because
we use strtoul() for reading and should reject negative values.
Of course we can modify them, but I want to keep consistency with the proto_version part.
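For illustration only, here is a minimal standalone sketch (not the patch's code) of the point about negative values. Per the C standard, strtoul() does not fail on a leading minus sign; it negates the converted value in the unsigned type, so "-1" yields ULONG_MAX with errno left unset. One way to reject negative input is an explicit sign check before the conversion. The helper name parse_delay_ms and the constant DELAY_MAX are hypothetical; DELAY_MAX merely stands in for PG_INT32_MAX.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for PG_INT32_MAX. */
#define DELAY_MAX 2147483647UL

/*
 * Hypothetical helper: parse a non-negative delay in milliseconds.
 * Returns true and stores the value in *result on success, false otherwise.
 */
static bool
parse_delay_ms(const char *str, unsigned long *result)
{
	const char *p = str;
	char	   *endptr;
	unsigned long val;

	/* strtoul() skips leading whitespace, so skip it before checking the sign. */
	while (isspace((unsigned char) *p))
		p++;

	/* strtoul() would accept "-1" and return the negated value; reject it here. */
	if (*p == '-')
		return false;

	errno = 0;
	val = strtoul(p, &endptr, 10);

	/* Reject overflow, empty input, and trailing garbage. */
	if (errno != 0 || endptr == p || *endptr != '\0')
		return false;

	/* Reject values that do not fit in a signed 32-bit integer. */
	if (val > DELAY_MAX)
		return false;

	*result = val;
	return true;
}
```

With this sketch, "123" and "2147483647" are accepted, while "-1", "foo", an empty string, and "2147483648" are rejected. Whether the real option parser should carry such a check is a design decision for the patch; the sketch only shows why relying on strtoul() alone does not filter out a minus sign.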
[1]: /messages/by-id/TYAPR01MB5866C6BCA4D9386D9C486033F5A59@TYAPR01MB5866.jpnprd01.prod.outlook.com
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Here are some very minor review comments for the patch v4-0001
======
Commit Message
1.
The other possibility was to apply the delay at the end of the parallel apply
transaction but that would cause issues related to resource bloat and locks being
held for a long time.
~
The reply [1] for review comment #2 says that this was "slightly
reworded", but AFAICT nothing is changed here.
~~~
2.
Eariler versions were written by Euler Taveira, Takamichi Osumi, and
Kuroda Hayato
Typo: "Eariler"
======
doc/src/sgml/ref/create_subscription.sgml
3.
+ <para>
+ By default, the publisher sends changes as soon as possible. This
+ parameter allows the user to delay changes by given time period. If
+ the value is specified without units, it is taken as milliseconds.
+ The default is zero (no delay). See <xref linkend="config-setting-names-values"/>
+ for details on the available valid time units.
+ </para>
"by given time period" --> "by the given time period"
======
src/backend/replication/pgoutput/pgoutput.c
4. parse_output_parameters
+ else if (strcmp(defel->defname, "min_send_delay") == 0)
+ {
+ unsigned long parsed;
+ char *endptr;
I think 'parsed' is a fairly meaningless variable name. How about
calling this variable something useful like 'delay_val' or
'min_send_delay_value', or something like those? Yes, I recognize that
you copied this from some existing code fragment, but IMO that doesn't
make it good.
======
src/backend/replication/walsender.c
5.
+ /* Sleep until we get reply from worker or we time out */
+ WalSndWait(WL_SOCKET_READABLE,
+ Min(timeout_sleeptime_ms, remaining_wait_time_ms),
+ WAIT_EVENT_WALSENDER_SEND_DELAY);
In my previous review [2] comment #14, I questioned if this comment
was correct. It looks like that was accidentally missed.
======
src/include/replication/logical.h
6.
+ /*
+ * The minimum delay, in milliseconds, by the publisher before sending
+ * COMMIT/PREPARE record
+ */
+ int32 min_send_delay;
The comment is missing a period.
------
[1]: Kuroda-san replied to my review v3-0001. /messages/by-id/TYAPR01MB5866C6BCA4D9386D9C486033F5A59@TYAPR01MB5866.jpnprd01.prod.outlook.com
[2]: My previous review v3-0001. /messages/by-id/CAHut+Pu6Y+BkYKg6MYGi2zGnx6imeK4QzxBVhpQoPMeDr1npnQ@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
Dear Peter,
Thank you for reviewing! PSA new version.
1.
The other possibility was to apply the delay at the end of the parallel apply
transaction but that would cause issues related to resource bloat and locks being
held for a long time.
~
The reply [1] for review comment #2 says that this was "slightly
reworded", but AFAICT nothing is changed here.
Oh, my git operation might have been wrong, and the change disappeared.
Sorry for the inconvenience; reworded again.
2.
Eariler versions were written by Euler Taveira, Takamichi Osumi, and
Kuroda Hayato
Typo: "Eariler"
Fixed.
======
doc/src/sgml/ref/create_subscription.sgml
3.
+ <para>
+ By default, the publisher sends changes as soon as possible. This
+ parameter allows the user to delay changes by given time period. If
+ the value is specified without units, it is taken as milliseconds.
+ The default is zero (no delay). See <xref linkend="config-setting-names-values"/>
+ for details on the available valid time units.
+ </para>
"by given time period" --> "by the given time period"
Fixed.
src/backend/replication/pgoutput/pgoutput.c
4. parse_output_parameters
+ else if (strcmp(defel->defname, "min_send_delay") == 0)
+ {
+ unsigned long parsed;
+ char *endptr;
I think 'parsed' is a fairly meaningless variable name. How about
calling this variable something useful like 'delay_val' or
'min_send_delay_value', or something like those? Yes, I recognize that
you copied this from some existing code fragment, but IMO that doesn't
make it good.
OK, changed to 'delay_val'.
======
src/backend/replication/walsender.c
5.
+ /* Sleep until we get reply from worker or we time out */
+ WalSndWait(WL_SOCKET_READABLE,
+ Min(timeout_sleeptime_ms, remaining_wait_time_ms),
+ WAIT_EVENT_WALSENDER_SEND_DELAY);
In my previous review [2] comment #14, I questioned if this comment
was correct. It looks like that was accidentally missed.
Sorry, I missed that. But I think this does not have to be changed.
The important point here is that WalSndWait() is used, not WaitLatch().
According to the comment atop WalSndWait(), the function waits until one of the following events occurs:
- the socket becomes readable or writable
- a timeout occurs
The logical walsender process is always connected to the worker, so the socket becomes readable
when the apply worker sends a feedback message.
That's why I wrote "Sleep until we get reply from worker or we time out".
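As a rough illustration of the "socket readable or timeout" semantics described for comment 5, the following standalone sketch uses plain POSIX poll() as a stand-in. It is not how WalSndWait() is implemented; the function names min_ms and wait_readable_or_timeout are hypothetical, and only the idea of sleeping for the smaller of two timeouts while watching a descriptor is taken from the quoted code.

```c
#include <poll.h>

/* Hypothetical helper: the smaller of two timeouts in milliseconds. */
static int
min_ms(int a, int b)
{
	return (a < b) ? a : b;
}

/*
 * Hypothetical sketch: sleep until fd has data to read or the shorter of
 * the two timeouts expires.
 *
 * Returns 1 if poll() reported an event on fd (readable, or an error or
 * hang-up condition), 0 if the timeout expired first, and -1 if poll()
 * itself failed (including interruption by a signal).
 */
static int
wait_readable_or_timeout(int fd, int timeout_sleeptime_ms,
						 int remaining_wait_time_ms)
{
	struct pollfd pfd;
	int			rc;

	pfd.fd = fd;
	pfd.events = POLLIN;
	pfd.revents = 0;

	rc = poll(&pfd, 1, min_ms(timeout_sleeptime_ms, remaining_wait_time_ms));
	if (rc < 0)
		return -1;

	return (rc > 0) ? 1 : 0;
}
```

A caller in this sketch would loop: on a return of 1 it would read and process the peer's message, and on 0 it would recompute the remaining wait time and decide whether the delay has fully elapsed.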
src/include/replication/logical.h
6.
+ /*
+ * The minimum delay, in milliseconds, by the publisher before sending
+ * COMMIT/PREPARE record
+ */
+ int32 min_send_delay;
The comment is missing a period.
Right, added.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Attachments:
v5-0001-Time-delayed-logical-replication-on-publisher-sid.patch
From 6d91d01aa254dd0b4103addf6d7773fefaed2b8f Mon Sep 17 00:00:00 2001
From: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Date: Thu, 16 Feb 2023 07:52:23 +0000
Subject: [PATCH v5] Time-delayed logical replication on publisher side
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_send_delay'.
If the subscription sets min_send_delay parameter, the apply worker
(via walrcv_startstreaming) passes the value to the publisher as an output plugin
option. The walsender will delay the transaction sending for given milliseconds.
The delay does not take into account the overhead of time spent in transferring
the transaction, which means that the arrival time at the subscriber may be
delayed more than the given time.
The combination of parallel streaming mode and min_send_delay is not allowed.
This is because in parallel streaming mode, we start applying the transaction
stream as soon as the first change arrives without knowing the transaction's
prepare/commit time. Always waiting for the full 'min_send_delay' period might
include unnecessary delay.
The other possibility was to wait sending COMMIT/PREPARE message at the end of
the parallel apply transaction but that would cause issues related to resource
bloat and locks being held for a long time.
The delay occurs before we start to send the transaction on the publisher.
Regular and prepared transactions are covered. Streamed transactions are also
covered.
Earlier versions were written by Euler Taveira, Takamichi Osumi, and Kuroda Hayato
Author: Kuroda Hayato, Takamichi Osumi
Reviewed-by: Amit Kapila, Peter Smith, Vignesh C, Shveta Malik,
Kyotaro Horiguchi, Shi Yu, Wang Wei, Dilip Kumar, Melih Mutlu,
Andres Freund
---
doc/src/sgml/catalogs.sgml | 10 +
doc/src/sgml/glossary.sgml | 15 ++
doc/src/sgml/logical-replication.sgml | 6 +
doc/src/sgml/monitoring.sgml | 5 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 43 +++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 119 ++++++++++-
.../libpqwalreceiver/libpqwalreceiver.c | 5 +
src/backend/replication/logical/decode.c | 18 ++
src/backend/replication/logical/logical.c | 18 +-
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/worker.c | 12 +-
src/backend/replication/pgoutput/pgoutput.c | 36 ++++
src/backend/replication/slotfuncs.c | 4 +-
src/backend/replication/walsender.c | 78 +++++++-
src/backend/utils/activity/wait_event.c | 3 +
src/bin/pg_dump/pg_dump.c | 15 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/replication/logical.h | 17 +-
src/include/replication/logicalproto.h | 6 +-
src/include/replication/pgoutput.h | 1 +
src/include/replication/walreceiver.h | 1 +
src/include/utils/wait_event.h | 3 +-
src/test/regress/expected/subscription.out | 185 +++++++++++-------
src/test/regress/sql/subscription.sql | 29 +++
src/test/subscription/t/001_rep_changes.pl | 27 +++
31 files changed, 583 insertions(+), 105 deletions(-)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..ee4c3c77e6 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7892,6 +7892,16 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminsenddelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum delay, in milliseconds, by the publisher before sending
+ COMMIT record
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subenabled</structfield> <type>bool</type>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 7c01a541fe..9ede9d05f6 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1729,6 +1729,21 @@
</glossdef>
</glossentry>
+ <glossentry id="glossary-time-delayed-replication">
+ <glossterm>Time-delayed replication</glossterm>
+ <glossdef>
+ <para>
+ Replication setup that delays the application of changes by a specified
+ minimum time-delay period.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="guc-recovery-min-apply-delay"/> for physical replication
+ and <xref linkend="sql-createsubscription"/> for logical replication.
+ </para>
+ </glossdef>
+ </glossentry>
+
<glossentry id="glossary-toast">
<glossterm>TOAST</glossterm>
<glossdef>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 1bd5660c87..fe9e7f7b26 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,12 @@
target table.
</para>
+ <para>
+ A subscription can delay the receipt of changes by specifying the
+ <literal>min_send_delay</literal> subscription parameter. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index b0b997f092..6158587644 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -2349,6 +2349,11 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
<entry>Waiting to acquire an exclusive lock to truncate off any
empty pages at the end of a table vacuumed.</entry>
</row>
+ <row>
+ <entry><literal>WalSenderSendDelay</literal></entry>
+ <entry>Waiting while sending changes for time-delayed logical replication
+ in the WAL sender process.</entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 964fcbb8ff..3f238b958b 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_send_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 51c45f17c7..38fd89368a 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,43 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_send_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the publisher sends changes as soon as possible. This
+ parameter allows the user to delay changes by the given time period.
+ If the value is specified without units, it is taken as milliseconds.
+ The default is zero (no delay). See <xref linkend="config-setting-names-values"/>
+ for details on the available valid time units.
+ </para>
+ <para>
+ The delay is effective only when the initial table synchronization
+ has been finished. However, there is a possibility that the table
+ status written in <link linkend="catalog-pg-subscription-rel"><structname>pg_subscription_rel</structname></link>
+ will be delayed in getting to "ready" state, and also two-phase
+ (if specified) will be delayed in getting to "enabled".
+ </para>
+ <para>
+ The delay does not take into account the overhead of time spent
+ transferring the transaction. Therefore, the arrival time at the
+ subscriber may be delayed more than the specified
+ <literal>min_send_delay</literal> time.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication means there is a much longer time between
+ making a change on the publisher, and that change being committed
+ on the subscriber. This can impact the performance of synchronous
+ replication. See the <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -420,6 +456,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_send_delay</literal> parameter is not allowed when
+ streaming in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..63a10b06d1 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -66,6 +66,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->skiplsn = subform->subskiplsn;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
+ sub->minsenddelay = subform->subminsenddelay;
sub->enabled = subform->subenabled;
sub->binary = subform->subbinary;
sub->stream = subform->substream;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 34ca0e739f..54a705d71b 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1314,9 +1314,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subminsenddelay,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 464db6d247..4a8cd47171 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_SEND_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int32 min_send_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -100,7 +102,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int32 defGetMinSendDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_SEND_DELAY))
+ opts->min_send_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_SEND_DELAY) &&
+ strcmp(defel->defname, "min_send_delay") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_SEND_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_SEND_DELAY;
+ opts->min_send_delay = defGetMinSendDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +417,29 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_send_delay is not
+ * allowed. This is because in parallel streaming mode, the walsender
+ * starts sending the transaction stream without knowing the prepare/commit
+ * time of the transaction. Always waiting for the full 'min_send_delay'
+ * time to send may introduce unnecessary delay.
+ *
+ * The other possibility was to delay sending the COMMIT record of the
+ * parallel apply transaction, but that would cause issues related to
+ * resource bloat and locks being held for a long time.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_SEND_DELAY) &&
+ opts->min_send_delay > 0 &&
+ opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ /*
+ * translator: the first %s is a string of the form "parameter > 0"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_send_delay > 0", "streaming = parallel"));
}
/*
@@ -560,7 +596,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_SEND_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -628,6 +665,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
+ values[Anum_pg_subscription_subminsenddelay - 1] = Int32GetDatum(opts.min_send_delay);
values[Anum_pg_subscription_subenabled - 1] = BoolGetDatum(opts.enabled);
values[Anum_pg_subscription_subbinary - 1] = BoolGetDatum(opts.binary);
values[Anum_pg_subscription_substream - 1] = CharGetDatum(opts.streaming);
@@ -1054,7 +1092,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_SEND_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1136,19 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_send_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
+ !IsSet(opts.specified_opts, SUBOPT_MIN_SEND_DELAY) &&
+ sub->minsenddelay > 0)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set parallel streaming mode for subscription with %s",
+ "min_send_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1162,26 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_SEND_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_send_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.min_send_delay > 0 &&
+ !IsSet(opts.specified_opts, SUBOPT_STREAMING) &&
+ sub->stream == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set %s for subscription in parallel streaming mode",
+ "min_send_delay"));
+
+ values[Anum_pg_subscription_subminsenddelay - 1] =
+ Int32GetDatum(opts.min_send_delay);
+ replaces[Anum_pg_subscription_subminsenddelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2195,3 +2266,45 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+/*
+ * Extract the min_send_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_send_delay
+ * accepts the same parameter format as recovery_min_apply_delay.
+ */
+static int32
+defGetMinSendDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse the given string as a parameter whose base unit is milliseconds.
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_send_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check the lower bound of the valid min_send_delay range, and also the
+ * upper bound as a safeguard for platforms where int is wider than
+ * int32. Although parse_int() has confirmed that the result is less
+ * than or equal to INT_MAX, the value will be stored in an int32
+ * catalog column.
+ */
+ if (result < 0 || result > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_send_delay",
+ 0, PG_INT32_MAX)));
+
+ return result;
+}
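To make the parse-then-range-check step in defGetMinSendDelay() easier to reason about in isolation, here is a rough standalone sketch. It is not PostgreSQL's parse_int(); sketch_parse_delay_ms() is a hypothetical name, and the accepted syntax (an integer followed by an optional ms, s, min, h or d unit) is a simplification assumed only for illustration. The point it shows is that the upper bound has to be checked before the unit multiplication, so the value stored in the int32 column cannot overflow.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for the parse-and-range-check step; this is NOT
 * PostgreSQL's parse_int().  Accepts "<integer>[ ]<unit>" where unit is one
 * of ms, s, min, h, d; a bare number is taken as milliseconds.
 * Returns true and stores the delay in milliseconds in *result_ms.
 */
static bool
sketch_parse_delay_ms(const char *input, int32_t *result_ms)
{
    char       *end;
    long long   value;
    long long   multiplier;

    errno = 0;
    value = strtoll(input, &end, 10);
    if (end == input || errno != 0)
        return false;           /* no digits at all, or strtoll overflow */

    while (*end == ' ')
        end++;                  /* allow spaces between number and unit */

    if (*end == '\0' || strcmp(end, "ms") == 0)
        multiplier = 1;
    else if (strcmp(end, "s") == 0)
        multiplier = 1000;
    else if (strcmp(end, "min") == 0)
        multiplier = 60 * 1000;
    else if (strcmp(end, "h") == 0)
        multiplier = 60 * 60 * 1000;
    else if (strcmp(end, "d") == 0)
        multiplier = 24 * 60 * 60 * 1000;
    else
        return false;           /* unknown unit */

    /* Reject negatives and anything that would not fit in int32 after scaling. */
    if (value < 0 || value > INT32_MAX / multiplier)
        return false;

    *result_ms = (int32_t) (value * multiplier);
    return true;
}
```

With this sketch, "8h" (the value from the original request in this thread) comes out as 28800000 ms, while "-1" and "25d" are rejected as out of range.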
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 560ec974fa..89a72c1abe 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -443,6 +443,11 @@ libpqrcv_startstreaming(WalReceiverConn *conn,
PQserverVersion(conn->streamConn) >= 140000)
appendStringInfoString(&cmd, ", binary 'true'");
+ if (options->proto.logical.min_send_delay > 0 &&
+ PQserverVersion(conn->streamConn) >= 160000)
+ appendStringInfo(&cmd, ", min_send_delay '%d'",
+ options->proto.logical.min_send_delay);
+
appendStringInfoChar(&cmd, ')');
}
else
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index a53e23c679..2674444894 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -679,6 +679,15 @@ DecodeCommit(LogicalDecodingContext *ctx, XLogRecordBuffer *buf,
}
else
{
+ /*
+ * Delay sending the changes if required. For streaming transactions,
+ * this means a delay in sending the last stream but that is OK because
+ * on the downstream the changes will be applied only after receiving
+ * the last stream.
+ */
+ if (ctx->min_send_delay > 0 && ctx->delay_send)
+ ctx->delay_send(ctx, commit_time);
+
ReorderBufferCommit(ctx->reorder, xid, buf->origptr, buf->endptr,
commit_time, origin_id, origin_lsn);
}
@@ -763,6 +772,15 @@ DecodePrepare(LogicalDecodingContext *ctx, XLogRecordBuffer *buf,
buf->origptr, buf->endptr);
}
+ /*
+ * Delay sending the changes if required. For streaming transactions,
+ * this means a delay in sending the last stream but that is OK because
+ * on the downstream the changes will be applied only after receiving
+ * the last stream.
+ */
+ if (ctx->min_send_delay > 0 && ctx->delay_send)
+ ctx->delay_send(ctx, prepare_time);
+
/* replay actions of all transaction + subtransactions in order */
ReorderBufferPrepare(ctx->reorder, xid, parsed->twophase_gid);
diff --git a/src/backend/replication/logical/logical.c b/src/backend/replication/logical/logical.c
index c3ec97a0a6..e4dd822cdc 100644
--- a/src/backend/replication/logical/logical.c
+++ b/src/backend/replication/logical/logical.c
@@ -156,7 +156,8 @@ StartupDecodingContext(List *output_plugin_options,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress)
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send)
{
ReplicationSlot *slot;
MemoryContext context,
@@ -293,6 +294,7 @@ StartupDecodingContext(List *output_plugin_options,
ctx->prepare_write = prepare_write;
ctx->write = do_write;
ctx->update_progress = update_progress;
+ ctx->delay_send = delay_send;
ctx->output_plugin_options = output_plugin_options;
@@ -316,7 +318,7 @@ StartupDecodingContext(List *output_plugin_options,
* marking WAL reserved beforehand. In that scenario, it's up to the
* caller to guarantee that WAL remains available.
* xl_routine -- XLogReaderRoutine for underlying XLogReader
- * prepare_write, do_write, update_progress --
+ * prepare_write, do_write, update_progress, delay_send --
* callbacks that perform the use-case dependent, actual, work.
*
* Needs to be called while in a memory context that's at least as long lived
@@ -334,7 +336,8 @@ CreateInitDecodingContext(const char *plugin,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress)
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send)
{
TransactionId xmin_horizon = InvalidTransactionId;
ReplicationSlot *slot;
@@ -435,7 +438,7 @@ CreateInitDecodingContext(const char *plugin,
ctx = StartupDecodingContext(NIL, restart_lsn, xmin_horizon,
need_full_snapshot, false,
xl_routine, prepare_write, do_write,
- update_progress);
+ update_progress, delay_send);
/* call output plugin initialization callback */
old_context = MemoryContextSwitchTo(ctx->context);
@@ -475,7 +478,7 @@ CreateInitDecodingContext(const char *plugin,
* xl_routine
* XLogReaderRoutine used by underlying xlogreader
*
- * prepare_write, do_write, update_progress
+ * prepare_write, do_write, update_progress, delay_send
* callbacks that have to be filled to perform the use-case dependent,
* actual work.
*
@@ -493,7 +496,8 @@ CreateDecodingContext(XLogRecPtr start_lsn,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress)
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send)
{
LogicalDecodingContext *ctx;
ReplicationSlot *slot;
@@ -547,7 +551,7 @@ CreateDecodingContext(XLogRecPtr start_lsn,
ctx = StartupDecodingContext(output_plugin_options,
start_lsn, InvalidTransactionId, false,
fast_forward, xl_routine, prepare_write,
- do_write, update_progress);
+ do_write, update_progress, delay_send);
/* call output plugin initialization callback */
old_context = MemoryContextSwitchTo(ctx->context);
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index fa1b641a2b..960025197f 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -212,7 +212,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
.segment_open = wal_segment_open,
.segment_close = wal_segment_close),
LogicalOutputPrepareWrite,
- LogicalOutputWrite, NULL);
+ LogicalOutputWrite, NULL, NULL);
/*
* After the sanity checks in CreateDecodingContext, make sure the
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index cfb2ab6248..e68902ae34 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3898,7 +3898,8 @@ maybe_reread_subscription(void)
newsub->stream != MySubscription->stream ||
strcmp(newsub->origin, MySubscription->origin) != 0 ||
newsub->owner != MySubscription->owner ||
- !equal(newsub->publications, MySubscription->publications))
+ !equal(newsub->publications, MySubscription->publications) ||
+ newsub->minsenddelay != MySubscription->minsenddelay)
{
if (am_parallel_apply_worker())
ereport(LOG,
@@ -4617,9 +4618,18 @@ ApplyWorkerMain(Datum main_arg)
options.proto.logical.twophase = false;
options.proto.logical.origin = pstrdup(MySubscription->origin);
+ options.proto.logical.min_send_delay = 0;
if (!am_tablesync_worker())
{
+ /*
+ * Time-delayed logical replication does not support tablesync
+ * workers, so only the leader apply worker can request walsenders to
+ * delay on the publisher side.
+ */
+ if (server_version >= 160000 && MySubscription->minsenddelay > 0)
+ options.proto.logical.min_send_delay = MySubscription->minsenddelay;
+
/*
* Even when the two_phase mode is requested by the user, it remains
* as the tri-state PENDING until all tablesyncs have reached READY
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 98377c094b..df6a87b0ba 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -285,6 +285,7 @@ parse_output_parameters(List *options, PGOutputData *data)
bool streaming_given = false;
bool two_phase_option_given = false;
bool origin_option_given = false;
+ bool min_send_delay_option_given = false;
data->binary = false;
data->streaming = LOGICALREP_STREAM_OFF;
@@ -396,6 +397,32 @@ parse_output_parameters(List *options, PGOutputData *data)
errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("unrecognized origin value: \"%s\"", data->origin));
}
+ else if (strcmp(defel->defname, "min_send_delay") == 0)
+ {
+ unsigned long delay_val;
+ char *endptr;
+
+ if (min_send_delay_option_given)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("conflicting or redundant options"));
+ min_send_delay_option_given = true;
+
+ errno = 0;
+ delay_val = strtoul(strVal(defel->arg), &endptr, 10);
+ if (errno != 0 || *endptr != '\0')
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid min_send_delay")));
+
+ if (delay_val > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("min_send_delay \"%s\" out of range",
+ strVal(defel->arg))));
+
+ data->min_send_delay = (int32) delay_val;
+ }
else
elog(ERROR, "unrecognized pgoutput option: %s", defel->defname);
}
@@ -501,6 +528,15 @@ pgoutput_startup(LogicalDecodingContext *ctx, OutputPluginOptions *opt,
else
ctx->twophase_opt_given = true;
+ if (data->min_send_delay &&
+ data->protocol_version < LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("requested proto_version=%d does not support delay sending data, need %d or higher",
+ data->protocol_version, LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM)));
+ else
+ ctx->min_send_delay = data->min_send_delay;
+
/* Init publication state. */
data->publications = NIL;
publications_valid = false;
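One thing that may be worth double-checking in the pgoutput option parsing above: strtoul() by itself accepts an empty string (it returns 0 and leaves endptr at the start) as well as a leading sign or whitespace. The apply worker only ever sends a positive %d here, so this probably does not matter in practice, but other clients of the replication protocol could pass anything. A stricter variant could look roughly like the following; sketch_parse_uint31() is a hypothetical name and this is only a standalone illustration, not proposed patch text.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical strict parser: accepts only plain decimal digits that fit
 * in a non-negative int32.  Not part of the patch.
 */
static bool
sketch_parse_uint31(const char *str, int32_t *out)
{
    char          *endptr;
    unsigned long  val;

    /* Require a leading digit: rules out "", whitespace, '+' and '-'. */
    if (!isdigit((unsigned char) str[0]))
        return false;

    errno = 0;
    val = strtoul(str, &endptr, 10);
    if (errno != 0 || *endptr != '\0')
        return false;           /* overflow, or trailing garbage */

    if (val > INT32_MAX)
        return false;           /* would not fit in the int32 field */

    *out = (int32_t) val;
    return true;
}
```

The leading-digit test is the only real difference from the parsing in the hunk; the errno, endptr and range checks mirror what the patch already does.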
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 2f3c964824..522f7600a1 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -148,7 +148,7 @@ create_logical_replication_slot(char *name, char *plugin,
XL_ROUTINE(.page_read = read_local_xlog_page,
.segment_open = wal_segment_open,
.segment_close = wal_segment_close),
- NULL, NULL, NULL);
+ NULL, NULL, NULL, NULL);
/*
* If caller needs us to determine the decoding start point, do so now.
@@ -481,7 +481,7 @@ pg_logical_replication_slot_advance(XLogRecPtr moveto)
XL_ROUTINE(.page_read = read_local_xlog_page,
.segment_open = wal_segment_open,
.segment_close = wal_segment_close),
- NULL, NULL, NULL);
+ NULL, NULL, NULL, NULL);
/*
* Start reading at the slot's restart_lsn, which we know to point to
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 75e8363e24..9537fba7df 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -252,6 +252,7 @@ static void WalSndPrepareWrite(LogicalDecodingContext *ctx, XLogRecPtr lsn, Tran
static void WalSndWriteData(LogicalDecodingContext *ctx, XLogRecPtr lsn, TransactionId xid, bool last_write);
static void WalSndUpdateProgress(LogicalDecodingContext *ctx, XLogRecPtr lsn, TransactionId xid,
bool skipped_xact);
+static void WalSndDelay(LogicalDecodingContext *ctx, TimestampTz delay_start);
static XLogRecPtr WalSndWaitForWal(XLogRecPtr loc);
static void LagTrackerWrite(XLogRecPtr lsn, TimestampTz local_flush_time);
static TimeOffset LagTrackerRead(int head, XLogRecPtr lsn, TimestampTz now);
@@ -1126,7 +1127,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
.segment_open = WalSndSegmentOpen,
.segment_close = wal_segment_close),
WalSndPrepareWrite, WalSndWriteData,
- WalSndUpdateProgress);
+ WalSndUpdateProgress, WalSndDelay);
/*
* Signal that we don't need the timeout mechanism. We're just
@@ -1285,7 +1286,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
.segment_open = WalSndSegmentOpen,
.segment_close = wal_segment_close),
WalSndPrepareWrite, WalSndWriteData,
- WalSndUpdateProgress);
+ WalSndUpdateProgress, WalSndDelay);
xlogreader = logical_decoding_ctx->reader;
WalSndSetState(WALSNDSTATE_CATCHUP);
@@ -3849,3 +3850,76 @@ LagTrackerRead(int head, XLogRecPtr lsn, TimestampTz now)
Assert(time != 0);
return now - time;
}
+
+/*
+ * LogicalDecodingContext 'delay' callback.
+ *
+ * Wait until at least ctx->min_send_delay milliseconds have elapsed since
+ * delay_start, so that the transaction is applied at least that far behind
+ * the publisher.
+ */
+static void
+WalSndDelay(LogicalDecodingContext *ctx, TimestampTz delay_start)
+{
+ /* Wait until delayUntil using the latch mechanism */
+ while (true)
+ {
+ TimestampTz delayUntil;
+ long remaining_wait_time_ms;
+ long timeout_sleeptime_ms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* This might change wal_sender_timeout */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /* Check for input from the client */
+ ProcessRepliesIfAny();
+
+ /* die if timeout was reached */
+ WalSndCheckTimeOut();
+
+ /* Send keepalive if the time has come */
+ WalSndKeepaliveIfNecessary();
+
+ /* Try to flush pending output to the client */
+ if (pq_flush_if_writable() != 0)
+ WalSndShutdown();
+
+ /*
+ * If we've requested to shut down, exit the process.
+ *
+ * Note that WalSndDone() cannot be used here because that function
+ * would send the changes that are being delayed.
+ */
+ if (got_STOPPING)
+ WalSndShutdown();
+
+ delayUntil = TimestampTzPlusMilliseconds(delay_start, ctx->min_send_delay);
+ remaining_wait_time_ms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delayUntil);
+
+ /*
+ * Exit without arming the latch if it's already past time to send
+ * this transaction.
+ */
+ if (remaining_wait_time_ms <= 0)
+ break;
+
+ /* Sleep until appropriate time. */
+ timeout_sleeptime_ms = WalSndComputeSleeptime(GetCurrentTimestamp());
+
+ elog(DEBUG2, "time-delayed replication for txid %u, delay_time = %d ms, remaining wait time: %ld ms",
+ ctx->write_xid, (int) ctx->min_send_delay,
+ remaining_wait_time_ms);
+
+ /* Sleep until we get a reply from the worker or we time out */
+ WalSndWait(WL_SOCKET_READABLE,
+ Min(timeout_sleeptime_ms, remaining_wait_time_ms),
+ WAIT_EVENT_WALSENDER_SEND_DELAY);
+ }
+}
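The timing arithmetic inside the WalSndDelay() loop can be looked at separately from the latch and keepalive handling. Below is a minimal standalone sketch, assuming timestamps are 64-bit microsecond counts (as TimestampTz is) and that the millisecond difference is clamped at zero and rounded up; the function names are hypothetical and the rounding behavior is an assumption of the sketch, not something taken from the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper: how many milliseconds are left until
 * delay_start + min_send_delay, given the current time.  All timestamps
 * are in microseconds.  Returns 0 once the deadline has passed.
 */
static int64_t
sketch_remaining_wait_ms(int64_t delay_start_us, int32_t min_send_delay_ms,
                         int64_t now_us)
{
    int64_t delay_until_us = delay_start_us + (int64_t) min_send_delay_ms * 1000;
    int64_t diff_us = delay_until_us - now_us;

    if (diff_us <= 0)
        return 0;               /* already past time to send */

    return (diff_us + 999) / 1000;  /* round up to whole milliseconds */
}

/*
 * Hypothetical helper: never sleep longer than the walsender's own
 * timeout-driven sleep time, so keepalives and timeout checks still run.
 */
static int64_t
sketch_next_sleep_ms(int64_t remaining_ms, int64_t timeout_sleep_ms)
{
    return remaining_ms < timeout_sleep_ms ? remaining_ms : timeout_sleep_ms;
}
```

Capping each sleep at the timeout-derived value is what lets the loop keep answering the subscriber and sending keepalives during a long delay such as 8h, at the cost of waking up more often than strictly needed.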
diff --git a/src/backend/utils/activity/wait_event.c b/src/backend/utils/activity/wait_event.c
index cb99cc6339..76c19fe11d 100644
--- a/src/backend/utils/activity/wait_event.c
+++ b/src/backend/utils/activity/wait_event.c
@@ -515,6 +515,9 @@ pgstat_get_wait_timeout(WaitEventTimeout w)
case WAIT_EVENT_VACUUM_TRUNCATE:
event_name = "VacuumTruncate";
break;
+ case WAIT_EVENT_WALSENDER_SEND_DELAY:
+ event_name = "WalSenderSendDelay";
+ break;
/* no default case, so that compiler will warn */
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..bd95747840 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminsenddelay;
int i,
ntups;
@@ -4546,9 +4547,13 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminsenddelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n"
+ " 0 AS subminsenddelay\n",
+ LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4581,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminsenddelay = PQfnumber(res, "subminsenddelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4612,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminsenddelay =
+ atoi(PQgetvalue(res, i, i_subminsenddelay));
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4687,6 +4695,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminsenddelay > 0)
+ appendPQExpBuffer(query, ", min_send_delay = '%d ms'", subinfo->subminsenddelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e7cbd8d7ed..24e0f6737f 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -661,6 +661,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int subminsenddelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..c7d303a168 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_send_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminsenddelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min send delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e1882eaea..6643db6f55 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_send_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_send_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..69ae4314b4 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -74,6 +74,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
+ int32 subminsenddelay; /* Replication send delay (ms) */
+
bool subenabled; /* True if the subscription is enabled (the
* worker should be running) */
@@ -122,6 +124,7 @@ typedef struct Subscription
* skipped */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
+ int32 minsenddelay; /* Replication send delay (ms) */
bool enabled; /* Indicates if the subscription is enabled */
bool binary; /* Indicates if the subscription wants data in
* binary format */
diff --git a/src/include/replication/logical.h b/src/include/replication/logical.h
index 5f49554ea0..21a324d69e 100644
--- a/src/include/replication/logical.h
+++ b/src/include/replication/logical.h
@@ -30,6 +30,10 @@ typedef void (*LogicalOutputPluginWriterUpdateProgress) (struct LogicalDecodingC
bool skipped_xact
);
+typedef void (*LogicalOutputPluginWriterDelay) (struct LogicalDecodingContext *lr,
+ TimestampTz start_time
+);
+
typedef struct LogicalDecodingContext
{
/* memory context this is all allocated in */
@@ -64,6 +68,7 @@ typedef struct LogicalDecodingContext
LogicalOutputPluginWriterPrepareWrite prepare_write;
LogicalOutputPluginWriterWrite write;
LogicalOutputPluginWriterUpdateProgress update_progress;
+ LogicalOutputPluginWriterDelay delay_send;
/*
* Output buffer.
@@ -100,6 +105,12 @@ typedef struct LogicalDecodingContext
*/
bool twophase_opt_given;
+ /*
+ * The minimum delay, in milliseconds, that the publisher waits before
+ * sending a COMMIT/PREPARE record.
+ */
+ int32 min_send_delay;
+
/*
* State for writing output.
*/
@@ -121,14 +132,16 @@ extern LogicalDecodingContext *CreateInitDecodingContext(const char *plugin,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress);
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send);
extern LogicalDecodingContext *CreateDecodingContext(XLogRecPtr start_lsn,
List *output_plugin_options,
bool fast_forward,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress);
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send);
extern void DecodingContextFindStartpoint(LogicalDecodingContext *ctx);
extern bool DecodingContextReady(LogicalDecodingContext *ctx);
extern void FreeDecodingContext(LogicalDecodingContext *ctx);
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 0ea2df5088..46faadbd7a 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -36,13 +36,17 @@
* LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM is the minimum protocol version
* where we support applying large streaming transactions in parallel.
* Introduced in PG16.
+ *
+ * LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM is the minimum protocol version
+ * with support for delaying the sending of transactions. Introduced in PG16.
*/
#define LOGICALREP_PROTO_MIN_VERSION_NUM 1
#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
#define LOGICALREP_PROTO_TWOPHASE_VERSION_NUM 3
#define LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM 4
-#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM
+#define LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM 4
+#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM
/*
* Logical message types
diff --git a/src/include/replication/pgoutput.h b/src/include/replication/pgoutput.h
index b4a8015403..d2fde09e00 100644
--- a/src/include/replication/pgoutput.h
+++ b/src/include/replication/pgoutput.h
@@ -30,6 +30,7 @@ typedef struct PGOutputData
bool messages;
bool two_phase;
char *origin;
+ int32 min_send_delay;
} PGOutputData;
#endif /* PGOUTPUT_H */
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index decffe352d..c20969aed7 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -187,6 +187,7 @@ typedef struct
* prepare time */
char *origin; /* Only publish data originating from the
* specified origin */
+ int32 min_send_delay; /* The minimum send delay */
} logical;
} proto;
} WalRcvStreamOptions;
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index 9ab23e1c4a..cc3a234eba 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -150,7 +150,8 @@ typedef enum
WAIT_EVENT_REGISTER_SYNC_REQUEST,
WAIT_EVENT_SPIN_DELAY,
WAIT_EVENT_VACUUM_DELAY,
- WAIT_EVENT_VACUUM_TRUNCATE
+ WAIT_EVENT_VACUUM_TRUNCATE,
+ WAIT_EVENT_WALSENDER_SEND_DELAY
} WaitEventTimeout;
/* ----------
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 3f99b14394..2027316233 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -143,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -163,10 +163,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -175,10 +175,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -210,10 +210,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -247,19 +247,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -271,27 +271,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -306,10 +306,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -324,10 +324,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -363,10 +363,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -375,10 +375,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,10 +388,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -404,20 +404,61 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_send_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = foo);
+ERROR: invalid value for parameter "min_send_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_send_delay" (0 .. 2147483647)
+-- fail - specifying streaming = parallel with time-delayed replication is not
+-- supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_send_delay = 123);
+ERROR: min_send_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_send_delay value without units is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexit | 0/0
+(1 row)
+
+-- success -- min_send_delay value with units other than ms is converted to ms
+-- and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexit | 0/0
(1 row)
+-- fail - alter subscription with streaming = parallel should fail when
+-- time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot set parallel streaming mode for subscription with min_send_delay
+-- fail - alter subscription with min_send_delay should fail when
+-- streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = 123);
+ERROR: cannot set min_send_delay for subscription in parallel streaming mode
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7281f5fee2..46bf4a27d9 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -286,6 +286,35 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_send_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = -1);
+
+-- fail - specifying streaming = parallel with time-delayed replication is not
+-- supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_send_delay = 123);
+
+-- success -- min_send_delay value without units is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = 123);
+\dRs+
+
+-- success -- min_send_delay value with units other than ms is converted to ms
+-- and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = '1 d');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when
+-- time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+
+-- fail - alter subscription with min_send_delay should fail when
+-- streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = 123);
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index 91aa068c95..063a98fde9 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -515,6 +515,33 @@ $node_publisher->poll_query_until('postgres',
or die
"Timed out while waiting for apply to restart after renaming SUBSCRIPTION";
+# Test time-delayed logical replication
+#
+# If the subscription sets min_send_delay parameter, the walsender will delay
+# the transaction send for min_send_delay milliseconds. We verify this by
+# looking at the time difference between a) when tuples are inserted on the
+# publisher, and b) when those changes are replicated on the subscriber. Even
+# on slow machines, this strategy will give predictable behavior.
+
+# Set min_send_delay parameter to 3 seconds
+my $delay = 3;
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_renamed SET (min_send_delay = '${delay}s')");
+
+# Before doing the insertion, get the current timestamp that will be
+# used as a comparison base.
+my $publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_ins VALUES (generate_series(1101, 1120))");
+
+# The publisher waits for the replication to complete
+$node_publisher->wait_for_catchup('tap_sub_renamed');
+
+# This test is successful only if at least the configured delay has elapsed.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for non-streaming transaction"
+);
+
# check all the cleanup
$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_renamed");
--
2.27.0
On Tue, Feb 21, 2023 at 1:28 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
doc/src/sgml/catalogs.sgml
4.
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminsenddelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum delay, in milliseconds, by the publisher to send changes
+ </para></entry>
+ </row>
"by the publisher to send changes" --> "by the publisher before sending changes"
As Amit said [1], the delay can also happen after some changes have already been
sent. So I changed it to "before sending COMMIT record". What do you think?
I think it would be better to say: "The minimum delay, in
milliseconds, by the publisher before sending all the changes". If you
agree, then a similar change is required in the comment below as well:
+ /*
+ * The minimum delay, in milliseconds, by the publisher before sending
+ * COMMIT/PREPARE record.
+ */
+ int32 min_send_delay;
+
src/backend/replication/pgoutput/pgoutput.c
11.
+ errno = 0;
+ parsed = strtoul(strVal(defel->arg), &endptr, 10);
+ if (errno != 0 || *endptr != '\0')
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid min_send_delay")));
+
+ if (parsed > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("min_send_delay \"%s\" out of range",
+ strVal(defel->arg))));
Should the validation also be checking/asserting no negative numbers,
or should min_send_delay actually be defined as a uint32 in the
first place?
I think you are right, because min_apply_delay does not have related code.
We must also consider the possibility that a user sends START_REPLICATION
by hand with a negative value.
Fixed.
Your reasoning for adding the additional check seems good to me but I
don't see it in the patch. The check as I see is as below:
+ if (delay_val > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("min_send_delay \"%s\" out of range",
+ strVal(defel->arg))));
Am I missing something, and the new check is at some other place?
+ has been finished. However, there is a possibility that the table
+ status written in <link
linkend="catalog-pg-subscription-rel"><structname>pg_subscription_rel</structname></link>
+ will be delayed in getting to "ready" state, and also two-phase
+ (if specified) will be delayed in getting to "enabled".
+ </para>
There appears to be a special value <0x00> after "ready". I think that
is added by mistake or probably you have used some editor which has
added this value. Can we slightly reword this to: "However, there is a
possibility that the table status updated in <link
linkend="catalog-pg-subscription-rel"><structname>pg_subscription_rel</structname></link>
could be delayed in getting to the "ready" state, and also two-phase
(if specified) could be delayed in getting to "enabled"."?
--
With Regards,
Amit Kapila.
Dear Amit,
Thank you for reviewing! PSA new version.
I think it would be better to say: "The minimum delay, in
milliseconds, by the publisher before sending all the changes". If you
agree, then a similar change is required in the comment below as well:
+ /*
+ * The minimum delay, in milliseconds, by the publisher before sending
+ * COMMIT/PREPARE record.
+ */
+ int32 min_send_delay;
OK, both of them were fixed.
Should the validation also be checking/asserting no negative numbers,
or should min_send_delay actually be defined as a uint32 in the
first place?
I think you are right, because min_apply_delay does not have related code.
We must also consider the possibility that a user sends START_REPLICATION
by hand with a negative value.
Fixed.
Your reasoning for adding the additional check seems good to me but I
don't see it in the patch. The check as I see is as below:
+ if (delay_val > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("min_send_delay \"%s\" out of range",
+ strVal(defel->arg))));
Am I missing something, and the new check is at some other place?
The value is extracted from the string with strtoul(), and that is the
important point here.
```
delay_val = strtoul(strVal(defel->arg), &endptr, 10);
```
If the user specifies min_send_delay as '-1', the value is read as the bit pattern
'0xFFFFFFFFFFFFFFFF' and interpreted as PG_UINT64_MAX. Such a strange value is
then rejected by the part you copied. I have tested this case and it was
correctly rejected.
```
postgres=# START_REPLICATION SLOT "sub" LOGICAL 0/0 (min_send_delay '-1');
ERROR: min_send_delay "-1" out of range
CONTEXT: slot "sub", output plugin "pgoutput", in the startup callback
```
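The behaviour described above can be reproduced outside the server. What follows is only a minimal standalone sketch, not the patch's code: `parse_delay_ms` and the `DELAY_*` result codes are hypothetical names made up for illustration, and `INT32_MAX` stands in for `PG_INT32_MAX`. It mirrors the two checks done in parse_output_parameters() and shows why '-1' is caught by the upper-bound check alone.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical result codes for this sketch; not PostgreSQL identifiers. */
enum delay_parse_result
{
	DELAY_OK = 0,
	DELAY_INVALID,
	DELAY_OUT_OF_RANGE
};

/*
 * Parse a decimal delay in milliseconds the same way the patch does:
 * strtoul() first, then a single upper-bound check against INT32_MAX.
 */
enum delay_parse_result
parse_delay_ms(const char *str, int32_t *delay_ms)
{
	unsigned long val;
	char	   *endptr;

	errno = 0;
	val = strtoul(str, &endptr, 10);

	/* Conversion error or trailing garbage: "invalid min_send_delay". */
	if (errno != 0 || *endptr != '\0')
		return DELAY_INVALID;

	/*
	 * strtoul() negates a "-1" input in the unsigned type, giving
	 * ULONG_MAX, so this one comparison also rejects it.
	 */
	if (val > INT32_MAX)
		return DELAY_OUT_OF_RANGE;

	*delay_ms = (int32_t) val;
	return DELAY_OK;
}
```

With this sketch, "3000" is accepted, "-1" and "2147483648" are both reported as out of range, and "12x" is reported as invalid. One caveat, as far as I understand the C standard: because strtoul() negates in the unsigned type, a negative input whose magnitude is close to ULONG_MAX could wrap around to a small positive number, so explicitly rejecting a leading '-' would be a stricter alternative.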
+ has been finished. However, there is a possibility that the table
+ status written in <link linkend="catalog-pg-subscription-rel"><structname>pg_subscription_rel</structname></link>
+ will be delayed in getting to "ready" state, and also two-phase
+ (if specified) will be delayed in getting to "enabled".
+ </para>
There appears to be a special value <0x00> after "ready". I think that
is added by mistake or probably you have used some editor which has
added this value. Can we slightly reword this to: "However, there is a
possibility that the table status updated in <link
linkend="catalog-pg-subscription-rel"><structname>pg_subscription_rel</structname></link>
could be delayed in getting to the "ready" state, and also two-phase
(if specified) could be delayed in getting to "enabled"."?
Oh, my Visual Studio Code did not detect the strange character.
I have reworded the paragraph accordingly.
Additionally, I modified the commit message to describe more clearly the reason
why we do not allow the combination of min_send_delay and streaming = parallel.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Attachments:
v6-0001-Time-delayed-logical-replication-on-publisher-sid.patch (application/octet-stream)
From 6721d1747e055478d204ce479ba883659bfb5301 Mon Sep 17 00:00:00 2001
From: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Date: Thu, 16 Feb 2023 07:52:23 +0000
Subject: [PATCH v6] Time-delayed logical replication on publisher side
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_send_delay'.
If the subscription sets min_send_delay parameter, the apply worker
(via walrcv_startstreaming) passes the value to the publisher as an output plugin
option. The walsender will delay sending the transaction for the given number of milliseconds.
The delay does not take into account the overhead of time spent in transferring
the transaction, which means that the arrival time at the subscriber may be
delayed more than the given time.
The delay occurs before we start to send the transaction on the publisher.
Regular and prepared transactions are covered. Streamed transactions are also
covered.
The combination of parallel streaming mode and min_send_delay is not allowed,
for two reasons:
1. In parallel streaming mode, we start applying the transaction
stream as soon as the first change arrives, without knowing the transaction's
prepare/commit time. Always waiting for the full 'min_send_delay' period might
introduce unnecessary delay.
2. For parallel streaming, the transaction is opened
immediately by the parallel apply worker. So if the walsender delayed sending the
final record of the transaction, the parallel worker would have to wait for it with an
open transaction. Locks acquired during the transaction would then
not be released until min_send_delay has elapsed.
Earlier versions were written by Euler Taveira, Takamichi Osumi, and Kuroda Hayato
Author: Kuroda Hayato, Takamichi Osumi
Reviewed-by: Amit Kapila, Peter Smith, Vignesh C, Shveta Malik,
Kyotaro Horiguchi, Shi Yu, Wang Wei, Dilip Kumar, Melih Mutlu,
Andres Freund
---
doc/src/sgml/catalogs.sgml | 10 +
doc/src/sgml/glossary.sgml | 15 ++
doc/src/sgml/logical-replication.sgml | 6 +
doc/src/sgml/monitoring.sgml | 5 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 43 +++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 119 ++++++++++-
.../libpqwalreceiver/libpqwalreceiver.c | 5 +
src/backend/replication/logical/decode.c | 18 ++
src/backend/replication/logical/logical.c | 18 +-
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/worker.c | 12 +-
src/backend/replication/pgoutput/pgoutput.c | 36 ++++
src/backend/replication/slotfuncs.c | 4 +-
src/backend/replication/walsender.c | 78 +++++++-
src/backend/utils/activity/wait_event.c | 3 +
src/bin/pg_dump/pg_dump.c | 15 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/replication/logical.h | 17 +-
src/include/replication/logicalproto.h | 6 +-
src/include/replication/pgoutput.h | 1 +
src/include/replication/walreceiver.h | 1 +
src/include/utils/wait_event.h | 3 +-
src/test/regress/expected/subscription.out | 185 +++++++++++-------
src/test/regress/sql/subscription.sql | 29 +++
src/test/subscription/t/001_rep_changes.pl | 27 +++
31 files changed, 583 insertions(+), 105 deletions(-)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..ce6521fe6c 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7892,6 +7892,16 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminsenddelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum delay, in milliseconds, by the publisher before sending all
+ the changes
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subenabled</structfield> <type>bool</type>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 7c01a541fe..9ede9d05f6 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1729,6 +1729,21 @@
</glossdef>
</glossentry>
+ <glossentry id="glossary-time-delayed-replication">
+ <glossterm>Time-delayed replication</glossterm>
+ <glossdef>
+ <para>
+ Replication setup that delays the application of changes by a specified
+ minimum time-delay period.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="guc-recovery-min-apply-delay"/> for physical replication
+ and <xref linkend="sql-createsubscription"/> for logical replication.
+ </para>
+ </glossdef>
+ </glossentry>
+
<glossentry id="glossary-toast">
<glossterm>TOAST</glossterm>
<glossdef>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 1bd5660c87..fe9e7f7b26 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,12 @@
target table.
</para>
+ <para>
+ A subscription can delay the receipt of changes by specifying the
+ <literal>min_send_delay</literal> subscription parameter. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index b0b997f092..6158587644 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -2349,6 +2349,11 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
<entry>Waiting to acquire an exclusive lock to truncate off any
empty pages at the end of a table vacuumed.</entry>
</row>
+ <row>
+ <entry><literal>WalSenderSendDelay</literal></entry>
+ <entry>Waiting while sending changes for time-delayed logical replication
+ in the WAL sender process.</entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 964fcbb8ff..3f238b958b 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_send_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 51c45f17c7..9d08740ba2 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,43 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_send_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the publisher sends changes as soon as possible. This
+ parameter allows the user to delay changes by the given time period.
+ If the value is specified without units, it is taken as milliseconds.
+ The default is zero (no delay). See <xref linkend="config-setting-names-values"/>
+ for details on the available valid time units.
+ </para>
+ <para>
+ The delay is effective only when the initial table synchronization
+ has been finished. However, there is a possibility that the table
+ status updated in <link linkend="catalog-pg-subscription-rel"><structname>pg_subscription_rel</structname></link>
+ could be delayed in getting to the "ready" state, and also two-phase
+ (if specified) could be delayed in getting to "enabled".
+ </para>
+ <para>
+ The delay does not take into account the overhead of time spent
+ transferring the transaction. Therefore, the arrival time at the
+ subscriber may be delayed more than the specified
+ <literal>min_send_delay</literal> time.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication means there is a much longer time between
+ making a change on the publisher, and that change being committed
+ on the subscriber. This can impact the performance of synchronous
+ replication. See <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -420,6 +456,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_send_delay</literal> parameter is not allowed when
+ streaming in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..63a10b06d1 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -66,6 +66,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->skiplsn = subform->subskiplsn;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
+ sub->minsenddelay = subform->subminsenddelay;
sub->enabled = subform->subenabled;
sub->binary = subform->subbinary;
sub->stream = subform->substream;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 34ca0e739f..54a705d71b 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1314,9 +1314,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subminsenddelay,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 464db6d247..4a8cd47171 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_SEND_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int32 min_send_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -100,7 +102,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int32 defGetMinSendDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_SEND_DELAY))
+ opts->min_send_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_SEND_DELAY) &&
+ strcmp(defel->defname, "min_send_delay") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_SEND_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_SEND_DELAY;
+ opts->min_send_delay = defGetMinSendDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +417,29 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_send_delay is not
+ * allowed. This is because in parallel streaming mode, the walsender
+ * starts sending the transaction stream without knowing the prepare/commit
+ * time of the transaction. Always waiting for the full 'min_send_delay'
+ * time to send may introduce unnecessary delay.
+ *
+ * The other possibility was to wait sending COMMIT record of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_SEND_DELAY) &&
+ opts->min_send_delay > 0 &&
+ opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ /*
+ * translator: the first %s is a string of the form "parameter > 0"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_send_delay > 0", "streaming = parallel"));
}
/*
@@ -560,7 +596,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_SEND_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -628,6 +665,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
+ values[Anum_pg_subscription_subminsenddelay - 1] = Int32GetDatum(opts.min_send_delay);
values[Anum_pg_subscription_subenabled - 1] = BoolGetDatum(opts.enabled);
values[Anum_pg_subscription_subbinary - 1] = BoolGetDatum(opts.binary);
values[Anum_pg_subscription_substream - 1] = CharGetDatum(opts.streaming);
@@ -1054,7 +1092,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_SEND_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1136,19 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_send_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
+ !IsSet(opts.specified_opts, SUBOPT_MIN_SEND_DELAY) &&
+ sub->minsenddelay > 0)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set parallel streaming mode for subscription with %s",
+ "min_send_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1162,26 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_SEND_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_send_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.min_send_delay > 0 &&
+ !IsSet(opts.specified_opts, SUBOPT_STREAMING) &&
+ sub->stream == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set %s for subscription in parallel streaming mode",
+ "min_send_delay"));
+
+ values[Anum_pg_subscription_subminsenddelay - 1] =
+ Int32GetDatum(opts.min_send_delay);
+ replaces[Anum_pg_subscription_subminsenddelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2195,3 +2266,45 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+/*
+ * Extract the min_send_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_send_delay
+ * accepts the same parameter format as recovery_min_apply_delay.
+ */
+static int32
+defGetMinSendDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse given string as parameter which has millisecond unit
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_send_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check both the lower boundary for the valid min_send_delay range and
+ * the upper boundary as the safeguard for some platforms where INT_MAX is
+ * wider than int32 respectively. Although parse_int() has confirmed that
+ * the result is less than or equal to INT_MAX, the value will be stored
+ * in a catalog column of int32.
+ */
+ if (result < 0 || result > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_send_delay",
+ 0, PG_INT32_MAX)));
+
+ return result;
+}
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 560ec974fa..89a72c1abe 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -443,6 +443,11 @@ libpqrcv_startstreaming(WalReceiverConn *conn,
PQserverVersion(conn->streamConn) >= 140000)
appendStringInfoString(&cmd, ", binary 'true'");
+ if (options->proto.logical.min_send_delay > 0 &&
+ PQserverVersion(conn->streamConn) >= 160000)
+ appendStringInfo(&cmd, ", min_send_delay '%d'",
+ options->proto.logical.min_send_delay);
+
appendStringInfoChar(&cmd, ')');
}
else
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index a53e23c679..2674444894 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -679,6 +679,15 @@ DecodeCommit(LogicalDecodingContext *ctx, XLogRecordBuffer *buf,
}
else
{
+ /*
+ * Delay sending the changes if required. For streaming transactions,
+ * this means a delay in sending the last stream but that is OK because
+ * on the downstream the changes will be applied only after receiving
+ * the last stream.
+ */
+ if (ctx->min_send_delay > 0 && ctx->delay_send)
+ ctx->delay_send(ctx, commit_time);
+
ReorderBufferCommit(ctx->reorder, xid, buf->origptr, buf->endptr,
commit_time, origin_id, origin_lsn);
}
@@ -763,6 +772,15 @@ DecodePrepare(LogicalDecodingContext *ctx, XLogRecordBuffer *buf,
buf->origptr, buf->endptr);
}
+ /*
+ * Delay sending the changes if required. For streaming transactions,
+ * this means a delay in sending the last stream but that is OK because
+ * on the downstream the changes will be applied only after receiving
+ * the last stream.
+ */
+ if (ctx->min_send_delay > 0 && ctx->delay_send)
+ ctx->delay_send(ctx, prepare_time);
+
/* replay actions of all transaction + subtransactions in order */
ReorderBufferPrepare(ctx->reorder, xid, parsed->twophase_gid);
diff --git a/src/backend/replication/logical/logical.c b/src/backend/replication/logical/logical.c
index c3ec97a0a6..e4dd822cdc 100644
--- a/src/backend/replication/logical/logical.c
+++ b/src/backend/replication/logical/logical.c
@@ -156,7 +156,8 @@ StartupDecodingContext(List *output_plugin_options,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress)
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send)
{
ReplicationSlot *slot;
MemoryContext context,
@@ -293,6 +294,7 @@ StartupDecodingContext(List *output_plugin_options,
ctx->prepare_write = prepare_write;
ctx->write = do_write;
ctx->update_progress = update_progress;
+ ctx->delay_send = delay_send;
ctx->output_plugin_options = output_plugin_options;
@@ -316,7 +318,7 @@ StartupDecodingContext(List *output_plugin_options,
* marking WAL reserved beforehand. In that scenario, it's up to the
* caller to guarantee that WAL remains available.
* xl_routine -- XLogReaderRoutine for underlying XLogReader
- * prepare_write, do_write, update_progress --
+ * prepare_write, do_write, update_progress, delay_send --
* callbacks that perform the use-case dependent, actual, work.
*
* Needs to be called while in a memory context that's at least as long lived
@@ -334,7 +336,8 @@ CreateInitDecodingContext(const char *plugin,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress)
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send)
{
TransactionId xmin_horizon = InvalidTransactionId;
ReplicationSlot *slot;
@@ -435,7 +438,7 @@ CreateInitDecodingContext(const char *plugin,
ctx = StartupDecodingContext(NIL, restart_lsn, xmin_horizon,
need_full_snapshot, false,
xl_routine, prepare_write, do_write,
- update_progress);
+ update_progress, delay_send);
/* call output plugin initialization callback */
old_context = MemoryContextSwitchTo(ctx->context);
@@ -475,7 +478,7 @@ CreateInitDecodingContext(const char *plugin,
* xl_routine
* XLogReaderRoutine used by underlying xlogreader
*
- * prepare_write, do_write, update_progress
+ * prepare_write, do_write, update_progress, delay_send
* callbacks that have to be filled to perform the use-case dependent,
* actual work.
*
@@ -493,7 +496,8 @@ CreateDecodingContext(XLogRecPtr start_lsn,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress)
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send)
{
LogicalDecodingContext *ctx;
ReplicationSlot *slot;
@@ -547,7 +551,7 @@ CreateDecodingContext(XLogRecPtr start_lsn,
ctx = StartupDecodingContext(output_plugin_options,
start_lsn, InvalidTransactionId, false,
fast_forward, xl_routine, prepare_write,
- do_write, update_progress);
+ do_write, update_progress, delay_send);
/* call output plugin initialization callback */
old_context = MemoryContextSwitchTo(ctx->context);
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index fa1b641a2b..960025197f 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -212,7 +212,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
.segment_open = wal_segment_open,
.segment_close = wal_segment_close),
LogicalOutputPrepareWrite,
- LogicalOutputWrite, NULL);
+ LogicalOutputWrite, NULL, NULL);
/*
* After the sanity checks in CreateDecodingContext, make sure the
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index cfb2ab6248..e68902ae34 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3898,7 +3898,8 @@ maybe_reread_subscription(void)
newsub->stream != MySubscription->stream ||
strcmp(newsub->origin, MySubscription->origin) != 0 ||
newsub->owner != MySubscription->owner ||
- !equal(newsub->publications, MySubscription->publications))
+ !equal(newsub->publications, MySubscription->publications) ||
+ newsub->minsenddelay != MySubscription->minsenddelay)
{
if (am_parallel_apply_worker())
ereport(LOG,
@@ -4617,9 +4618,18 @@ ApplyWorkerMain(Datum main_arg)
options.proto.logical.twophase = false;
options.proto.logical.origin = pstrdup(MySubscription->origin);
+ options.proto.logical.min_send_delay = 0;
if (!am_tablesync_worker())
{
+ /*
+ * Time-delayed logical replication does not support tablesync
+ * workers, so only the leader apply worker can request walsenders to
+ * delay on the publisher side.
+ */
+ if (server_version >= 160000 && MySubscription->minsenddelay > 0)
+ options.proto.logical.min_send_delay = MySubscription->minsenddelay;
+
/*
* Even when the two_phase mode is requested by the user, it remains
* as the tri-state PENDING until all tablesyncs have reached READY
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 98377c094b..df6a87b0ba 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -285,6 +285,7 @@ parse_output_parameters(List *options, PGOutputData *data)
bool streaming_given = false;
bool two_phase_option_given = false;
bool origin_option_given = false;
+ bool min_send_delay_option_given = false;
data->binary = false;
data->streaming = LOGICALREP_STREAM_OFF;
@@ -396,6 +397,32 @@ parse_output_parameters(List *options, PGOutputData *data)
errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("unrecognized origin value: \"%s\"", data->origin));
}
+ else if (strcmp(defel->defname, "min_send_delay") == 0)
+ {
+ unsigned long delay_val;
+ char *endptr;
+
+ if (min_send_delay_option_given)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("conflicting or redundant options"));
+ min_send_delay_option_given = true;
+
+ errno = 0;
+ delay_val = strtoul(strVal(defel->arg), &endptr, 10);
+ if (errno != 0 || *endptr != '\0')
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid min_send_delay")));
+
+ if (delay_val > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("min_send_delay \"%s\" out of range",
+ strVal(defel->arg))));
+
+ data->min_send_delay = (int32) delay_val;
+ }
else
elog(ERROR, "unrecognized pgoutput option: %s", defel->defname);
}
@@ -501,6 +528,15 @@ pgoutput_startup(LogicalDecodingContext *ctx, OutputPluginOptions *opt,
else
ctx->twophase_opt_given = true;
+ if (data->min_send_delay &&
+ data->protocol_version < LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("requested proto_version=%d does not support delay sending data, need %d or higher",
+ data->protocol_version, LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM)));
+ else
+ ctx->min_send_delay = data->min_send_delay;
+
/* Init publication state. */
data->publications = NIL;
publications_valid = false;
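
Review note on the min_send_delay parsing above: the option arrives as a string and is converted with strtoul() before being range-checked against PG_INT32_MAX. The following is a minimal standalone sketch of that pattern, not code from the patch. The helper name parse_delay_ms and its return codes are hypothetical, and INT32_MAX stands in for PG_INT32_MAX. The sketch also rejects an empty string and a leading minus sign, which strtoul() alone would accept; that extra strictness is an assumption of this sketch and is not something the patch does.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of the patch): parse a delay given in
 * milliseconds.  Returns 0 on success, -1 if the string is not a plain
 * non-negative decimal number, and -2 if the value exceeds INT32_MAX.
 */
static int
parse_delay_ms(const char *str, int32_t *out)
{
    unsigned long val;
    char       *endptr;

    assert(str != NULL && out != NULL);

    /* strtoul() accepts "" (yielding 0) and negates "-1"; reject both here. */
    if (*str == '\0' || *str == '-')
        return -1;

    errno = 0;
    val = strtoul(str, &endptr, 10);

    /* Overflow of unsigned long, no digits consumed, or trailing junk. */
    if (errno != 0 || endptr == str || *endptr != '\0')
        return -1;

    /* The value is later stored in an int32, so enforce that limit. */
    if (val > INT32_MAX)
        return -2;

    *out = (int32_t) val;
    return 0;
}
```

With this sketch, "3000" is accepted, "2147483648" is reported as out of range, and "12ms" is rejected as invalid.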
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 2f3c964824..522f7600a1 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -148,7 +148,7 @@ create_logical_replication_slot(char *name, char *plugin,
XL_ROUTINE(.page_read = read_local_xlog_page,
.segment_open = wal_segment_open,
.segment_close = wal_segment_close),
- NULL, NULL, NULL);
+ NULL, NULL, NULL, NULL);
/*
* If caller needs us to determine the decoding start point, do so now.
@@ -481,7 +481,7 @@ pg_logical_replication_slot_advance(XLogRecPtr moveto)
XL_ROUTINE(.page_read = read_local_xlog_page,
.segment_open = wal_segment_open,
.segment_close = wal_segment_close),
- NULL, NULL, NULL);
+ NULL, NULL, NULL, NULL);
/*
* Start reading at the slot's restart_lsn, which we know to point to
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 75e8363e24..9537fba7df 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -252,6 +252,7 @@ static void WalSndPrepareWrite(LogicalDecodingContext *ctx, XLogRecPtr lsn, Tran
static void WalSndWriteData(LogicalDecodingContext *ctx, XLogRecPtr lsn, TransactionId xid, bool last_write);
static void WalSndUpdateProgress(LogicalDecodingContext *ctx, XLogRecPtr lsn, TransactionId xid,
bool skipped_xact);
+static void WalSndDelay(LogicalDecodingContext *ctx, TimestampTz delay_start);
static XLogRecPtr WalSndWaitForWal(XLogRecPtr loc);
static void LagTrackerWrite(XLogRecPtr lsn, TimestampTz local_flush_time);
static TimeOffset LagTrackerRead(int head, XLogRecPtr lsn, TimestampTz now);
@@ -1126,7 +1127,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
.segment_open = WalSndSegmentOpen,
.segment_close = wal_segment_close),
WalSndPrepareWrite, WalSndWriteData,
- WalSndUpdateProgress);
+ WalSndUpdateProgress, WalSndDelay);
/*
* Signal that we don't need the timeout mechanism. We're just
@@ -1285,7 +1286,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
.segment_open = WalSndSegmentOpen,
.segment_close = wal_segment_close),
WalSndPrepareWrite, WalSndWriteData,
- WalSndUpdateProgress);
+ WalSndUpdateProgress, WalSndDelay);
xlogreader = logical_decoding_ctx->reader;
WalSndSetState(WALSNDSTATE_CATCHUP);
@@ -3849,3 +3850,76 @@ LagTrackerRead(int head, XLogRecPtr lsn, TimestampTz now)
Assert(time != 0);
return now - time;
}
+
+/*
+ * LogicalDecodingContext 'delay' callback.
+ *
+ * Wait long enough to make sure a transaction is applied at least
+ * min_send_delay milliseconds behind the publisher.
+ */
+static void
+WalSndDelay(LogicalDecodingContext *ctx, TimestampTz delay_start)
+{
+ /* Wait until delayUntil using the latch mechanism */
+ while (true)
+ {
+ TimestampTz delayUntil;
+ long remaining_wait_time_ms;
+ long timeout_sleeptime_ms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* This might change wal_sender_timeout */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /* Check for input from the client */
+ ProcessRepliesIfAny();
+
+ /* die if timeout was reached */
+ WalSndCheckTimeOut();
+
+ /* Send keepalive if the time has come */
+ WalSndKeepaliveIfNecessary();
+
+ /* Try to flush pending output to the client */
+ if (pq_flush_if_writable() != 0)
+ WalSndShutdown();
+
+ /*
+ * If we've requested to shut down, exit the process.
+ *
+ * Note that WalSndDone() cannot be used here because that function
+ * would send the changes that are still being delayed.
+ */
+ if (got_STOPPING)
+ WalSndShutdown();
+
+ delayUntil = TimestampTzPlusMilliseconds(delay_start, ctx->min_send_delay);
+ remaining_wait_time_ms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delayUntil);
+
+ /*
+ * Exit without arming the latch if it's already past time to send
+ * this transaction.
+ */
+ if (remaining_wait_time_ms <= 0)
+ break;
+
+ /* Compute how long we may sleep before a keepalive or timeout is due. */
+ timeout_sleeptime_ms = WalSndComputeSleeptime(GetCurrentTimestamp());
+
+ elog(DEBUG2, "time-delayed replication for txid %u, delay_time = %d ms, remaining wait time: %ld ms",
+ ctx->write_xid, (int) ctx->min_send_delay,
+ remaining_wait_time_ms);
+
+ /* Sleep until we get a reply from the worker or we time out */
+ WalSndWait(WL_SOCKET_READABLE,
+ Min(timeout_sleeptime_ms, remaining_wait_time_ms),
+ WAIT_EVENT_WALSENDER_SEND_DELAY);
+ }
+}
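
Review note on WalSndDelay() above: each loop iteration recomputes how much of the delay is left and then sleeps for the smaller of that value and the walsender's own keepalive/timeout sleep time. The arithmetic can be sketched in isolation as below. This is an illustration only. The helpers remaining_delay_ms and next_sleep_ms are hypothetical, timestamps are modelled as plain microsecond counts, and rounding up to whole milliseconds is an assumption of the sketch; none of it is a copy of the backend's timestamp routines.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: milliseconds left until delay_start_us plus
 * min_send_delay_ms, given the current time now_us.  Returns 0 once the
 * deadline has passed, which is the caller's signal to stop waiting.
 */
static long
remaining_delay_ms(int64_t delay_start_us, int32_t min_send_delay_ms,
                   int64_t now_us)
{
    int64_t     delay_until_us;
    int64_t     diff_us;

    assert(min_send_delay_ms >= 0);

    delay_until_us = delay_start_us + (int64_t) min_send_delay_ms * 1000;
    diff_us = delay_until_us - now_us;

    if (diff_us <= 0)
        return 0;

    /* Round up so that we never wake before the deadline. */
    return (long) ((diff_us + 999) / 1000);
}

/*
 * Hypothetical helper: sleep no longer than the remaining delay and no
 * longer than the time until the next keepalive or timeout check.
 */
static long
next_sleep_ms(long remaining_ms, long timeout_sleep_ms)
{
    return remaining_ms < timeout_sleep_ms ? remaining_ms : timeout_sleep_ms;
}
```

Capping the sleep at the keepalive interval is what lets the loop keep servicing replies and keepalives during a long delay such as '8h', instead of sleeping through wal_sender_timeout.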
diff --git a/src/backend/utils/activity/wait_event.c b/src/backend/utils/activity/wait_event.c
index cb99cc6339..76c19fe11d 100644
--- a/src/backend/utils/activity/wait_event.c
+++ b/src/backend/utils/activity/wait_event.c
@@ -515,6 +515,9 @@ pgstat_get_wait_timeout(WaitEventTimeout w)
case WAIT_EVENT_VACUUM_TRUNCATE:
event_name = "VacuumTruncate";
break;
+ case WAIT_EVENT_WALSENDER_SEND_DELAY:
+ event_name = "WalSenderSendDelay";
+ break;
/* no default case, so that compiler will warn */
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 1a06eeaf6a..9754487921 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminsenddelay;
int i,
ntups;
@@ -4546,9 +4547,13 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminsenddelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n"
+ " 0 AS subminsenddelay\n",
+ LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4581,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminsenddelay = PQfnumber(res, "subminsenddelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4612,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminsenddelay =
+ atoi(PQgetvalue(res, i, i_subminsenddelay));
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4687,6 +4695,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminsenddelay > 0)
+ appendPQExpBuffer(query, ", min_send_delay = '%d ms'", subinfo->subminsenddelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index cdca0b993d..4c55f8efc4 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -660,6 +660,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int subminsenddelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..c7d303a168 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_send_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminsenddelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min send delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e1882eaea..6643db6f55 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_send_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_send_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..69ae4314b4 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -74,6 +74,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
+ int32 subminsenddelay; /* Replication send delay (ms) */
+
bool subenabled; /* True if the subscription is enabled (the
* worker should be running) */
@@ -122,6 +124,7 @@ typedef struct Subscription
* skipped */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
+ int32 minsenddelay; /* Replication send delay (ms) */
bool enabled; /* Indicates if the subscription is enabled */
bool binary; /* Indicates if the subscription wants data in
* binary format */
diff --git a/src/include/replication/logical.h b/src/include/replication/logical.h
index 5f49554ea0..b0770edf49 100644
--- a/src/include/replication/logical.h
+++ b/src/include/replication/logical.h
@@ -30,6 +30,10 @@ typedef void (*LogicalOutputPluginWriterUpdateProgress) (struct LogicalDecodingC
bool skipped_xact
);
+typedef void (*LogicalOutputPluginWriterDelay) (struct LogicalDecodingContext *lr,
+ TimestampTz start_time
+);
+
typedef struct LogicalDecodingContext
{
/* memory context this is all allocated in */
@@ -64,6 +68,7 @@ typedef struct LogicalDecodingContext
LogicalOutputPluginWriterPrepareWrite prepare_write;
LogicalOutputPluginWriterWrite write;
LogicalOutputPluginWriterUpdateProgress update_progress;
+ LogicalOutputPluginWriterDelay delay_send;
/*
* Output buffer.
@@ -100,6 +105,12 @@ typedef struct LogicalDecodingContext
*/
bool twophase_opt_given;
+ /*
+ * The minimum delay, in milliseconds, that the publisher applies before
+ * sending changes.
+ */
+ int32 min_send_delay;
+
/*
* State for writing output.
*/
@@ -121,14 +132,16 @@ extern LogicalDecodingContext *CreateInitDecodingContext(const char *plugin,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress);
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send);
extern LogicalDecodingContext *CreateDecodingContext(XLogRecPtr start_lsn,
List *output_plugin_options,
bool fast_forward,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress);
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send);
extern void DecodingContextFindStartpoint(LogicalDecodingContext *ctx);
extern bool DecodingContextReady(LogicalDecodingContext *ctx);
extern void FreeDecodingContext(LogicalDecodingContext *ctx);
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 0ea2df5088..46faadbd7a 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -36,13 +36,17 @@
* LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM is the minimum protocol version
* where we support applying large streaming transactions in parallel.
* Introduced in PG16.
+ *
+ * LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM is the minimum protocol version
+ * where we support delaying the sending of transactions. Introduced in PG16.
*/
#define LOGICALREP_PROTO_MIN_VERSION_NUM 1
#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
#define LOGICALREP_PROTO_TWOPHASE_VERSION_NUM 3
#define LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM 4
-#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM
+#define LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM 4
+#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM
/*
* Logical message types
diff --git a/src/include/replication/pgoutput.h b/src/include/replication/pgoutput.h
index b4a8015403..d2fde09e00 100644
--- a/src/include/replication/pgoutput.h
+++ b/src/include/replication/pgoutput.h
@@ -30,6 +30,7 @@ typedef struct PGOutputData
bool messages;
bool two_phase;
char *origin;
+ int32 min_send_delay;
} PGOutputData;
#endif /* PGOUTPUT_H */
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index decffe352d..c20969aed7 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -187,6 +187,7 @@ typedef struct
* prepare time */
char *origin; /* Only publish data originating from the
* specified origin */
+ int32 min_send_delay; /* The minimum send delay (ms) */
} logical;
} proto;
} WalRcvStreamOptions;
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index 9ab23e1c4a..cc3a234eba 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -150,7 +150,8 @@ typedef enum
WAIT_EVENT_REGISTER_SYNC_REQUEST,
WAIT_EVENT_SPIN_DELAY,
WAIT_EVENT_VACUUM_DELAY,
- WAIT_EVENT_VACUUM_TRUNCATE
+ WAIT_EVENT_VACUUM_TRUNCATE,
+ WAIT_EVENT_WALSENDER_SEND_DELAY
} WaitEventTimeout;
/* ----------
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 3f99b14394..2027316233 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -143,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -163,10 +163,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -175,10 +175,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -210,10 +210,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -247,19 +247,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -271,27 +271,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -306,10 +306,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -324,10 +324,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -363,10 +363,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -375,10 +375,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,10 +388,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -404,20 +404,61 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_send_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = foo);
+ERROR: invalid value for parameter "min_send_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_send_delay" (0 .. 2147483647)
+-- fail - specifying streaming = parallel with time-delayed replication is not
+-- supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_send_delay = 123);
+ERROR: min_send_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_send_delay value without units is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexit | 0/0
+(1 row)
+
+-- success -- min_send_delay value with units other than ms is converted to ms
+-- and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexit | 0/0
(1 row)
+-- fail - alter subscription with streaming = parallel should fail when
+-- time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot set parallel streaming mode for subscription with min_send_delay
+-- fail - alter subscription with min_send_delay should fail when
+-- streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = 123);
+ERROR: cannot set min_send_delay for subscription in parallel streaming mode
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7281f5fee2..46bf4a27d9 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -286,6 +286,35 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_send_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = -1);
+
+-- fail - specifying streaming = parallel with time-delayed replication is not
+-- supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_send_delay = 123);
+
+-- success -- min_send_delay value without units is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = 123);
+\dRs+
+
+-- success -- min_send_delay value with units other than ms is converted to ms
+-- and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = '1 d');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when
+-- time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+
+-- fail - alter subscription with min_send_delay should fail when
+-- streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = 123);
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index 91aa068c95..063a98fde9 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -515,6 +515,33 @@ $node_publisher->poll_query_until('postgres',
or die
"Timed out while waiting for apply to restart after renaming SUBSCRIPTION";
+# Test time-delayed logical replication
+#
+# If the subscription sets min_send_delay parameter, the walsender will delay
+# the transaction send for min_send_delay milliseconds. We verify this by
+# looking at the time difference between a) when tuples are inserted on the
+# publisher, and b) when those changes are replicated on the subscriber. Even
+# on slow machines, this strategy will give predictable behavior.
+
+# Set min_send_delay parameter to 3 seconds
+my $delay = 3;
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_renamed SET (min_send_delay = '${delay}s')");
+
+# Before doing the insertion, get the current timestamp that will be
+# used as a comparison base.
+my $publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_ins VALUES (generate_series(1101, 1120))");
+
+# The publisher waits for the replication to complete
+$node_publisher->wait_for_catchup('tap_sub_renamed');
+
+# This test is successful only if at least the configured delay has elapsed.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for non-streaming transaction"
+);
+
# check all the cleanup
$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_renamed");
--
2.27.0
Patch v6 LGTM.
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Wed, Feb 22, 2023 9:48 PM Kuroda, Hayato/黒田 隼人 <kuroda.hayato@fujitsu.com> wrote:
Thank you for reviewing! PSA new version.
Thanks for your patch. Here is a comment.
+ elog(DEBUG2, "time-delayed replication for txid %u, delay_time = %d ms, remaining wait time: %ld ms",
+ ctx->write_xid, (int) ctx->min_send_delay,
+ remaining_wait_time_ms);
I tried it and saw that the xid here looks wrong: it is the xid of the
previous transaction. It seems `ctx->write_xid` has not been updated yet, so we
can't use it.
Regards,
Shi Yu
Dear Shi,
Thank you for reviewing! PSA new version.
+ elog(DEBUG2, "time-delayed replication for txid %u, delay_time = %d ms, remaining wait time: %ld ms",
+ ctx->write_xid, (int) ctx->min_send_delay,
+ remaining_wait_time_ms);
I tried and saw that the xid here looks wrong, what it got is the xid of the
previous transaction. It seems `ctx->write_xid` has not been updated and we
can't use it.
Good catch. There are several approaches to fix that; I chose the simplest one:
a TransactionId is now passed as an argument to the functions.
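
The idea behind this fix can be sketched as follows. This is only an illustration, not the code from the attached patch: the type xid_t, the reduced struct DecodingCtx, and the helpers remaining_wait_ms() and format_delay_message() are hypothetical names, and timestamps are plain millisecond counters rather than PostgreSQL's TimestampTz. The point is that the message is built from the xid handed in by the caller, not from the write_xid field of the context, which may still hold the previous transaction's xid.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a 32-bit transaction id. */
typedef uint32_t xid_t;

/* Hypothetical, heavily reduced decoding context (not the real struct). */
typedef struct DecodingCtx
{
    xid_t   write_xid;          /* may still hold the previous transaction's xid */
    int32_t min_send_delay_ms;  /* configured delay in milliseconds */
} DecodingCtx;

/* Remaining wait in milliseconds, clamped at zero once the delay has elapsed. */
static int64_t
remaining_wait_ms(int64_t delay_start_ms, int64_t now_ms, int32_t min_send_delay_ms)
{
    int64_t delay_until_ms;

    assert(min_send_delay_ms >= 0);
    delay_until_ms = delay_start_ms + min_send_delay_ms;
    return (delay_until_ms > now_ms) ? delay_until_ms - now_ms : 0;
}

/*
 * Build the debug message from the xid passed explicitly by the caller.
 * ctx->write_xid is deliberately not read here.
 */
static int
format_delay_message(char *buf, size_t buflen, const DecodingCtx *ctx,
                     xid_t xid, int64_t remaining_ms)
{
    return snprintf(buf, buflen,
                    "time-delayed replication for txid %" PRIu32
                    ", delay_time = %" PRId32
                    " ms, remaining wait time: %" PRId64 " ms",
                    xid, ctx->min_send_delay_ms, remaining_ms);
}
```

A caller that knows the transaction being sent would pass its xid to format_delay_message() together with the value returned by remaining_wait_ms(), so a stale write_xid in the context cannot leak into the log line.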
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Attachments:
v7-0001-Time-delayed-logical-replication-on-publisher-sid.patch (application/octet-stream)
From d4b9a056139c8fccc7cdc73d26ef6dd804ae5564 Mon Sep 17 00:00:00 2001
From: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Date: Thu, 16 Feb 2023 07:52:23 +0000
Subject: [PATCH v7] Time-delayed logical replication on publisher side
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_send_delay'.
If the subscription sets the min_send_delay parameter, the apply worker
(via walrcv_startstreaming) passes the value to the publisher as an output plugin
option. The walsender then delays sending the transaction by the given number of
milliseconds.
The delay does not take into account the overhead of time spent in transferring
the transaction, which means that the arrival time at the subscriber may be
delayed more than the given time.
The delay occurs before we start to send the transaction on the publisher.
Regular and prepared transactions are covered. Streamed transactions are also
covered.
The combination of parallel streaming mode and min_send_delay is not allowed,
for two reasons:
1. In parallel streaming mode, we start applying the transaction stream as
soon as the first change arrives, without knowing the transaction's
prepare/commit time. Always waiting for the full 'min_send_delay' period might
introduce unnecessary delay.
2. For parallel streaming, the transaction is opened immediately by the
parallel apply worker. If the walsender delayed sending the final record of
the transaction, the parallel worker would have to wait with an open
transaction, so locks acquired during the transaction would not be released
until min_send_delay has elapsed.
Earlier versions were written by Euler Taveira, Takamichi Osumi, and Kuroda Hayato
Author: Kuroda Hayato, Takamichi Osumi
Reviewed-by: Amit Kapila, Peter Smith, Vignesh C, Shveta Malik,
Kyotaro Horiguchi, Shi Yu, Wang Wei, Dilip Kumar, Melih Mutlu,
Andres Freund
---
doc/src/sgml/catalogs.sgml | 10 +
doc/src/sgml/glossary.sgml | 15 ++
doc/src/sgml/logical-replication.sgml | 6 +
doc/src/sgml/monitoring.sgml | 5 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 43 +++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 119 ++++++++++-
.../libpqwalreceiver/libpqwalreceiver.c | 5 +
src/backend/replication/logical/decode.c | 18 ++
src/backend/replication/logical/logical.c | 18 +-
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/worker.c | 12 +-
src/backend/replication/pgoutput/pgoutput.c | 36 ++++
src/backend/replication/slotfuncs.c | 4 +-
src/backend/replication/walsender.c | 77 +++++++-
src/backend/utils/activity/wait_event.c | 3 +
src/bin/pg_dump/pg_dump.c | 15 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/replication/logical.h | 18 +-
src/include/replication/logicalproto.h | 6 +-
src/include/replication/pgoutput.h | 1 +
src/include/replication/walreceiver.h | 1 +
src/include/utils/wait_event.h | 3 +-
src/test/regress/expected/subscription.out | 185 +++++++++++-------
src/test/regress/sql/subscription.sql | 29 +++
src/test/subscription/t/001_rep_changes.pl | 27 +++
31 files changed, 583 insertions(+), 105 deletions(-)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..ce6521fe6c 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7892,6 +7892,16 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminsenddelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum delay, in milliseconds, by the publisher before sending all
+ the changes
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subenabled</structfield> <type>bool</type>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 7c01a541fe..9ede9d05f6 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1729,6 +1729,21 @@
</glossdef>
</glossentry>
+ <glossentry id="glossary-time-delayed-replication">
+ <glossterm>Time-delayed replication</glossterm>
+ <glossdef>
+ <para>
+ Replication setup that delays the application of changes by a specified
+ minimum time-delay period.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="guc-recovery-min-apply-delay"/> for physical replication
+ and <xref linkend="sql-createsubscription"/> for logical replication.
+ </para>
+ </glossdef>
+ </glossentry>
+
<glossentry id="glossary-toast">
<glossterm>TOAST</glossterm>
<glossdef>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 1bd5660c87..fe9e7f7b26 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,12 @@
target table.
</para>
+ <para>
+ A subscription can delay the receipt of changes by specifying the
+ <literal>min_send_delay</literal> subscription parameter. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index b0b997f092..6158587644 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -2349,6 +2349,11 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
<entry>Waiting to acquire an exclusive lock to truncate off any
empty pages at the end of a table vacuumed.</entry>
</row>
+ <row>
+ <entry><literal>WalSenderSendDelay</literal></entry>
+ <entry>Waiting while sending changes for time-delayed logical replication
+ in the WAL sender process.</entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 964fcbb8ff..3f238b958b 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_send_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 51c45f17c7..9d08740ba2 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,43 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_send_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the publisher sends changes as soon as possible. This
+ parameter allows the user to delay changes by the given time period.
+ If the value is specified without units, it is taken as milliseconds.
+ The default is zero (no delay). See <xref linkend="config-setting-names-values"/>
+ for details on the available valid time units.
+ </para>
+ <para>
+ The delay is effective only when the initial table synchronization
+ has been finished. However, there is a possibility that the table
+ status updated in <link linkend="catalog-pg-subscription-rel"><structname>pg_subscription_rel</structname></link>
+ could be delayed in getting to the "ready" state, and also two-phase
+ (if specified) could be delayed in getting to "enabled".
+ </para>
+ <para>
+ The delay does not take into account the overhead of time spent
+ transferring the transaction. Therefore, the arrival time at the
+ subscriber may be delayed more than the specified
+ <literal>min_send_delay</literal> time.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication means there is a much longer time between
+ making a change on the publisher, and that change being committed
+ on the subscriber. This can impact the performance of synchronous
+ replication. See <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -420,6 +456,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_send_delay</literal> parameter is not allowed when
+ streaming in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..63a10b06d1 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -66,6 +66,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->skiplsn = subform->subskiplsn;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
+ sub->minsenddelay = subform->subminsenddelay;
sub->enabled = subform->subenabled;
sub->binary = subform->subbinary;
sub->stream = subform->substream;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 34ca0e739f..54a705d71b 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1314,9 +1314,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subminsenddelay,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 464db6d247..4a8cd47171 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_SEND_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int32 min_send_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -100,7 +102,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int32 defGetMinSendDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_SEND_DELAY))
+ opts->min_send_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_SEND_DELAY) &&
+ strcmp(defel->defname, "min_send_delay") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_SEND_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_SEND_DELAY;
+ opts->min_send_delay = defGetMinSendDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +417,29 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_send_delay is not
+ * allowed. This is because in parallel streaming mode, the walsender
+ * starts sending the transaction stream without knowing the prepare/commit
+ * time of the transaction. Always waiting for the full 'min_send_delay'
+ * time to send may introduce unnecessary delay.
+ *
+ * The other possibility was to wait sending COMMIT record of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_SEND_DELAY) &&
+ opts->min_send_delay > 0 &&
+ opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ /*
+ * translator: the first %s is a string of the form "parameter > 0"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_send_delay > 0", "streaming = parallel"));
}
/*
@@ -560,7 +596,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_SEND_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -628,6 +665,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
+ values[Anum_pg_subscription_subminsenddelay - 1] = Int32GetDatum(opts.min_send_delay);
values[Anum_pg_subscription_subenabled - 1] = BoolGetDatum(opts.enabled);
values[Anum_pg_subscription_subbinary - 1] = BoolGetDatum(opts.binary);
values[Anum_pg_subscription_substream - 1] = CharGetDatum(opts.streaming);
@@ -1054,7 +1092,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_SEND_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1136,19 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_send_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
+ !IsSet(opts.specified_opts, SUBOPT_MIN_SEND_DELAY) &&
+ sub->minsenddelay > 0)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set parallel streaming mode for subscription with %s",
+ "min_send_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1162,26 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_SEND_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_send_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.min_send_delay > 0 &&
+ !IsSet(opts.specified_opts, SUBOPT_STREAMING) &&
+ sub->stream == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set %s for subscription in parallel streaming mode",
+ "min_send_delay"));
+
+ values[Anum_pg_subscription_subminsenddelay - 1] =
+ Int32GetDatum(opts.min_send_delay);
+ replaces[Anum_pg_subscription_subminsenddelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2195,3 +2266,45 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+/*
+ * Extract the min_send_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_send_delay
+ * accepts the same parameter format as recovery_min_apply_delay.
+ */
+static int32
+defGetMinSendDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse the given string as a parameter whose unit is milliseconds.
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_send_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check the lower bound of the valid min_send_delay range. Also check the
+ * upper bound, as a safeguard for platforms where INT_MAX is larger than
+ * PG_INT32_MAX: parse_int() has confirmed that the result is less than or
+ * equal to INT_MAX, but the value will be stored in a catalog column of
+ * type int32.
+ */
+ if (result < 0 || result > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_send_delay",
+ 0, PG_INT32_MAX)));
+
+ return result;
+}
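
The two checks added to AlterSubscription above can be read as one rule: after the command is applied, the subscription must not end up with both parallel streaming and a nonzero min_send_delay. The following is a minimal standalone sketch of that rule, not code from the patch. The types SubState and AlterOpts, the StreamMode enum, and the function alter_would_conflict are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-ins for the subscription streaming modes. */
typedef enum
{
	STREAM_OFF,
	STREAM_ON,
	STREAM_PARALLEL
} StreamMode;

/* Stored state of a subscription (illustrative, not the catalog struct). */
typedef struct
{
	StreamMode	stream;
	int32_t		min_send_delay_ms;
} SubState;

/* Options given in one ALTER SUBSCRIPTION ... SET (...) command. */
typedef struct
{
	bool		has_stream;		/* was "streaming" specified? */
	StreamMode	stream;
	bool		has_delay;		/* was "min_send_delay" specified? */
	int32_t		min_send_delay_ms;
} AlterOpts;

/*
 * Return true if applying 'opts' to 'cur' would leave the subscription with
 * both parallel streaming and a nonzero delay. An option that was not
 * specified keeps its stored value.
 */
static bool
alter_would_conflict(const SubState *cur, const AlterOpts *opts)
{
	StreamMode	new_stream;
	int32_t		new_delay;

	assert(cur != NULL && opts != NULL);

	new_stream = opts->has_stream ? opts->stream : cur->stream;
	new_delay = opts->has_delay ? opts->min_send_delay_ms
		: cur->min_send_delay_ms;

	return new_stream == STREAM_PARALLEL && new_delay > 0;
}
```

In the patch itself the case where both options are given in the same command is left to parse_subscription_options, so AlterSubscription only compares the one specified option against the stored value of the other. The sketch folds all three cases into a single comparison of the resulting state.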
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 560ec974fa..89a72c1abe 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -443,6 +443,11 @@ libpqrcv_startstreaming(WalReceiverConn *conn,
PQserverVersion(conn->streamConn) >= 140000)
appendStringInfoString(&cmd, ", binary 'true'");
+ if (options->proto.logical.min_send_delay > 0 &&
+ PQserverVersion(conn->streamConn) >= 160000)
+ appendStringInfo(&cmd, ", min_send_delay '%d'",
+ options->proto.logical.min_send_delay);
+
appendStringInfoChar(&cmd, ')');
}
else
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 8fe7bb65f1..31bf43bd63 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -689,6 +689,15 @@ DecodeCommit(LogicalDecodingContext *ctx, XLogRecordBuffer *buf,
}
else
{
+ /*
+ * Delay sending the changes if required. For streaming transactions,
+ * this means a delay in sending the last stream but that is OK
+ * because on the downstream the changes will be applied only after
+ * receiving the last stream.
+ */
+ if (ctx->min_send_delay > 0 && ctx->delay_send)
+ ctx->delay_send(ctx, xid, commit_time);
+
ReorderBufferCommit(ctx->reorder, xid, buf->origptr, buf->endptr,
commit_time, origin_id, origin_lsn);
}
@@ -773,6 +782,15 @@ DecodePrepare(LogicalDecodingContext *ctx, XLogRecordBuffer *buf,
buf->origptr, buf->endptr);
}
+ /*
+ * Delay sending the changes if required. For streaming transactions, this
+ * means a delay in sending the last stream but that is OK because on the
+ * downstream the changes will be applied only after receiving the last
+ * stream.
+ */
+ if (ctx->min_send_delay > 0 && ctx->delay_send)
+ ctx->delay_send(ctx, xid, prepare_time);
+
/* replay actions of all transaction + subtransactions in order */
ReorderBufferPrepare(ctx->reorder, xid, parsed->twophase_gid);
diff --git a/src/backend/replication/logical/logical.c b/src/backend/replication/logical/logical.c
index c3ec97a0a6..e4dd822cdc 100644
--- a/src/backend/replication/logical/logical.c
+++ b/src/backend/replication/logical/logical.c
@@ -156,7 +156,8 @@ StartupDecodingContext(List *output_plugin_options,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress)
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send)
{
ReplicationSlot *slot;
MemoryContext context,
@@ -293,6 +294,7 @@ StartupDecodingContext(List *output_plugin_options,
ctx->prepare_write = prepare_write;
ctx->write = do_write;
ctx->update_progress = update_progress;
+ ctx->delay_send = delay_send;
ctx->output_plugin_options = output_plugin_options;
@@ -316,7 +318,7 @@ StartupDecodingContext(List *output_plugin_options,
* marking WAL reserved beforehand. In that scenario, it's up to the
* caller to guarantee that WAL remains available.
* xl_routine -- XLogReaderRoutine for underlying XLogReader
- * prepare_write, do_write, update_progress --
+ * prepare_write, do_write, update_progress, delay_send --
* callbacks that perform the use-case dependent, actual, work.
*
* Needs to be called while in a memory context that's at least as long lived
@@ -334,7 +336,8 @@ CreateInitDecodingContext(const char *plugin,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress)
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send)
{
TransactionId xmin_horizon = InvalidTransactionId;
ReplicationSlot *slot;
@@ -435,7 +438,7 @@ CreateInitDecodingContext(const char *plugin,
ctx = StartupDecodingContext(NIL, restart_lsn, xmin_horizon,
need_full_snapshot, false,
xl_routine, prepare_write, do_write,
- update_progress);
+ update_progress, delay_send);
/* call output plugin initialization callback */
old_context = MemoryContextSwitchTo(ctx->context);
@@ -475,7 +478,7 @@ CreateInitDecodingContext(const char *plugin,
* xl_routine
* XLogReaderRoutine used by underlying xlogreader
*
- * prepare_write, do_write, update_progress
+ * prepare_write, do_write, update_progress, delay_send
* callbacks that have to be filled to perform the use-case dependent,
* actual work.
*
@@ -493,7 +496,8 @@ CreateDecodingContext(XLogRecPtr start_lsn,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress)
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send)
{
LogicalDecodingContext *ctx;
ReplicationSlot *slot;
@@ -547,7 +551,7 @@ CreateDecodingContext(XLogRecPtr start_lsn,
ctx = StartupDecodingContext(output_plugin_options,
start_lsn, InvalidTransactionId, false,
fast_forward, xl_routine, prepare_write,
- do_write, update_progress);
+ do_write, update_progress, delay_send);
/* call output plugin initialization callback */
old_context = MemoryContextSwitchTo(ctx->context);
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index fa1b641a2b..960025197f 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -212,7 +212,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
.segment_open = wal_segment_open,
.segment_close = wal_segment_close),
LogicalOutputPrepareWrite,
- LogicalOutputWrite, NULL);
+ LogicalOutputWrite, NULL, NULL);
/*
* After the sanity checks in CreateDecodingContext, make sure the
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index cfb2ab6248..e68902ae34 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3898,7 +3898,8 @@ maybe_reread_subscription(void)
newsub->stream != MySubscription->stream ||
strcmp(newsub->origin, MySubscription->origin) != 0 ||
newsub->owner != MySubscription->owner ||
- !equal(newsub->publications, MySubscription->publications))
+ !equal(newsub->publications, MySubscription->publications) ||
+ newsub->minsenddelay != MySubscription->minsenddelay)
{
if (am_parallel_apply_worker())
ereport(LOG,
@@ -4617,9 +4618,18 @@ ApplyWorkerMain(Datum main_arg)
options.proto.logical.twophase = false;
options.proto.logical.origin = pstrdup(MySubscription->origin);
+ options.proto.logical.min_send_delay = 0;
if (!am_tablesync_worker())
{
+ /*
+ * Time-delayed logical replication does not support tablesync
+ * workers, so only the leader apply worker can request walsenders to
+ * delay on the publisher side.
+ */
+ if (server_version >= 160000 && MySubscription->minsenddelay > 0)
+ options.proto.logical.min_send_delay = MySubscription->minsenddelay;
+
/*
* Even when the two_phase mode is requested by the user, it remains
* as the tri-state PENDING until all tablesyncs have reached READY
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 98377c094b..df6a87b0ba 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -285,6 +285,7 @@ parse_output_parameters(List *options, PGOutputData *data)
bool streaming_given = false;
bool two_phase_option_given = false;
bool origin_option_given = false;
+ bool min_send_delay_option_given = false;
data->binary = false;
data->streaming = LOGICALREP_STREAM_OFF;
@@ -396,6 +397,32 @@ parse_output_parameters(List *options, PGOutputData *data)
errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("unrecognized origin value: \"%s\"", data->origin));
}
+ else if (strcmp(defel->defname, "min_send_delay") == 0)
+ {
+ unsigned long delay_val;
+ char *endptr;
+
+ if (min_send_delay_option_given)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("conflicting or redundant options"));
+ min_send_delay_option_given = true;
+
+ errno = 0;
+ delay_val = strtoul(strVal(defel->arg), &endptr, 10);
+ if (errno != 0 || *endptr != '\0')
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid min_send_delay")));
+
+ if (delay_val > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("min_send_delay \"%s\" out of range",
+ strVal(defel->arg))));
+
+ data->min_send_delay = (int32) delay_val;
+ }
else
elog(ERROR, "unrecognized pgoutput option: %s", defel->defname);
}
@@ -501,6 +528,15 @@ pgoutput_startup(LogicalDecodingContext *ctx, OutputPluginOptions *opt,
else
ctx->twophase_opt_given = true;
+ if (data->min_send_delay &&
+ data->protocol_version < LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("requested proto_version=%d does not support delay sending data, need %d or higher",
+ data->protocol_version, LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM)));
+ else
+ ctx->min_send_delay = data->min_send_delay;
+
/* Init publication state. */
data->publications = NIL;
publications_valid = false;
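
On the publisher side the option arrives as a plain decimal string of milliseconds, so the plugin parses it with strtoul and then range-checks it against int32. The following is a minimal standalone sketch of that kind of strict parsing, not the patch's code. The function name parse_delay_ms and its "-1 means invalid" return convention are assumptions made for illustration, and the sketch additionally rejects an empty string, which the hunk above does not do.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse 'str' as a non-negative delay in milliseconds that fits in int32.
 * Returns the value, or -1 if the string is not a valid delay.
 */
static int64_t
parse_delay_ms(const char *str)
{
	unsigned long val;
	char	   *endptr;

	assert(str != NULL);

	errno = 0;
	val = strtoul(str, &endptr, 10);

	/* Reject conversion errors, empty input, and trailing garbage. */
	if (errno != 0 || endptr == str || *endptr != '\0')
		return -1;

	/* The value must fit the int32 field it will be stored in. */
	if (val > INT32_MAX)
		return -1;

	return (int64_t) val;
}
```

Note that strtoul accepts a leading minus sign and wraps the result, so an input such as "-5" is caught by the range check rather than by the syntax check.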
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 2f3c964824..522f7600a1 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -148,7 +148,7 @@ create_logical_replication_slot(char *name, char *plugin,
XL_ROUTINE(.page_read = read_local_xlog_page,
.segment_open = wal_segment_open,
.segment_close = wal_segment_close),
- NULL, NULL, NULL);
+ NULL, NULL, NULL, NULL);
/*
* If caller needs us to determine the decoding start point, do so now.
@@ -481,7 +481,7 @@ pg_logical_replication_slot_advance(XLogRecPtr moveto)
XL_ROUTINE(.page_read = read_local_xlog_page,
.segment_open = wal_segment_open,
.segment_close = wal_segment_close),
- NULL, NULL, NULL);
+ NULL, NULL, NULL, NULL);
/*
* Start reading at the slot's restart_lsn, which we know to point to
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 75e8363e24..8cefd8cd0a 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -252,6 +252,7 @@ static void WalSndPrepareWrite(LogicalDecodingContext *ctx, XLogRecPtr lsn, Tran
static void WalSndWriteData(LogicalDecodingContext *ctx, XLogRecPtr lsn, TransactionId xid, bool last_write);
static void WalSndUpdateProgress(LogicalDecodingContext *ctx, XLogRecPtr lsn, TransactionId xid,
bool skipped_xact);
+static void WalSndDelay(LogicalDecodingContext *ctx, TransactionId xid, TimestampTz delay_start);
static XLogRecPtr WalSndWaitForWal(XLogRecPtr loc);
static void LagTrackerWrite(XLogRecPtr lsn, TimestampTz local_flush_time);
static TimeOffset LagTrackerRead(int head, XLogRecPtr lsn, TimestampTz now);
@@ -1126,7 +1127,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
.segment_open = WalSndSegmentOpen,
.segment_close = wal_segment_close),
WalSndPrepareWrite, WalSndWriteData,
- WalSndUpdateProgress);
+ WalSndUpdateProgress, WalSndDelay);
/*
* Signal that we don't need the timeout mechanism. We're just
@@ -1285,7 +1286,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
.segment_open = WalSndSegmentOpen,
.segment_close = wal_segment_close),
WalSndPrepareWrite, WalSndWriteData,
- WalSndUpdateProgress);
+ WalSndUpdateProgress, WalSndDelay);
xlogreader = logical_decoding_ctx->reader;
WalSndSetState(WALSNDSTATE_CATCHUP);
@@ -3849,3 +3850,75 @@ LagTrackerRead(int head, XLogRecPtr lsn, TimestampTz now)
Assert(time != 0);
return now - time;
}
+
+/*
+ * LogicalDecodingContext 'delay_send' callback.
+ *
+ * Wait long enough to make sure a transaction is sent at least
+ * ctx->min_send_delay milliseconds after delay_start.
+ */
+static void
+WalSndDelay(LogicalDecodingContext *ctx, TransactionId xid, TimestampTz delay_start)
+{
+ /* Wait until delayUntil using the latch mechanism */
+ while (true)
+ {
+ TimestampTz delayUntil;
+ long remaining_wait_time_ms;
+ long timeout_sleeptime_ms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* This might change wal_sender_timeout */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /* Check for input from the client */
+ ProcessRepliesIfAny();
+
+ /* die if timeout was reached */
+ WalSndCheckTimeOut();
+
+ /* Send keepalive if the time has come */
+ WalSndKeepaliveIfNecessary();
+
+ /* Try to flush pending output to the client */
+ if (pq_flush_if_writable() != 0)
+ WalSndShutdown();
+
+ /*
+ * If we've been requested to shut down, exit the process.
+ *
+ * Note that WalSndDone() cannot be used here because it would send
+ * the changes that are being delayed.
+ */
+ if (got_STOPPING)
+ WalSndShutdown();
+
+ delayUntil = TimestampTzPlusMilliseconds(delay_start, ctx->min_send_delay);
+ remaining_wait_time_ms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delayUntil);
+
+ /*
+ * Exit without arming the latch if it's already past time to send
+ * this transaction.
+ */
+ if (remaining_wait_time_ms <= 0)
+ break;
+
+ /* Compute how long we may sleep before a keepalive or timeout is due. */
+ timeout_sleeptime_ms = WalSndComputeSleeptime(GetCurrentTimestamp());
+
+ elog(DEBUG2, "time-delayed replication for txid %u, delay_time = %d ms, remaining wait time: %ld ms",
+ xid, (int) ctx->min_send_delay, remaining_wait_time_ms);
+
+ /* Sleep until we get a reply from the worker or we time out */
+ WalSndWait(WL_SOCKET_READABLE,
+ Min(timeout_sleeptime_ms, remaining_wait_time_ms),
+ WAIT_EVENT_WALSENDER_SEND_DELAY);
+ }
+}
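
The timing arithmetic inside the WalSndDelay loop can be isolated as two small pure functions: how much of the delay is left, and how long the next sleep may be. The following is a minimal standalone sketch under stated assumptions, not the patch's code. Timestamps are modelled as plain microsecond counters, and the names remaining_delay_ms and next_sleep_ms are hypothetical. Rounding the remainder up to a whole millisecond is a choice made in this sketch so that the loop never wakes early.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Milliseconds still to wait before the transaction may be sent, given the
 * time the delay started, the configured delay, and the current time.
 * Returns 0 once the delay has elapsed.
 */
static int64_t
remaining_delay_ms(int64_t start_us, int32_t min_send_delay_ms, int64_t now_us)
{
	int64_t		until_us;

	assert(min_send_delay_ms >= 0);

	until_us = start_us + (int64_t) min_send_delay_ms * 1000;
	if (now_us >= until_us)
		return 0;

	/* Round up so a partial millisecond still counts as time to wait. */
	return (until_us - now_us + 999) / 1000;
}

/*
 * Length of the next sleep: never longer than the remaining delay, and
 * never longer than the time until a keepalive or timeout check is due.
 */
static int64_t
next_sleep_ms(int64_t remaining_ms, int64_t timeout_sleep_ms)
{
	return remaining_ms < timeout_sleep_ms ? remaining_ms : timeout_sleep_ms;
}
```

Sleeping for the smaller of the two values is what lets the walsender keep answering replies and sending keepalives while a long delay such as '8h' is pending.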
diff --git a/src/backend/utils/activity/wait_event.c b/src/backend/utils/activity/wait_event.c
index cb99cc6339..76c19fe11d 100644
--- a/src/backend/utils/activity/wait_event.c
+++ b/src/backend/utils/activity/wait_event.c
@@ -515,6 +515,9 @@ pgstat_get_wait_timeout(WaitEventTimeout w)
case WAIT_EVENT_VACUUM_TRUNCATE:
event_name = "VacuumTruncate";
break;
+ case WAIT_EVENT_WALSENDER_SEND_DELAY:
+ event_name = "WalSenderSendDelay";
+ break;
/* no default case, so that compiler will warn */
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 1a06eeaf6a..9754487921 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4494,6 +4494,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminsenddelay;
int i,
ntups;
@@ -4546,9 +4547,13 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminsenddelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n"
+ " 0 AS subminsenddelay\n",
+ LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4576,6 +4581,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminsenddelay = PQfnumber(res, "subminsenddelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4606,6 +4612,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminsenddelay =
+ atoi(PQgetvalue(res, i, i_subminsenddelay));
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4687,6 +4695,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminsenddelay > 0)
+ appendPQExpBuffer(query, ", min_send_delay = '%d ms'", subinfo->subminsenddelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index cdca0b993d..4c55f8efc4 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -660,6 +660,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int subminsenddelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..c7d303a168 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_send_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminsenddelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min send delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e1882eaea..6643db6f55 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_send_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_send_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..69ae4314b4 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -74,6 +74,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
+ int32 subminsenddelay; /* Replication send delay (ms) */
+
bool subenabled; /* True if the subscription is enabled (the
* worker should be running) */
@@ -122,6 +124,7 @@ typedef struct Subscription
* skipped */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
+ int32 minsenddelay; /* Replication send delay (ms) */
bool enabled; /* Indicates if the subscription is enabled */
bool binary; /* Indicates if the subscription wants data in
* binary format */
diff --git a/src/include/replication/logical.h b/src/include/replication/logical.h
index 5f49554ea0..c389dc17e5 100644
--- a/src/include/replication/logical.h
+++ b/src/include/replication/logical.h
@@ -30,6 +30,11 @@ typedef void (*LogicalOutputPluginWriterUpdateProgress) (struct LogicalDecodingC
bool skipped_xact
);
+typedef void (*LogicalOutputPluginWriterDelay) (struct LogicalDecodingContext *lr,
+ TransactionId xid,
+ TimestampTz start_time
+);
+
typedef struct LogicalDecodingContext
{
/* memory context this is all allocated in */
@@ -64,6 +69,7 @@ typedef struct LogicalDecodingContext
LogicalOutputPluginWriterPrepareWrite prepare_write;
LogicalOutputPluginWriterWrite write;
LogicalOutputPluginWriterUpdateProgress update_progress;
+ LogicalOutputPluginWriterDelay delay_send;
/*
* Output buffer.
@@ -100,6 +106,12 @@ typedef struct LogicalDecodingContext
*/
bool twophase_opt_given;
+ /*
+ * The minimum time, in milliseconds, that the publisher waits before
+ * sending a transaction's changes.
+ */
+ int32 min_send_delay;
+
/*
* State for writing output.
*/
@@ -121,14 +133,16 @@ extern LogicalDecodingContext *CreateInitDecodingContext(const char *plugin,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress);
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send);
extern LogicalDecodingContext *CreateDecodingContext(XLogRecPtr start_lsn,
List *output_plugin_options,
bool fast_forward,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress);
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send);
extern void DecodingContextFindStartpoint(LogicalDecodingContext *ctx);
extern bool DecodingContextReady(LogicalDecodingContext *ctx);
extern void FreeDecodingContext(LogicalDecodingContext *ctx);
diff --git a/src/include/replication/logicalproto.h b/src/include/replication/logicalproto.h
index 0ea2df5088..46faadbd7a 100644
--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -36,13 +36,17 @@
* LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM is the minimum protocol version
* where we support applying large streaming transactions in parallel.
* Introduced in PG16.
+ *
+ * LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM is the minimum protocol version
+ * that supports delaying the sending of transactions. Introduced in PG16.
*/
#define LOGICALREP_PROTO_MIN_VERSION_NUM 1
#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
#define LOGICALREP_PROTO_TWOPHASE_VERSION_NUM 3
#define LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM 4
-#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM
+#define LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM 4
+#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM
/*
* Logical message types
diff --git a/src/include/replication/pgoutput.h b/src/include/replication/pgoutput.h
index b4a8015403..d2fde09e00 100644
--- a/src/include/replication/pgoutput.h
+++ b/src/include/replication/pgoutput.h
@@ -30,6 +30,7 @@ typedef struct PGOutputData
bool messages;
bool two_phase;
char *origin;
+ int32 min_send_delay;
} PGOutputData;
#endif /* PGOUTPUT_H */
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index decffe352d..c20969aed7 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -187,6 +187,7 @@ typedef struct
* prepare time */
char *origin; /* Only publish data originating from the
* specified origin */
+ int32 min_send_delay; /* The minimum send delay (ms) */
} logical;
} proto;
} WalRcvStreamOptions;
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index 9ab23e1c4a..cc3a234eba 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -150,7 +150,8 @@ typedef enum
WAIT_EVENT_REGISTER_SYNC_REQUEST,
WAIT_EVENT_SPIN_DELAY,
WAIT_EVENT_VACUUM_DELAY,
- WAIT_EVENT_VACUUM_TRUNCATE
+ WAIT_EVENT_VACUUM_TRUNCATE,
+ WAIT_EVENT_WALSENDER_SEND_DELAY
} WaitEventTimeout;
/* ----------
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 3f99b14394..2027316233 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -143,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -163,10 +163,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -175,10 +175,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -210,10 +210,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -247,19 +247,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -271,27 +271,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -306,10 +306,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -324,10 +324,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -363,10 +363,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -375,10 +375,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,10 +388,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -404,20 +404,61 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_send_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = foo);
+ERROR: invalid value for parameter "min_send_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_send_delay" (0 .. 2147483647)
+-- fail - specifying streaming = parallel with time-delayed replication is not
+-- supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_send_delay = 123);
+ERROR: min_send_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_send_delay value without units is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexit | 0/0
+(1 row)
+
+-- success -- min_send_delay value with units other than ms is converted to ms
+-- and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexit | 0/0
(1 row)
+-- fail - alter subscription with streaming = parallel should fail when
+-- time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot set parallel streaming mode for subscription with min_send_delay
+-- fail - alter subscription with min_send_delay should fail when
+-- streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = 123);
+ERROR: cannot set min_send_delay for subscription in parallel streaming mode
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7281f5fee2..46bf4a27d9 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -286,6 +286,35 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_send_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = -1);
+
+-- fail - specifying streaming = parallel with time-delayed replication is not
+-- supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_send_delay = 123);
+
+-- success -- min_send_delay value without units is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = 123);
+\dRs+
+
+-- success -- min_send_delay value with units other than ms is converted to ms
+-- and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = '1 d');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when
+-- time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+
+-- fail - alter subscription with min_send_delay should fail when
+-- streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = 123);
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index 91aa068c95..063a98fde9 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -515,6 +515,33 @@ $node_publisher->poll_query_until('postgres',
or die
"Timed out while waiting for apply to restart after renaming SUBSCRIPTION";
+# Test time-delayed logical replication
+#
+# If the subscription sets min_send_delay parameter, the walsender will delay
+# the transaction send for min_send_delay milliseconds. We verify this by
+# looking at the time difference between a) when tuples are inserted on the
+# publisher, and b) when those changes are replicated on the subscriber. Even
+# on slow machines, this strategy will give predictable behavior.
+
+# Set min_send_delay parameter to 3 seconds
+my $delay = 3;
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_renamed SET (min_send_delay = '${delay}s')");
+
+# Before doing the insertion, get the current timestamp that will be
+# used as a comparison base.
+my $publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_ins VALUES (generate_series(1101, 1120))");
+
+# The publisher waits for the replication to complete
+$node_publisher->wait_for_catchup('tap_sub_renamed');
+
+# This test is successful only if at least the configured delay has elapsed.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for non-streaming transaction"
+);
+
# check all the cleanup
$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_renamed");
--
2.27.0
On Thu, Feb 23, 2023 at 9:10 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Dear Shi,
Thank you for reviewing! PSA new version.
+ elog(DEBUG2, "time-delayed replication for txid %u, delay_time = %d ms, remaining wait time: %ld ms",
+ ctx->write_xid, (int) ctx->min_send_delay,
+ remaining_wait_time_ms);
I tried and saw that the xid here looks wrong, what it got is the xid of the
previous transaction. It seems `ctx->write_xid` has not been updated and we
can't use it.
Good catch. There are several approaches to fix that; I chose the simplest one.
A TransactionId was added as an argument to the functions.
Thank you for updating the patch. Here are some comments on v7 patch:
+ *
+ * LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM is the minimum protocol version
+ * with support for delaying to send transactions. Introduced in PG16.
*/
#define LOGICALREP_PROTO_MIN_VERSION_NUM 1
#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
#define LOGICALREP_PROTO_TWOPHASE_VERSION_NUM 3
#define LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM 4
-#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM
+#define LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM 4
+#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM
What is the use case of the old macro,
LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM, after adding
LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM? If we go this way, we will
end up adding a macro every time we add a new option, which does not seem
like a good idea. I'm really not sure we need to change the protocol
version or the macro. Commit 366283961ac0ed6d89014444c6090f3fd02fce0a
introduced the 'origin' subscription parameter, which is also sent to the
publisher, but we didn't touch the protocol version at all.
---
Why do we not delay sending COMMIT PREPARED messages?
---
+ /*
+ * If we've requested to shut down, exit the process.
+ *
+ * Note that WalSndDone() cannot be used here because the delaying
+ * changes will be sent in the function.
+ */
+ if (got_STOPPING)
+ WalSndShutdown();
Since the walsender exits without sending the done message at a server
shutdown, we get the following log message on the subscriber:
ERROR: could not receive data from WAL stream: server closed the connection unexpectedly
I think that since the walsender is just waiting for sending data, it
can send the done message if the socket is writable.
---
+ delayUntil = TimestampTzPlusMilliseconds(delay_start, ctx->min_send_delay);
+ remaining_wait_time_ms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delayUntil);
+
(snip)
+
+ /* Sleep until appropriate time. */
+ timeout_sleeptime_ms = WalSndComputeSleeptime(GetCurrentTimestamp());
I think it's better to call GetCurrentTimestamp() only once.
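To illustrate the point about reading the clock once, here is a minimal standalone C sketch. It is not the patch's code: `timestamp_us`, `remaining_wait_ms()` and `sleep_time_ms()` are hypothetical stand-ins that use plain microsecond counters instead of TimestampTz, and the keepalive deadline parameter is an assumption made for the example.

```c
#include <stdint.h>

/* Hypothetical stand-in for a timestamp type: microseconds since some epoch. */
typedef int64_t timestamp_us;

/*
 * Milliseconds left until delay_start + min_send_delay_ms, measured against
 * a single clock reading 'now'.  Never negative; rounds up so the caller
 * does not wake slightly early.
 */
static long
remaining_wait_ms(timestamp_us delay_start, int min_send_delay_ms,
                  timestamp_us now)
{
    timestamp_us delay_until = delay_start + (timestamp_us) min_send_delay_ms * 1000;
    timestamp_us diff = delay_until - now;

    if (diff <= 0)
        return 0;
    return (long) ((diff + 999) / 1000);
}

/*
 * Sleep time for one loop iteration: the smaller of the remaining delay and
 * the time until the next keepalive is due.  Both values are derived from
 * the same 'now', so they cannot disagree because of clock drift between
 * two separate readings.
 */
static long
sleep_time_ms(timestamp_us delay_start, int min_send_delay_ms,
              timestamp_us keepalive_due, timestamp_us now)
{
    long remaining = remaining_wait_ms(delay_start, min_send_delay_ms, now);
    long until_keepalive = remaining_wait_ms(keepalive_due, 0, now);

    return remaining < until_keepalive ? remaining : until_keepalive;
}
```

In a real loop the caller would take one clock reading per iteration and pass it to both computations, instead of calling the clock function once for each.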
---
+# This test is successful only if at least the configured delay has elapsed.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for non-streaming transaction"
+);
The subscriber doesn't actually apply WAL records, but logically
replicated changes. How about "subscriber applies changes only
after..."?
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Thu, Feb 23, 2023 at 5:40 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Thank you for reviewing! PSA new version.
I was trying to think if there is any better way to implement the
newly added callback (WalSndDelay()) but couldn't find any. For
example, one idea I tried to evaluate is whether we can merge it with
the existing callback WalSndUpdateProgress() or maybe extract the part
other than progress tracking from that function into a new callback
and then try to reuse it here as well. Although there is some common
functionality, such as checking for timeout and processing replies, they
are different enough that they seem to need separate callbacks.
The prime purpose of a callback for the patch being discussed here is
to delay the xact before sending the commit/prepare whereas the
existing callback (WalSndUpdateProgress()) or what we are discussing
at [1] allows sending the keepalive message in some special cases
where there is no communication between walsender and walreceiver.
Now, WalSndDelay() also checks for timeout and sends a keepalive if
necessary, but it additionally depends on the delay parameter, so I don't
think it is a good idea to try to combine those functionalities into one
API.
Thoughts?
[1]: /messages/by-id/20230210210423.r26ndnfmuifie4f6@awork3.anarazel.de
--
With Regards,
Amit Kapila.
On Mon, Feb 27, 2023 at 11:11 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Feb 23, 2023 at 9:10 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Thank you for reviewing! PSA new version.
Thank you for updating the patch. Here are some comments on v7 patch:
+ *
+ * LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM is the minimum protocol version
+ * with support for delaying to send transactions. Introduced in PG16.
*/
#define LOGICALREP_PROTO_MIN_VERSION_NUM 1
#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
#define LOGICALREP_PROTO_TWOPHASE_VERSION_NUM 3
#define LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM 4
-#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM
+#define LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM 4
+#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM
What is the use case of the old macro,
LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM, after adding
LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM? If we go this way, we will
end up adding a macro every time we add a new option, which does not seem
like a good idea. I'm really not sure we need to change the protocol
version or the macro. Commit 366283961ac0ed6d89014444c6090f3fd02fce0a
introduced the 'origin' subscription parameter, which is also sent to the
publisher, but we didn't touch the protocol version at all.
Right, I also don't see a reason to do anything for this. We have
previously bumped the protocol version when we send extra/additional
information from walsender but here that is not the requirement, so
this change doesn't seem to be required.
---
Why do we not delay sending COMMIT PREPARED messages?
I think we need to either add delay for prepare or commit prepared as
otherwise, it will lead to delaying the xact more than required. The
patch seems to add a delay before sending a PREPARE as that is the
time when the subscriber will apply the changes.
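The reasoning above can be sketched in a few lines of standalone C. This is only an illustration under stated assumptions, not code from the patch: the `xact_end_msg` enum and the functions `should_delay()` and `total_delay_ms()` are hypothetical names. The sketch assumes the delay is applied before COMMIT and before PREPARE, and not again before COMMIT PREPARED.

```c
#include <stdbool.h>

/* Hypothetical message kinds that can end (or finish) a transaction. */
typedef enum
{
    MSG_COMMIT,
    MSG_PREPARE,
    MSG_COMMIT_PREPARED
} xact_end_msg;

/*
 * Assumed policy: wait before COMMIT and before PREPARE, because that is
 * when the subscriber applies the changes.  COMMIT PREPARED is sent
 * without a further wait, otherwise a two-phase transaction would be
 * delayed twice.
 */
static bool
should_delay(xact_end_msg msg)
{
    switch (msg)
    {
        case MSG_COMMIT:
        case MSG_PREPARE:
            return true;
        case MSG_COMMIT_PREPARED:
            return false;
    }
    return false;
}

/* Total delay added to a transaction that emits the given messages. */
static long
total_delay_ms(const xact_end_msg *msgs, int nmsgs, long min_send_delay_ms)
{
    long total = 0;

    for (int i = 0; i < nmsgs; i++)
    {
        if (should_delay(msgs[i]))
            total += min_send_delay_ms;
    }
    return total;
}
```

Under this policy a two-phase transaction (PREPARE followed by COMMIT PREPARED) and an ordinary transaction (COMMIT) both see the configured delay exactly once.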
--
With Regards,
Amit Kapila.
Dear Amit,
I was trying to think if there is any better way to implement the
newly added callback (WalSndDelay()) but couldn't find any. For
example, one idea I tried to evaluate is whether we can merge it with
the existing callback WalSndUpdateProgress() or maybe extract the part
other than progress tracking from that function into a new callback
and then try to reuse it here as well. Although there is some common
functionality, such as checking for timeout and processing replies, they
are different enough that they seem to need separate callbacks.
The prime purpose of a callback for the patch being discussed here is
to delay the xact before sending the commit/prepare whereas the
existing callback (WalSndUpdateProgress()) or what we are discussing
at [1] allows sending the keepalive message in some special cases
where there is no communication between walsender and walreceiver.
Now, WalSndDelay() also tries to check for timeout and send
keepalive if necessary, but it also depends on the delay
parameter, so I don't think it is a good idea to try to combine those
functionalities into one API.
Thoughts?
[1] - /messages/by-id/20230210210423.r26ndnfmuifie4f6@awork3.anarazel.de
Thank you for confirming. My understanding is also that we should keep the current design.
I agree with your post.
In the current callback and in the modified version in [1], keepalives are sent
via ProcessPendingWrites(). It is called by many functions and should not be changed
just for our purposes, for example by adding an end_time argument. Moreover, the name is not suitable because
time-delayed logical replication does not wait until the send buffer becomes empty.
If we restructure WalSndUpdateProgress() and change the mechanisms around it,
the code will become messy. As Amit said, in one path the lag will be tracked and
the walsender will wait until the buffer is empty.
In the other path, the lag calculation will be skipped and the walsender will wait
until the given period has elapsed. Such a function would be painful to read later.
I think callbacks that have different purposes should not be mixed.
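
To illustrate the point about keeping the two callbacks separate, here is a minimal C sketch. The struct and function names (DecodingCtx, on_update_progress, on_delay_send, notify_commit) are hypothetical stand-ins and not PostgreSQL's actual definitions; only the guarded call of the delay callback follows what the posted patch does in DecodeCommit().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for LogicalDecodingContext. */
typedef struct DecodingCtx DecodingCtx;

typedef void (*UpdateProgressCB) (DecodingCtx *ctx, uint64_t lsn);
typedef void (*DelaySendCB) (DecodingCtx *ctx, int64_t commit_time_ms);

struct DecodingCtx
{
    int32_t          min_send_delay;    /* milliseconds; 0 means no delay */
    UpdateProgressCB update_progress;   /* progress tracking and keepalives */
    DelaySendCB      delay_send;        /* time-delayed sending; may be NULL */
    int              progress_calls;    /* counters used only by this demo */
    int              delay_calls;
};

/* Demo callback: only counts how often progress tracking was requested. */
static void
on_update_progress(DecodingCtx *ctx, uint64_t lsn)
{
    (void) lsn;
    ctx->progress_calls++;
}

/* Demo callback: only counts how often a delay was requested. */
static void
on_delay_send(DecodingCtx *ctx, int64_t commit_time_ms)
{
    (void) commit_time_ms;
    ctx->delay_calls++;
}

/* Each callback is invoked for its own purpose; neither calls the other. */
static void
notify_commit(DecodingCtx *ctx, uint64_t lsn, int64_t commit_time_ms)
{
    assert(ctx != NULL);

    /* Same guard shape as in the patch: delay only if requested and set. */
    if (ctx->min_send_delay > 0 && ctx->delay_send)
        ctx->delay_send(ctx, commit_time_ms);

    if (ctx->update_progress)
        ctx->update_progress(ctx, lsn);
}
```

With this shape, callers that have no delay requirement can simply pass NULL for the delay callback, which is what the patch does for pg_logical_slot_get_changes_guts().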
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
On Mon, Feb 27, 2023 at 3:34 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Feb 27, 2023 at 11:11 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Feb 23, 2023 at 9:10 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Thank you for reviewing! PSA new version.
Thank you for updating the patch. Here are some comments on v7 patch:
+ *
+ * LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM is the minimum protocol version
+ * with support for delaying to send transactions. Introduced in PG16.
  */
 #define LOGICALREP_PROTO_MIN_VERSION_NUM 1
 #define LOGICALREP_PROTO_VERSION_NUM 1
 #define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
 #define LOGICALREP_PROTO_TWOPHASE_VERSION_NUM 3
 #define LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM 4
-#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM
+#define LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM 4
+#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM
What is the usecase of the old macro,
LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM, after adding
LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM ? I think if we go this
way, we will end up adding macros every time when adding a new option,
which seems not a good idea. I'm really not sure we need to change the
protocol version or the macro. Commit
366283961ac0ed6d89014444c6090f3fd02fce0a introduced the 'origin'
subscription parameter that is also sent to the publisher, but we
didn't touch the protocol version at all.
Right, I also don't see a reason to do anything for this. We have
previously bumped the protocol version when we send extra/additional
information from walsender but here that is not the requirement, so
this change doesn't seem to be required.
---
Why do we not delay sending COMMIT PREPARED messages?
I think we need to either add delay for prepare or commit prepared as
otherwise, it will lead to delaying the xact more than required.
Agreed.
The
patch seems to add a delay before sending a PREPARE as that is the
time when the subscriber will apply the changes.
Considering the purpose of this feature mentioned in the commit
message, "particularly to fix errors that might cause data loss",
would delaying PREPARE really help in that situation? For
example, even after (mistakenly) executing PREPARE for a transaction
that runs DELETE without a WHERE clause on the publisher, the user can
still roll back the transaction. No data has been lost on either node yet.
After executing (and replicating) COMMIT PREPARED for that
transaction, the data is lost on both nodes. IIUC, time-delayed
logical replication should help in this situation by delaying
COMMIT PREPARED so that, for example, the user can stop logical
replication before the COMMIT PREPARED message arrives at the subscriber.
So I think we should delay sending COMMIT PREPARED (and COMMIT)
instead of PREPARE. This would help users to correct data loss errors,
and would be more consistent with what recovery_min_apply_delay does.
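
The proposal above can be sketched as a small predicate that decides at which message the walsender waits. This is only an illustration in C: the MsgKind enum and is_delay_point() are hypothetical names, not PostgreSQL's protocol constants, and messages other than the three discussed in this thread are left out on purpose.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical message kinds, limited to those discussed in the thread. */
typedef enum
{
    MSG_PREPARE,
    MSG_COMMIT,
    MSG_COMMIT_PREPARED
} MsgKind;

/*
 * Return true if the sender should wait for min_send_delay before sending
 * a message of this kind. Following the argument above, the wait happens
 * at COMMIT and COMMIT PREPARED, because only those make a mistake
 * permanent on the subscriber; a PREPARE can still be rolled back.
 */
static bool
is_delay_point(MsgKind kind, int min_send_delay_ms)
{
    assert(min_send_delay_ms >= 0);

    if (min_send_delay_ms == 0)
        return false;           /* feature disabled */

    return kind == MSG_COMMIT || kind == MSG_COMMIT_PREPARED;
}
```

Under this choice, a prepared transaction reaches the subscriber without delay, and the user has the full delay period to stop replication before the COMMIT PREPARED message is sent.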
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Dear Sawada-san, Amit,
Thank you for reviewing!
+ *
+ * LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM is the minimum protocol version
+ * with support for delaying to send transactions. Introduced in PG16.
  */
 #define LOGICALREP_PROTO_MIN_VERSION_NUM 1
 #define LOGICALREP_PROTO_VERSION_NUM 1
 #define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
 #define LOGICALREP_PROTO_TWOPHASE_VERSION_NUM 3
 #define LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM 4
-#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM
+#define LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM 4
+#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM
What is the usecase of the old macro,
LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM, after adding
LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM ? I think if we go this
way, we will end up adding macros every time when adding a new option,
which seems not a good idea. I'm really not sure we need to change the
protocol version or the macro. Commit
366283961ac0ed6d89014444c6090f3fd02fce0a introduced the 'origin'
subscription parameter that is also sent to the publisher, but we
didn't touch the protocol version at all.
I removed the protocol number.
I checked the previous discussion [1]. According to it, the protocol version must
be changed when a new message is added or existing messages are changed.
This patch intentionally makes walsenders delay sending data, and no
extra information is added when doing so. Therefore, I think a version bump is not needed.
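
As a rough illustration of passing the delay as an output plugin option gated on the publisher's version, instead of bumping the protocol version, here is a small C sketch. The helper name append_min_send_delay() and the plain character buffer are assumptions made for this example; the option text and the 160000 version check follow the libpqrcv_startstreaming() hunk in the attached patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Append ", min_send_delay '<ms>'" to an existing option list, but only
 * when a delay is requested and the publisher is version 16 or later.
 * Returns the resulting string length, or -1 if the buffer is too small.
 * (Hypothetical helper; the patch itself uses appendStringInfo().)
 */
static int
append_min_send_delay(char *buf, size_t buflen,
                      int min_send_delay_ms, int server_version_num)
{
    size_t      used;
    int         n;

    assert(buf != NULL);
    used = strlen(buf);
    assert(used < buflen);

    /* Nothing to send: older publishers never see the new option. */
    if (min_send_delay_ms <= 0 || server_version_num < 160000)
        return (int) used;

    n = snprintf(buf + used, buflen - used,
                 ", min_send_delay '%d'", min_send_delay_ms);
    if (n < 0 || (size_t) n >= buflen - used)
        return -1;

    return (int) (used + (size_t) n);
}
```

Because the option is simply omitted for older publishers, the message format on the wire stays the same, which matches the reasoning that no protocol version change is required.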
---
Why do we not delay sending COMMIT PREPARED messages?
This was motivated by the comment [2], but I prefer your opinion [3].
Now the COMMIT PREPARED message is delayed instead of the PREPARE message.
---
+ /*
+ * If we've requested to shut down, exit the process.
+ *
+ * Note that WalSndDone() cannot be used here because the delaying
+ * changes will be sent in the function.
+ */
+ if (got_STOPPING)
+ WalSndShutdown();
Since the walsender exits without sending the done message at a server
shutdown, we get the following log message on the subscriber:
ERROR: could not receive data from WAL stream: server closed the
connection unexpectedly
I think that since the walsender is just waiting for sending data, it
can send the done message if the socket is writable.
You are right. I was confused by the previous implementation, in which workers could not
accept any messages. I have made walsenders send the end-command message directly.
Is this what you expected?
---
+ delayUntil = TimestampTzPlusMilliseconds(delay_start, ctx->min_send_delay);
+ remaining_wait_time_ms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delayUntil);
+ (snip)
+
+ /* Sleep until appropriate time. */
+ timeout_sleeptime_ms = WalSndComputeSleeptime(GetCurrentTimestamp());
I think it's better to call GetCurrentTimestamp() only once.
Right, fixed.
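
A minimal sketch of the "read the clock once" idea, in plain C. Timestamps are plain milliseconds here instead of PostgreSQL's TimestampTz, and both function names as well as the keepalive deadline parameter are assumptions made for illustration; the actual patch uses TimestampTzPlusMilliseconds(), TimestampDifferenceMilliseconds() and WalSndComputeSleeptime().

```c
#include <assert.h>
#include <stdint.h>

/* Milliseconds left until delay_start + min_send_delay; 0 if already past. */
static int64_t
remaining_delay_ms(int64_t now_ms, int64_t delay_start_ms,
                   int32_t min_send_delay_ms)
{
    int64_t     delay_until_ms;

    assert(min_send_delay_ms >= 0);
    delay_until_ms = delay_start_ms + min_send_delay_ms;

    return (delay_until_ms > now_ms) ? delay_until_ms - now_ms : 0;
}

/*
 * Sleep time for one loop iteration: never longer than the remaining delay
 * and never past the next keepalive deadline. The caller reads the clock
 * once and passes the same now_ms to both computations, so the two results
 * cannot disagree about the current time.
 */
static int64_t
sleep_time_ms(int64_t now_ms, int64_t delay_start_ms,
              int32_t min_send_delay_ms, int64_t keepalive_due_ms)
{
    int64_t     remaining = remaining_delay_ms(now_ms, delay_start_ms,
                                               min_send_delay_ms);
    int64_t     until_keepalive = (keepalive_due_ms > now_ms) ?
        keepalive_due_ms - now_ms : 0;

    return (remaining < until_keepalive) ? remaining : until_keepalive;
}
```

Besides saving a system call, using a single timestamp keeps the remaining wait time and the sleep time consistent with each other within one iteration.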
---
+# This test is successful only if at least the configured delay has elapsed.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for non-streaming transaction"
+);
The subscriber doesn't actually apply WAL records, but logically
replicated changes. How about "subscriber applies changes only
after..."?
I grepped the other tests and could not find the same usage of the word "WAL",
so I fixed it as you suggested.
In the next version I will use a grammar checker such as ChatGPT to polish the commit messages.
[1]: /messages/by-id/CAA4eK1LjOm6-OHggYVH35dQ_v40jOXrJW0GFy3kuwTd2J48=Ug@mail.gmail.com
[2]: /messages/by-id/CAA4eK1K4uPbudrNdH+=_vN-Hpe9wYh=3vBS5Ww9dHn-LOWMV0g@mail.gmail.com
[3]: /messages/by-id/CAD21AoA0mPq_m6USfAC8DAkvFfwjqGvGq++Uv=avryYotvq98A@mail.gmail.com
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Attachments:
v8-0001-Time-delayed-logical-replication-on-publisher-sid.patch (application/octet-stream)
From 820c256832bfae18ae59908f53745a9d6b4c346f Mon Sep 17 00:00:00 2001
From: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Date: Thu, 16 Feb 2023 07:52:23 +0000
Subject: [PATCH v8] Time-delayed logical replication on publisher side
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_send_delay'.
If the subscription sets the min_send_delay parameter, the apply worker
(via walrcv_startstreaming) passes the value to the publisher as an output plugin
option. The walsender will delay sending the transaction by the given number of milliseconds.
The delay does not take into account the overhead of time spent in transferring
the transaction, which means that the arrival time at the subscriber may be
delayed more than the given time.
The delay occurs before we start to send the transaction on the publisher.
Regular and prepared transactions are covered. Streamed transactions are also
covered.
The combination of parallel streaming mode and min_send_delay is not allowed, for two reasons:
1. This is because in parallel streaming mode, we start applying the transaction
stream as soon as the first change arrives without knowing the transaction's
prepare/commit time. Always waiting for the full 'min_send_delay' period might
include unnecessary delay.
2. Another reason is that for parallel streaming, the transaction will be opened
immediately by the parallel apply worker. So if the walsender is delayed in
sending the final record of the transaction, the parallel apply worker must wait
to receive it with an open transaction. This would result in the locks acquired
during the transaction not being released until the min_send_delay has elapsed.
Earlier versions were written by Euler Taveira, Takamichi Osumi, and Kuroda Hayato
Author: Kuroda Hayato, Takamichi Osumi
Reviewed-by: Amit Kapila, Peter Smith, Vignesh C, Shveta Malik,
Kyotaro Horiguchi, Shi Yu, Wang Wei, Dilip Kumar, Melih Mutlu,
Andres Freund
---
doc/src/sgml/catalogs.sgml | 10 +
doc/src/sgml/glossary.sgml | 15 ++
doc/src/sgml/logical-replication.sgml | 6 +
doc/src/sgml/monitoring.sgml | 5 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 43 +++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 119 ++++++++++-
.../libpqwalreceiver/libpqwalreceiver.c | 5 +
src/backend/replication/logical/decode.c | 9 +
src/backend/replication/logical/logical.c | 18 +-
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/worker.c | 12 +-
src/backend/replication/pgoutput/pgoutput.c | 30 +++
src/backend/replication/slotfuncs.c | 4 +-
src/backend/replication/walsender.c | 88 ++++++++-
src/backend/utils/activity/wait_event.c | 3 +
src/bin/pg_dump/pg_dump.c | 15 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/replication/logical.h | 18 +-
src/include/replication/pgoutput.h | 1 +
src/include/replication/walreceiver.h | 1 +
src/include/utils/wait_event.h | 3 +-
src/test/regress/expected/subscription.out | 185 +++++++++++-------
src/test/regress/sql/subscription.sql | 29 +++
src/test/subscription/t/001_rep_changes.pl | 27 +++
30 files changed, 574 insertions(+), 104 deletions(-)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..ce6521fe6c 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7892,6 +7892,16 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminsenddelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum delay, in milliseconds, by the publisher before sending all
+ the changes
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subenabled</structfield> <type>bool</type>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 7c01a541fe..9ede9d05f6 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1729,6 +1729,21 @@
</glossdef>
</glossentry>
+ <glossentry id="glossary-time-delayed-replication">
+ <glossterm>Time-delayed replication</glossterm>
+ <glossdef>
+ <para>
+ Replication setup that delays the application of changes by a specified
+ minimum time-delay period.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="guc-recovery-min-apply-delay"/> for physical replication
+ and <xref linkend="sql-createsubscription"/> for logical replication.
+ </para>
+ </glossdef>
+ </glossentry>
+
<glossentry id="glossary-toast">
<glossterm>TOAST</glossterm>
<glossdef>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 1bd5660c87..fe9e7f7b26 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,12 @@
target table.
</para>
+ <para>
+ A subscription can delay the receipt of changes by specifying the
+ <literal>min_send_delay</literal> subscription parameter. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index b0b997f092..6158587644 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -2349,6 +2349,11 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
<entry>Waiting to acquire an exclusive lock to truncate off any
empty pages at the end of a table vacuumed.</entry>
</row>
+ <row>
+ <entry><literal>WalSenderSendDelay</literal></entry>
+ <entry>Waiting while sending changes for time-delayed logical replication
+ in the WAL sender process.</entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 964fcbb8ff..3f238b958b 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_send_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 51c45f17c7..9d08740ba2 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,43 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_send_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the publisher sends changes as soon as possible. This
+ parameter allows the user to delay changes by the given time period.
+ If the value is specified without units, it is taken as milliseconds.
+ The default is zero (no delay). See <xref linkend="config-setting-names-values"/>
+ for details on the available valid time units.
+ </para>
+ <para>
+ The delay is effective only when the initial table synchronization
+ has been finished. However, there is a possibility that the table
+ status updated in <link linkend="catalog-pg-subscription-rel"><structname>pg_subscription_rel</structname></link>
+ could be delayed in getting to the "ready" state, and also two-phase
+ (if specified) could be delayed in getting to "enabled".
+ </para>
+ <para>
+ The delay does not take into account the overhead of time spent
+ transferring the transaction. Therefore, the arrival time at the
+ subscriber may be delayed more than the specified
+ <literal>min_send_delay</literal> time.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication means there is a much longer time between
+ making a change on the publisher, and that change being committed
+ on the subscriber. This can impact the performance of synchronous
+ replication. See <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -420,6 +456,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_send_delay</literal> parameter is not allowed when
+ streaming in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..63a10b06d1 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -66,6 +66,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->skiplsn = subform->subskiplsn;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
+ sub->minsenddelay = subform->subminsenddelay;
sub->enabled = subform->subenabled;
sub->binary = subform->subbinary;
sub->stream = subform->substream;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 34ca0e739f..54a705d71b 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1314,9 +1314,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subminsenddelay,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 464db6d247..4a8cd47171 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_SEND_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int32 min_send_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -100,7 +102,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int32 defGetMinSendDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_SEND_DELAY))
+ opts->min_send_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_SEND_DELAY) &&
+ strcmp(defel->defname, "min_send_delay") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_SEND_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_SEND_DELAY;
+ opts->min_send_delay = defGetMinSendDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +417,29 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_send_delay is not
+ * allowed. This is because in parallel streaming mode, the walsender
+ * starts sending the transaction stream without knowing the prepare/commit
+ * time of the transaction. Always waiting for the full 'min_send_delay'
+ * time to send may introduce unnecessary delay.
+ *
+ * The other possibility was to wait sending COMMIT record of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_SEND_DELAY) &&
+ opts->min_send_delay > 0 &&
+ opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ /*
+ * translator: the first %s is a string of the form "parameter > 0"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_send_delay > 0", "streaming = parallel"));
}
/*
@@ -560,7 +596,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_SEND_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -628,6 +665,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
+ values[Anum_pg_subscription_subminsenddelay - 1] = Int32GetDatum(opts.min_send_delay);
values[Anum_pg_subscription_subenabled - 1] = BoolGetDatum(opts.enabled);
values[Anum_pg_subscription_subbinary - 1] = BoolGetDatum(opts.binary);
values[Anum_pg_subscription_substream - 1] = CharGetDatum(opts.streaming);
@@ -1054,7 +1092,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_SEND_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1136,19 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_send_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
+ !IsSet(opts.specified_opts, SUBOPT_MIN_SEND_DELAY) &&
+ sub->minsenddelay > 0)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set parallel streaming mode for subscription with %s",
+ "min_send_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1162,26 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_SEND_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_send_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.min_send_delay > 0 &&
+ !IsSet(opts.specified_opts, SUBOPT_STREAMING) &&
+ sub->stream == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set %s for subscription in parallel streaming mode",
+ "min_send_delay"));
+
+ values[Anum_pg_subscription_subminsenddelay - 1] =
+ Int32GetDatum(opts.min_send_delay);
+ replaces[Anum_pg_subscription_subminsenddelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2195,3 +2266,45 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+/*
+ * Extract the min_send_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_send_delay
+ * accepts the same parameter format as recovery_min_apply_delay.
+ */
+static int32
+defGetMinSendDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse given string as parameter which has millisecond unit
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_send_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check both the lower boundary for the valid min_send_delay range and
+ * the upper boundary as the safeguard for some platforms where INT_MAX is
+ * wider than int32 respectively. Although parse_int() has confirmed that
+ * the result is less than or equal to INT_MAX, the value will be stored
+ * in a catalog column of int32.
+ */
+ if (result < 0 || result > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_send_delay",
+ 0, PG_INT32_MAX)));
+
+ return result;
+}
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 560ec974fa..89a72c1abe 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -443,6 +443,11 @@ libpqrcv_startstreaming(WalReceiverConn *conn,
PQserverVersion(conn->streamConn) >= 140000)
appendStringInfoString(&cmd, ", binary 'true'");
+ if (options->proto.logical.min_send_delay > 0 &&
+ PQserverVersion(conn->streamConn) >= 160000)
+ appendStringInfo(&cmd, ", min_send_delay '%d'",
+ options->proto.logical.min_send_delay);
+
appendStringInfoChar(&cmd, ')');
}
else
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 8fe7bb65f1..fe969e7bab 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -676,6 +676,15 @@ DecodeCommit(LogicalDecodingContext *ctx, XLogRecordBuffer *buf,
buf->origptr, buf->endptr);
}
+ /*
+ * Delay sending the changes if required. For streaming transactions,
+ * this means a delay in sending the last stream but that is OK
+ * because on the downstream the changes will be applied only after
+ * receiving the last stream.
+ */
+ if (ctx->min_send_delay > 0 && ctx->delay_send)
+ ctx->delay_send(ctx, xid, commit_time);
+
/*
* Send the final commit record if the transaction data is already
* decoded, otherwise, process the entire transaction.
diff --git a/src/backend/replication/logical/logical.c b/src/backend/replication/logical/logical.c
index c3ec97a0a6..e4dd822cdc 100644
--- a/src/backend/replication/logical/logical.c
+++ b/src/backend/replication/logical/logical.c
@@ -156,7 +156,8 @@ StartupDecodingContext(List *output_plugin_options,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress)
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send)
{
ReplicationSlot *slot;
MemoryContext context,
@@ -293,6 +294,7 @@ StartupDecodingContext(List *output_plugin_options,
ctx->prepare_write = prepare_write;
ctx->write = do_write;
ctx->update_progress = update_progress;
+ ctx->delay_send = delay_send;
ctx->output_plugin_options = output_plugin_options;
@@ -316,7 +318,7 @@ StartupDecodingContext(List *output_plugin_options,
* marking WAL reserved beforehand. In that scenario, it's up to the
* caller to guarantee that WAL remains available.
* xl_routine -- XLogReaderRoutine for underlying XLogReader
- * prepare_write, do_write, update_progress --
+ * prepare_write, do_write, update_progress, delay_send --
* callbacks that perform the use-case dependent, actual, work.
*
* Needs to be called while in a memory context that's at least as long lived
@@ -334,7 +336,8 @@ CreateInitDecodingContext(const char *plugin,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress)
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send)
{
TransactionId xmin_horizon = InvalidTransactionId;
ReplicationSlot *slot;
@@ -435,7 +438,7 @@ CreateInitDecodingContext(const char *plugin,
ctx = StartupDecodingContext(NIL, restart_lsn, xmin_horizon,
need_full_snapshot, false,
xl_routine, prepare_write, do_write,
- update_progress);
+ update_progress, delay_send);
/* call output plugin initialization callback */
old_context = MemoryContextSwitchTo(ctx->context);
@@ -475,7 +478,7 @@ CreateInitDecodingContext(const char *plugin,
* xl_routine
* XLogReaderRoutine used by underlying xlogreader
*
- * prepare_write, do_write, update_progress
+ * prepare_write, do_write, update_progress, delay_send
* callbacks that have to be filled to perform the use-case dependent,
* actual work.
*
@@ -493,7 +496,8 @@ CreateDecodingContext(XLogRecPtr start_lsn,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress)
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send)
{
LogicalDecodingContext *ctx;
ReplicationSlot *slot;
@@ -547,7 +551,7 @@ CreateDecodingContext(XLogRecPtr start_lsn,
ctx = StartupDecodingContext(output_plugin_options,
start_lsn, InvalidTransactionId, false,
fast_forward, xl_routine, prepare_write,
- do_write, update_progress);
+ do_write, update_progress, delay_send);
/* call output plugin initialization callback */
old_context = MemoryContextSwitchTo(ctx->context);
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index fa1b641a2b..960025197f 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -212,7 +212,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
.segment_open = wal_segment_open,
.segment_close = wal_segment_close),
LogicalOutputPrepareWrite,
- LogicalOutputWrite, NULL);
+ LogicalOutputWrite, NULL, NULL);
/*
* After the sanity checks in CreateDecodingContext, make sure the
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index cfb2ab6248..e68902ae34 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3898,7 +3898,8 @@ maybe_reread_subscription(void)
newsub->stream != MySubscription->stream ||
strcmp(newsub->origin, MySubscription->origin) != 0 ||
newsub->owner != MySubscription->owner ||
- !equal(newsub->publications, MySubscription->publications))
+ !equal(newsub->publications, MySubscription->publications) ||
+ newsub->minsenddelay != MySubscription->minsenddelay)
{
if (am_parallel_apply_worker())
ereport(LOG,
@@ -4617,9 +4618,18 @@ ApplyWorkerMain(Datum main_arg)
options.proto.logical.twophase = false;
options.proto.logical.origin = pstrdup(MySubscription->origin);
+ options.proto.logical.min_send_delay = 0;
if (!am_tablesync_worker())
{
+ /*
+ * Time-delayed logical replication does not support tablesync
+ * workers, so only the leader apply worker can request walsenders to
+ * delay on the publisher side.
+ */
+ if (server_version >= 160000 && MySubscription->minsenddelay > 0)
+ options.proto.logical.min_send_delay = MySubscription->minsenddelay;
+
/*
* Even when the two_phase mode is requested by the user, it remains
* as the tri-state PENDING until all tablesyncs have reached READY
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 0df1acbb7a..61faf2d685 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -285,6 +285,7 @@ parse_output_parameters(List *options, PGOutputData *data)
bool streaming_given = false;
bool two_phase_option_given = false;
bool origin_option_given = false;
+ bool min_send_delay_option_given = false;
data->binary = false;
data->streaming = LOGICALREP_STREAM_OFF;
@@ -396,6 +397,32 @@ parse_output_parameters(List *options, PGOutputData *data)
errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("unrecognized origin value: \"%s\"", data->origin));
}
+ else if (strcmp(defel->defname, "min_send_delay") == 0)
+ {
+ unsigned long delay_val;
+ char *endptr;
+
+ if (min_send_delay_option_given)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("conflicting or redundant options"));
+ min_send_delay_option_given = true;
+
+ errno = 0;
+ delay_val = strtoul(strVal(defel->arg), &endptr, 10);
+ if (errno != 0 || *endptr != '\0')
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid min_send_delay")));
+
+ if (delay_val > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("min_send_delay \"%s\" out of range",
+ strVal(defel->arg))));
+
+ data->min_send_delay = (int32) delay_val;
+ }
else
elog(ERROR, "unrecognized pgoutput option: %s", defel->defname);
}
@@ -502,6 +529,9 @@ pgoutput_startup(LogicalDecodingContext *ctx, OutputPluginOptions *opt,
else
ctx->twophase_opt_given = true;
+ /* Copy the requested send delay to the decoding context */
+ ctx->min_send_delay = data->min_send_delay;
+
/* Init publication state. */
data->publications = NIL;
publications_valid = false;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 2f3c964824..522f7600a1 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -148,7 +148,7 @@ create_logical_replication_slot(char *name, char *plugin,
XL_ROUTINE(.page_read = read_local_xlog_page,
.segment_open = wal_segment_open,
.segment_close = wal_segment_close),
- NULL, NULL, NULL);
+ NULL, NULL, NULL, NULL);
/*
* If caller needs us to determine the decoding start point, do so now.
@@ -481,7 +481,7 @@ pg_logical_replication_slot_advance(XLogRecPtr moveto)
XL_ROUTINE(.page_read = read_local_xlog_page,
.segment_open = wal_segment_open,
.segment_close = wal_segment_close),
- NULL, NULL, NULL);
+ NULL, NULL, NULL, NULL);
/*
* Start reading at the slot's restart_lsn, which we know to point to
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 75e8363e24..9f3968928f 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -252,6 +252,7 @@ static void WalSndPrepareWrite(LogicalDecodingContext *ctx, XLogRecPtr lsn, Tran
static void WalSndWriteData(LogicalDecodingContext *ctx, XLogRecPtr lsn, TransactionId xid, bool last_write);
static void WalSndUpdateProgress(LogicalDecodingContext *ctx, XLogRecPtr lsn, TransactionId xid,
bool skipped_xact);
+static void WalSndDelay(LogicalDecodingContext *ctx, TransactionId xid, TimestampTz delay_start);
static XLogRecPtr WalSndWaitForWal(XLogRecPtr loc);
static void LagTrackerWrite(XLogRecPtr lsn, TimestampTz local_flush_time);
static TimeOffset LagTrackerRead(int head, XLogRecPtr lsn, TimestampTz now);
@@ -1126,7 +1127,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
.segment_open = WalSndSegmentOpen,
.segment_close = wal_segment_close),
WalSndPrepareWrite, WalSndWriteData,
- WalSndUpdateProgress);
+ WalSndUpdateProgress, WalSndDelay);
/*
* Signal that we don't need the timeout mechanism. We're just
@@ -1285,7 +1286,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
.segment_open = WalSndSegmentOpen,
.segment_close = wal_segment_close),
WalSndPrepareWrite, WalSndWriteData,
- WalSndUpdateProgress);
+ WalSndUpdateProgress, WalSndDelay);
xlogreader = logical_decoding_ctx->reader;
WalSndSetState(WALSNDSTATE_CATCHUP);
@@ -3849,3 +3850,86 @@ LagTrackerRead(int head, XLogRecPtr lsn, TimestampTz now)
Assert(time != 0);
return now - time;
}
+
+/*
+ * LogicalDecodingContext 'delay' callback.
+ *
+ * Wait until ctx->min_send_delay milliseconds have passed since delay_start,
+ * so that the transaction is applied at least that long after the publisher.
+ */
+static void
+WalSndDelay(LogicalDecodingContext *ctx, TransactionId xid, TimestampTz delay_start)
+{
+ /* Wait until delayUntil using the latch mechanism */
+ while (true)
+ {
+ TimestampTz now;
+ TimestampTz delayUntil;
+ long remaining_wait_time_ms;
+ long timeout_sleeptime_ms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* This might change wal_sender_timeout */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /* Check for input from the client */
+ ProcessRepliesIfAny();
+
+ /* die if timeout was reached */
+ WalSndCheckTimeOut();
+
+ /* Send keepalive if the time has come */
+ WalSndKeepaliveIfNecessary();
+
+ /* Try to flush pending output to the client */
+ if (pq_flush_if_writable() != 0)
+ WalSndShutdown();
+
+ /*
+ * If we've requested to shut down, exit the process.
+ *
+ * Note that WalSndDone() cannot be used here because it would send the
+ * changes that are still being delayed.
+ */
+ if (got_STOPPING)
+ {
+ QueryCompletion qc;
+
+ /* Inform the standby that XLOG streaming is done */
+ SetQueryCompletion(&qc, CMDTAG_COPY, 0);
+ EndCommand(&qc, DestRemote, false);
+ pq_flush();
+
+ proc_exit(0);
+ }
+
+ now = GetCurrentTimestamp();
+ delayUntil = TimestampTzPlusMilliseconds(delay_start, ctx->min_send_delay);
+ remaining_wait_time_ms = TimestampDifferenceMilliseconds(now, delayUntil);
+
+ /*
+ * Exit without arming the latch if it's already past time to send
+ * this transaction.
+ */
+ if (remaining_wait_time_ms <= 0)
+ break;
+
+ /* Sleep until appropriate time. */
+ timeout_sleeptime_ms = WalSndComputeSleeptime(now);
+
+ elog(DEBUG2, "time-delayed replication for txid %u, delay_time = %d ms, remaining wait time: %ld ms",
+ xid, (int) ctx->min_send_delay, remaining_wait_time_ms);
+
+ /* Sleep until we get a reply from the worker or we time out */
+ WalSndWait(WL_SOCKET_READABLE,
+ Min(timeout_sleeptime_ms, remaining_wait_time_ms),
+ WAIT_EVENT_WALSENDER_SEND_DELAY);
+ }
+}
diff --git a/src/backend/utils/activity/wait_event.c b/src/backend/utils/activity/wait_event.c
index cb99cc6339..76c19fe11d 100644
--- a/src/backend/utils/activity/wait_event.c
+++ b/src/backend/utils/activity/wait_event.c
@@ -515,6 +515,9 @@ pgstat_get_wait_timeout(WaitEventTimeout w)
case WAIT_EVENT_VACUUM_TRUNCATE:
event_name = "VacuumTruncate";
break;
+ case WAIT_EVENT_WALSENDER_SEND_DELAY:
+ event_name = "WalSenderSendDelay";
+ break;
/* no default case, so that compiler will warn */
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 24ba936332..e60293dc0e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4493,6 +4493,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminsenddelay;
int i,
ntups;
@@ -4545,9 +4546,13 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminsenddelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n"
+ " 0 AS subminsenddelay\n",
+ LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4575,6 +4580,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminsenddelay = PQfnumber(res, "subminsenddelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4605,6 +4611,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminsenddelay =
+ atoi(PQgetvalue(res, i, i_subminsenddelay));
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4686,6 +4694,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminsenddelay > 0)
+ appendPQExpBuffer(query, ", min_send_delay = '%d ms'", subinfo->subminsenddelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index cdca0b993d..4c55f8efc4 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -660,6 +660,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int subminsenddelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..c7d303a168 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_send_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminsenddelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min send delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e1882eaea..6643db6f55 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_send_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_send_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..69ae4314b4 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -74,6 +74,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
+ int32 subminsenddelay; /* Replication send delay (ms) */
+
bool subenabled; /* True if the subscription is enabled (the
* worker should be running) */
@@ -122,6 +124,7 @@ typedef struct Subscription
* skipped */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
+ int32 minsenddelay; /* Replication send delay (ms) */
bool enabled; /* Indicates if the subscription is enabled */
bool binary; /* Indicates if the subscription wants data in
* binary format */
diff --git a/src/include/replication/logical.h b/src/include/replication/logical.h
index 5f49554ea0..c389dc17e5 100644
--- a/src/include/replication/logical.h
+++ b/src/include/replication/logical.h
@@ -30,6 +30,11 @@ typedef void (*LogicalOutputPluginWriterUpdateProgress) (struct LogicalDecodingC
bool skipped_xact
);
+typedef void (*LogicalOutputPluginWriterDelay) (struct LogicalDecodingContext *lr,
+ TransactionId xid,
+ TimestampTz start_time
+);
+
typedef struct LogicalDecodingContext
{
/* memory context this is all allocated in */
@@ -64,6 +69,7 @@ typedef struct LogicalDecodingContext
LogicalOutputPluginWriterPrepareWrite prepare_write;
LogicalOutputPluginWriterWrite write;
LogicalOutputPluginWriterUpdateProgress update_progress;
+ LogicalOutputPluginWriterDelay delay_send;
/*
* Output buffer.
@@ -100,6 +106,12 @@ typedef struct LogicalDecodingContext
*/
bool twophase_opt_given;
+ /*
+ * The minimum time, in milliseconds, that the publisher waits before
+ * sending all the changes.
+ */
+ int32 min_send_delay;
+
/*
* State for writing output.
*/
@@ -121,14 +133,16 @@ extern LogicalDecodingContext *CreateInitDecodingContext(const char *plugin,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress);
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send);
extern LogicalDecodingContext *CreateDecodingContext(XLogRecPtr start_lsn,
List *output_plugin_options,
bool fast_forward,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress);
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send);
extern void DecodingContextFindStartpoint(LogicalDecodingContext *ctx);
extern bool DecodingContextReady(LogicalDecodingContext *ctx);
extern void FreeDecodingContext(LogicalDecodingContext *ctx);
diff --git a/src/include/replication/pgoutput.h b/src/include/replication/pgoutput.h
index b4a8015403..d2fde09e00 100644
--- a/src/include/replication/pgoutput.h
+++ b/src/include/replication/pgoutput.h
@@ -30,6 +30,7 @@ typedef struct PGOutputData
bool messages;
bool two_phase;
char *origin;
+ int32 min_send_delay;
} PGOutputData;
#endif /* PGOUTPUT_H */
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index decffe352d..c20969aed7 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -187,6 +187,7 @@ typedef struct
* prepare time */
char *origin; /* Only publish data originating from the
* specified origin */
+ int32 min_send_delay; /* The minimum send delay (ms) */
} logical;
} proto;
} WalRcvStreamOptions;
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index 9ab23e1c4a..cc3a234eba 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -150,7 +150,8 @@ typedef enum
WAIT_EVENT_REGISTER_SYNC_REQUEST,
WAIT_EVENT_SPIN_DELAY,
WAIT_EVENT_VACUUM_DELAY,
- WAIT_EVENT_VACUUM_TRUNCATE
+ WAIT_EVENT_VACUUM_TRUNCATE,
+ WAIT_EVENT_WALSENDER_SEND_DELAY
} WaitEventTimeout;
/* ----------
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 3f99b14394..2027316233 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -143,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -163,10 +163,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -175,10 +175,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -210,10 +210,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -247,19 +247,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -271,27 +271,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -306,10 +306,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -324,10 +324,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -363,10 +363,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -375,10 +375,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,10 +388,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -404,20 +404,61 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_send_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = foo);
+ERROR: invalid value for parameter "min_send_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_send_delay" (0 .. 2147483647)
+-- fail - specifying streaming = parallel with time-delayed replication is not
+-- supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_send_delay = 123);
+ERROR: min_send_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_send_delay value without units is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexit | 0/0
+(1 row)
+
+-- success -- min_send_delay value with units other than ms is converted to ms
+-- and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexit | 0/0
(1 row)
+-- fail - alter subscription with streaming = parallel should fail when
+-- time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot set parallel streaming mode for subscription with min_send_delay
+-- fail - alter subscription with min_send_delay should fail when
+-- streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = 123);
+ERROR: cannot set min_send_delay for subscription in parallel streaming mode
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7281f5fee2..46bf4a27d9 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -286,6 +286,35 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_send_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = -1);
+
+-- fail - specifying streaming = parallel with time-delayed replication is not
+-- supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_send_delay = 123);
+
+-- success -- min_send_delay value without units is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = 123);
+\dRs+
+
+-- success -- min_send_delay value with units other than ms is converted to ms
+-- and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = '1 d');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when
+-- time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+
+-- fail - alter subscription with min_send_delay should fail when
+-- streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = 123);
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index 91aa068c95..ad0831bb1a 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -515,6 +515,33 @@ $node_publisher->poll_query_until('postgres',
or die
"Timed out while waiting for apply to restart after renaming SUBSCRIPTION";
+# Test time-delayed logical replication
+#
+# If the subscription sets min_send_delay parameter, the walsender will delay
+# the transaction send for min_send_delay milliseconds. We verify this by
+# looking at the time difference between a) when tuples are inserted on the
+# publisher, and b) when those changes are replicated on the subscriber. Even
+# on slow machines, this strategy will give predictable behavior.
+
+# Set min_send_delay parameter to 3 seconds
+my $delay = 3;
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_renamed SET (min_send_delay = '${delay}s')");
+
+# Before doing the insertion, get the current timestamp that will be
+# used as a comparison base.
+my $publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_ins VALUES (generate_series(1101, 1120))");
+
+# The publisher waits for the replication to complete
+$node_publisher->wait_for_catchup('tap_sub_renamed');
+
+# This test is successful only if at least the configured delay has elapsed.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies changes only after replication delay for non-streaming transaction"
+);
+
# check all the cleanup
$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_renamed");
--
2.27.0
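
As a reading aid for the regression tests in the patch above, here is a minimal Python sketch of the option handling they exercise: a bare number is taken as milliseconds, a value with a unit is converted to milliseconds, values outside 0 .. 2147483647 are rejected, and a non-zero delay cannot be combined with streaming = parallel. The function names and the unit table are hypothetical and are not taken from the patch, which implements this in C; only the error strings are copied from the expected output.

```python
import re

INT32_MAX = 2147483647

# Multipliers to milliseconds for a few time units (illustrative subset).
UNIT_TO_MS = {"ms": 1, "s": 1000, "min": 60_000, "h": 3_600_000, "d": 86_400_000}


def parse_min_send_delay(value):
    """Return the delay in milliseconds, or raise ValueError (hypothetical helper)."""
    match = re.fullmatch(r"\s*(-?\d+)\s*([a-z]*)\s*", str(value))
    if match is None or match.group(2) not in ("", *UNIT_TO_MS):
        raise ValueError(f'invalid value for parameter "min_send_delay": "{value}"')
    unit = match.group(2) or "ms"  # a value without units is taken as milliseconds
    delay_ms = int(match.group(1)) * UNIT_TO_MS[unit]
    if not 0 <= delay_ms <= INT32_MAX:
        raise ValueError(
            f'{delay_ms} ms is outside the valid range for parameter '
            f'"min_send_delay" (0 .. {INT32_MAX})'
        )
    return delay_ms


def check_options(min_send_delay_ms, streaming):
    """Reject the combination that the tests expect to fail (hypothetical helper)."""
    if min_send_delay_ms > 0 and streaming == "parallel":
        raise ValueError(
            "min_send_delay > 0 and streaming = parallel are mutually exclusive options"
        )
```

Under these assumptions, `parse_min_send_delay(123)` gives 123 and `parse_min_send_delay("1 d")` gives 86400000, matching the "Min send delay" column in the expected output.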
On Mon, Feb 27, 2023 at 1:50 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Feb 27, 2023 at 3:34 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
---
Why do we not delay sending COMMIT PREPARED messages?
I think we need to either add delay for prepare or commit prepared as
otherwise, it will lead to delaying the xact more than required.
Agreed.
The patch seems to add a delay before sending a PREPARE as that is the
time when the subscriber will apply the changes.
Considering the purpose of this feature mentioned in the commit
message "particularly to fix errors that might cause data loss",
would delaying sending PREPARE really help in that situation? For
example, even after (mistakenly) executing PREPARE for a transaction
executing DELETE without a WHERE clause on the publisher, the user can
still roll back the transaction. They have not lost data on either node yet.
After executing (and replicating) COMMIT PREPARED for that
transaction, they lose the data on both nodes. IIUC the time-delayed
logical replication should help this situation by delaying sending
COMMIT PREPARED so that, for example, the user can stop logical
replication before the COMMIT PREPARED message arrives at the subscriber.
So I think we should delay sending COMMIT PREPARED (and COMMIT)
instead of PREPARE. This would help users to correct data loss errors,
and would be more consistent with what recovery_min_apply_delay does.
The one difference w.r.t recovery_min_apply_delay is that here we will
hold locks for the duration of the delay which didn't seem to be a
good idea. This will also probably lead to more bloat as we will keep
transactions open for a long time. Doing it before DecodePrepare won't
have such problems. This is the reason that we decided to perform a
delay at the start of the transaction instead of at commit/prepare in the
subscriber-side approach.
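
The trade-off discussed above can be made concrete with a small timeline model. This is only a sketch in Python under simplifying assumptions (zero network and apply time, and a walsender that sends messages strictly in order). The function name and the two mode labels are made up for illustration and do not appear in the patch. It returns when the transaction's effects become visible on the subscriber and how long the subscriber keeps the prepared transaction (and its locks) open.

```python
def subscriber_timeline(prepare_ms, commit_prepared_ms, delay_ms, delay_on):
    """Model one two-phase transaction; all times are ms on the publisher clock.

    delay_on is "prepare" (delay before sending PREPARE) or
    "commit_prepared" (send PREPARE at once, delay COMMIT PREPARED).
    Returns (visible_at_ms, prepared_open_ms).
    """
    if delay_on == "prepare":
        prepare_arrives = prepare_ms + delay_ms
        # COMMIT PREPARED is not delayed, but cannot overtake the PREPARE.
        commit_arrives = max(prepare_arrives, commit_prepared_ms)
    elif delay_on == "commit_prepared":
        prepare_arrives = prepare_ms
        commit_arrives = commit_prepared_ms + delay_ms
    else:
        raise ValueError(f"unknown mode: {delay_on}")
    return commit_arrives, commit_arrives - prepare_arrives
```

For a PREPARE at 0, a COMMIT PREPARED at 60000 and a 5000 ms delay, the model gives (60000, 55000) when delaying PREPARE and (65000, 65000) when delaying COMMIT PREPARED. The first leaves no margin after COMMIT PREPARED, and the second keeps the prepared transaction open on the subscriber for longer.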
--
With Regards,
Amit Kapila.
At Mon, 27 Feb 2023 14:56:19 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
The one difference w.r.t recovery_min_apply_delay is that here we will
hold locks for the duration of the delay which didn't seem to be a
good idea. This will also probably lead to more bloat as we will keep
transactions open for a long time. Doing it before DecodePrepare won't
I don't have a concrete picture but could we tell reorder buffer to
retain a PREPAREd transaction until a COMMIT PREPARED comes? If
delaying non-prepared transactions until COMMIT is adequate, then the
same thing seems to work for prepared transactions.
have such problems. This is the reason that we decide to perform a
delay at the start of the transaction instead at commit/prepare in the
subscriber-side approach.
It seems that there are no technical obstacles to do that on the
publisher side. The only observable difference would be that
relatively large prepared transactions may experience noticeable
additional delays. IMHO I don't think it's a good practice
protocol-wise to intentionally choke a stream at the receiving end
when it has not been flow-controlled on the transmitting end.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Tue, Feb 28, 2023 at 8:14 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Mon, 27 Feb 2023 14:56:19 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
The one difference w.r.t recovery_min_apply_delay is that here we will
hold locks for the duration of the delay which didn't seem to be a
good idea. This will also probably lead to more bloat as we will keep
transactions open for a long time. Doing it before DecodePrepare won't
I don't have a concrete picture but could we tell reorder buffer to
retain a PREPAREd transaction until a COMMIT PREPARED comes?
Yeah, we could do that and that is the behavior unless the
user enables 2PC via 'two_phase' subscription option. But, I don't see
the need to unnecessarily delay the prepare till the commit if a user
has specified 'two_phase' option. It is quite possible that COMMIT
PREPARED happens at a much later time frame than the amount of delay
the user is expecting.
If
delaying non-prepared transactions until COMMIT is adequate, then the
same thing seems to work for prepared transactions.
have such problems. This is the reason that we decide to perform a
delay at the start of the transaction instead at commit/prepare in the
subscriber-side approach.
It seems that there are no technical obstacles to do that on the
publisher side. The only observable difference would be that
relatively large prepared transactions may experience noticeable
additional delays. IMHO I don't think it's a good practice
protocol-wise to intentionally choke a stream at the receiving end
when it has not been flow-controlled on the transmitting end.
But in this proposal, we are not choking/delaying anything on the receiving end.
--
With Regards,
Amit Kapila.
On Mon, Feb 27, 2023 at 2:21 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Few comments:
1.
+ /*
+ * If we've requested to shut down, exit the process.
+ *
+ * Note that WalSndDone() cannot be used here because the delaying
+ * changes will be sent in the function.
+ */
+ if (got_STOPPING)
+ {
+ QueryCompletion qc;
+
+ /* Inform the standby that XLOG streaming is done */
+ SetQueryCompletion(&qc, CMDTAG_COPY, 0);
+ EndCommand(&qc, DestRemote, false);
+ pq_flush();
Do we really need to do anything except for breaking the loop and letting
the exit handling happen in the main loop when 'got_STOPPING' is set?
AFAICS, this is what we are doing in some other places (See
WalSndWaitForWal). Won't that work? It seems that will help us send
all the pending WAL.
2.
+ /* Try to flush pending output to the client */
+ if (pq_flush_if_writable() != 0)
+ WalSndShutdown();
Is there a reason to try flushing here?
Apart from the above, I have made a few changes in the comments in the
attached diff patch. If you agree with those then please include them
in the next version.
--
With Regards,
Amit Kapila.
Attachments:
changes_amit_1.patch (application/octet-stream)
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index e68902ae34..3cde6d37fa 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -4623,9 +4623,11 @@ ApplyWorkerMain(Datum main_arg)
if (!am_tablesync_worker())
{
/*
- * Time-delayed logical replication does not support tablesync
- * workers, so only the leader apply worker can request walsenders to
- * delay on the publisher side.
+ * We support time-delayed logical replication only for the apply
+ * worker. This is because if we support delay during the initial sync
+ * then once we reach the limit of tablesync workers it would impose a
+ * delay for each subsequent worker. That would cause initial table
+ * synchronization completion to take a long time.
*/
if (server_version >= 160000 && MySubscription->minsenddelay > 0)
options.proto.logical.min_send_delay = MySubscription->minsenddelay;
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 61faf2d685..90bbea941b 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -529,7 +529,10 @@ pgoutput_startup(LogicalDecodingContext *ctx, OutputPluginOptions *opt,
else
ctx->twophase_opt_given = true;
- /* Copy given time period to decoding context */
+ /*
+ * Remember the delay time period to be used later before sending the
+ * changes.
+ */
ctx->min_send_delay = data->min_send_delay;
/* Init publication state. */
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 9f3968928f..ad3515821e 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -3854,13 +3854,15 @@ LagTrackerRead(int head, XLogRecPtr lsn, TimestampTz now)
/*
* LogicalDecodingContext 'delay' callback.
*
- * Wait long enough to make sure a transaction is applied at least that
- * period behind the publisher.
+ * Wait long enough to make sure a transaction is applied at least
+ * min_send_delay time period after it is performed at the publisher.
+ *
+ * delay_start is the transaction end time.
*/
static void
WalSndDelay(LogicalDecodingContext *ctx, TransactionId xid, TimestampTz delay_start)
{
- /* Wait till delayUntil by the latch mechanism */
+ /* Apply the delay by the latch mechanism */
while (true)
{
TimestampTz now;
diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index ad0831bb1a..a273e7df68 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -518,10 +518,10 @@ $node_publisher->poll_query_until('postgres',
# Test time-delayed logical replication
#
# If the subscription sets min_send_delay parameter, the walsender will delay
-# the transaction send for min_send_delay milliseconds. We verify this by
-# looking at the time difference between a) when tuples are inserted on the
-# publisher, and b) when those changes are replicated on the subscriber. Even
-# on slow machines, this strategy will give predictable behavior.
+# the transaction for min_send_delay milliseconds. We verify this by looking
+# at the time difference between a) when tuples are inserted on the publisher,
+# and b) when those changes are replicated on the subscriber. Even on slow
+# machines, this strategy will give predictable behavior.
# Set min_send_delay parameter to 3 seconds
my $delay = 3;
Dear Amit,
Few comments:
Thank you for reviewing! PSA new version.
Note that the starting point of the delay for 2PC was not changed,
since I think it is still under discussion.
1.
+ /*
+ * If we've requested to shut down, exit the process.
+ *
+ * Note that WalSndDone() cannot be used here because the delaying
+ * changes will be sent in the function.
+ */
+ if (got_STOPPING)
+ {
+ QueryCompletion qc;
+
+ /* Inform the standby that XLOG streaming is done */
+ SetQueryCompletion(&qc, CMDTAG_COPY, 0);
+ EndCommand(&qc, DestRemote, false);
+ pq_flush();
Do we really need to do anything except for breaking the loop and let
the exit handling happen in the main loop when 'got_STOPPING' is set?
AFAICS, this is what we are doing in some other places (See
WalSndWaitForWal). Won't that work? It seems that will help us sending
all the pending WAL.
If we exit the loop after got_STOPPING is set, as you said, the walsender will
send the delayed changes and then exit. That is the same behavior as when WalSndDone()
is called. But I think it does not suit the motivation of the feature.
If users notice a mistaken operation such as TRUNCATE, they must shut down the publisher
once and then recover from a backup or an old subscriber. If the walsender sends all
pending changes, the mistaken operation will also be propagated to the subscriber and the
data cannot be protected. So for now I want to keep the current approach.
FYI - In the case of physical replication, received WAL is not applied when the
secondary is shut down.
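
The behavior argued for here can be sketched as a delay loop that gives up without sending when a stop request arrives. This is a minimal Python model, not the patch's C code: the real walsender waits on a latch, while this sketch takes hypothetical `now_ms`, `sleep_ms` and `stop_requested` callables so it can be run against a fake clock. Following the comment in changes_amit_1.patch, `delay_start_ms` is assumed to be the transaction end time.

```python
def wait_before_send(delay_start_ms, min_send_delay_ms, now_ms, sleep_ms, stop_requested):
    """Return True once the delay has elapsed, False if a stop request came first.

    On False the caller is expected to exit WITHOUT sending the held-back
    changes, so a mistaken operation does not reach the subscriber.
    """
    while True:
        if stop_requested():
            return False
        remaining = delay_start_ms + min_send_delay_ms - now_ms()
        if remaining <= 0:
            return True
        sleep_ms(remaining)  # may return early, like a latch wakeup


class FakeClock:
    """Deterministic clock for trying the loop without real sleeping."""

    def __init__(self, max_slice_ms=None):
        self.t = 0
        self.max_slice_ms = max_slice_ms

    def now(self):
        return self.t

    def sleep(self, ms):
        # Optionally cap each sleep to imitate early wakeups.
        self.t += ms if self.max_slice_ms is None else min(ms, self.max_slice_ms)
```

With a `FakeClock`, a 3000 ms delay that is never interrupted returns True at time 3000, while a stop request raised at time 2000 makes the loop return False before the delay has elapsed.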
2.
+ /* Try to flush pending output to the client */
+ if (pq_flush_if_writable() != 0)
+ WalSndShutdown();
Is there a reason to try flushing here?
IIUC if pq_flush_if_writable() returns non-zero (EOF), it means that there is
trouble and the walsender failed to send messages to the subscriber.
On Linux, the call stack from pq_flush_if_writable() eventually reaches the send() system call.
According to the man page [1], such a failure is triggered by some unexpected state or by the connection being closed.
Based on the above, I think the return value should be checked.
Apart from the above, I have made a few changes in the comments in the
attached diff patch. If you agree with those then please include them
in the next version.
Thanks! I checked them and I think all of them should be included.
Moreover, I used a grammar checker and slightly reworded the commit message.
[1]: https://man7.org/linux/man-pages/man3/send.3p.html
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Attachments:
v9-0001-Time-delayed-logical-replication-on-publisher-sid.patch (application/octet-stream)
From b4f5450670ba3c3034eecdbc7b8fcc909db5c81b Mon Sep 17 00:00:00 2001
From: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Date: Thu, 16 Feb 2023 07:52:23 +0000
Subject: [PATCH v9] Time-delayed logical replication on publisher side
Similar to physical replication, a time-delayed copy of the data for
logical replication can be useful in certain scenarios, particularly
to fix errors that may cause data loss.
This patch introduces a new subscription parameter called
'min_send_delay'. If this parameter is set, the apply worker (via
walrcv_startstreaming) passes the value to the publisher as an output
plugin option. The walsender will then delay sending the transaction
for the given number of milliseconds. It is important to note that
this delay does not take into account the time spent transferring the
transaction, which means that the arrival time at the subscriber may
be delayed further.
The delay occurs before we start sending the transaction on the
publisher, and both regular and prepared transactions are covered.
Streamed transactions are also covered.
The combination of parallel streaming mode and min_send_delay is not allowed.
There are two reasons for this:
1. In parallel streaming mode, we start applying the transaction stream
as soon as the first change arrives, without knowing the transaction's
prepare/commit time. Always waiting for the full 'min_send_delay'
period may result in unnecessary delay.
2. With parallel streaming, the transaction will be opened
immediately by the parallel apply worker. Therefore, if the walsender
is delayed in sending the final record of the transaction, the
parallel apply worker must wait to receive it with an open
transaction. This would result in the locks acquired during the
transaction not being released until the min_send_delay has elapsed.
Earlier versions were written by Euler Taveira, Takamichi Osumi, and Kuroda Hayato
Author: Kuroda Hayato, Takamichi Osumi
Reviewed-by: Amit Kapila, Peter Smith, Vignesh C, Shveta Malik,
Kyotaro Horiguchi, Shi Yu, Wang Wei, Dilip Kumar, Melih Mutlu,
Andres Freund
---
doc/src/sgml/catalogs.sgml | 10 +
doc/src/sgml/glossary.sgml | 15 ++
doc/src/sgml/logical-replication.sgml | 6 +
doc/src/sgml/monitoring.sgml | 5 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 43 +++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 119 ++++++++++-
.../libpqwalreceiver/libpqwalreceiver.c | 5 +
src/backend/replication/logical/decode.c | 9 +
src/backend/replication/logical/logical.c | 18 +-
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/worker.c | 14 +-
src/backend/replication/pgoutput/pgoutput.c | 33 ++++
src/backend/replication/slotfuncs.c | 4 +-
src/backend/replication/walsender.c | 90 ++++++++-
src/backend/utils/activity/wait_event.c | 3 +
src/bin/pg_dump/pg_dump.c | 15 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/replication/logical.h | 18 +-
src/include/replication/pgoutput.h | 1 +
src/include/replication/walreceiver.h | 1 +
src/include/utils/wait_event.h | 3 +-
src/test/regress/expected/subscription.out | 185 +++++++++++-------
src/test/regress/sql/subscription.sql | 29 +++
src/test/subscription/t/001_rep_changes.pl | 27 +++
30 files changed, 581 insertions(+), 104 deletions(-)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..ce6521fe6c 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7892,6 +7892,16 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminsenddelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum delay, in milliseconds, by the publisher before sending all
+ the changes
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subenabled</structfield> <type>bool</type>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 7c01a541fe..9ede9d05f6 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1729,6 +1729,21 @@
</glossdef>
</glossentry>
+ <glossentry id="glossary-time-delayed-replication">
+ <glossterm>Time-delayed replication</glossterm>
+ <glossdef>
+ <para>
+ Replication setup that delays the application of changes by a specified
+ minimum time-delay period.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="guc-recovery-min-apply-delay"/> for physical replication
+ and <xref linkend="sql-createsubscription"/> for logical replication.
+ </para>
+ </glossdef>
+ </glossentry>
+
<glossentry id="glossary-toast">
<glossterm>TOAST</glossterm>
<glossdef>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 1bd5660c87..fe9e7f7b26 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,12 @@
target table.
</para>
+ <para>
+ A subscription can delay the receipt of changes by specifying the
+ <literal>min_send_delay</literal> subscription parameter. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 6249bb50d0..7c47723153 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -2349,6 +2349,11 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
<entry>Waiting to acquire an exclusive lock to truncate off any
empty pages at the end of a table vacuumed.</entry>
</row>
+ <row>
+ <entry><literal>WalSenderSendDelay</literal></entry>
+ <entry>Waiting in the WAL sender process before sending changes for
+ time-delayed logical replication.</entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 964fcbb8ff..3f238b958b 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_send_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 51c45f17c7..9d08740ba2 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,43 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_send_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the publisher sends changes as soon as possible. This
+ parameter allows the user to delay changes by the given time period.
+ If the value is specified without units, it is taken as milliseconds.
+ The default is zero (no delay). See <xref linkend="config-setting-names-values"/>
+ for details on the valid time units.
+ </para>
+ <para>
+ The delay takes effect only after the initial table synchronization
+ has finished. Note, however, that it can postpone the point at which a
+ table's state in <link linkend="catalog-pg-subscription-rel"><structname>pg_subscription_rel</structname></link>
+ reaches "ready", and the point at which two-phase commit
+ (if specified) becomes "enabled".
+ </para>
+ <para>
+ The delay does not take into account the overhead of time spent
+ transferring the transaction. Therefore, the arrival time at the
+ subscriber may be delayed more than the specified
+ <literal>min_send_delay</literal> time.
+ </para>
+ <warning>
+ <para>
+ Delaying replication lengthens the time between a change being made
+ on the publisher and that change being committed on the subscriber.
+ This can impact the performance of synchronous replication. See the
+ <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -420,6 +456,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_send_delay</literal> parameter is not allowed when
+ streaming in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..63a10b06d1 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -66,6 +66,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->skiplsn = subform->subskiplsn;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
+ sub->minsenddelay = subform->subminsenddelay;
sub->enabled = subform->subenabled;
sub->binary = subform->subbinary;
sub->stream = subform->substream;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 34ca0e739f..54a705d71b 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1314,9 +1314,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subminsenddelay,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 464db6d247..4a8cd47171 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_SEND_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int32 min_send_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -100,7 +102,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int32 defGetMinSendDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_SEND_DELAY))
+ opts->min_send_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_SEND_DELAY) &&
+ strcmp(defel->defname, "min_send_delay") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_SEND_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_SEND_DELAY;
+ opts->min_send_delay = defGetMinSendDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +417,29 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_send_delay is not
+ * allowed. This is because in parallel streaming mode, the walsender
+ * starts sending the transaction stream without knowing the prepare/commit
+ * time of the transaction. Always waiting for the full 'min_send_delay'
+ * time to send may introduce unnecessary delay.
+ *
+ * The alternative would be to delay sending the COMMIT record of the
+ * parallel apply transaction, but that would cause resource bloat and
+ * locks being held for a long time.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_SEND_DELAY) &&
+ opts->min_send_delay > 0 &&
+ opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ /*
+ * translator: the first %s is a string of the form "parameter > 0"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_send_delay > 0", "streaming = parallel"));
}
/*
@@ -560,7 +596,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_SEND_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -628,6 +665,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
+ values[Anum_pg_subscription_subminsenddelay - 1] = Int32GetDatum(opts.min_send_delay);
values[Anum_pg_subscription_subenabled - 1] = BoolGetDatum(opts.enabled);
values[Anum_pg_subscription_subbinary - 1] = BoolGetDatum(opts.binary);
values[Anum_pg_subscription_substream - 1] = CharGetDatum(opts.streaming);
@@ -1054,7 +1092,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_SEND_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1136,19 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_send_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
+ !IsSet(opts.specified_opts, SUBOPT_MIN_SEND_DELAY) &&
+ sub->minsenddelay > 0)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set parallel streaming mode for subscription with %s",
+ "min_send_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1162,26 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_SEND_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_send_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.min_send_delay > 0 &&
+ !IsSet(opts.specified_opts, SUBOPT_STREAMING) &&
+ sub->stream == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set %s for subscription in parallel streaming mode",
+ "min_send_delay"));
+
+ values[Anum_pg_subscription_subminsenddelay - 1] =
+ Int32GetDatum(opts.min_send_delay);
+ replaces[Anum_pg_subscription_subminsenddelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2195,3 +2266,45 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+/*
+ * Extract the min_send_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_send_delay
+ * accepts the same parameter format as recovery_min_apply_delay.
+ */
+static int32
+defGetMinSendDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse the given string as a parameter whose default unit is milliseconds
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_send_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check the lower bound of the valid min_send_delay range, and also the
+ * upper bound as a safeguard for platforms where int is wider than
+ * int32. Although parse_int() has confirmed that the result is less than
+ * or equal to INT_MAX, the value will be stored in an int32 catalog
+ * column.
+ */
+ if (result < 0 || result > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_send_delay",
+ 0, PG_INT32_MAX)));
+
+ return result;
+}
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 560ec974fa..89a72c1abe 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -443,6 +443,11 @@ libpqrcv_startstreaming(WalReceiverConn *conn,
PQserverVersion(conn->streamConn) >= 140000)
appendStringInfoString(&cmd, ", binary 'true'");
+ if (options->proto.logical.min_send_delay > 0 &&
+ PQserverVersion(conn->streamConn) >= 160000)
+ appendStringInfo(&cmd, ", min_send_delay '%d'",
+ options->proto.logical.min_send_delay);
+
appendStringInfoChar(&cmd, ')');
}
else
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 8fe7bb65f1..fe969e7bab 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -676,6 +676,15 @@ DecodeCommit(LogicalDecodingContext *ctx, XLogRecordBuffer *buf,
buf->origptr, buf->endptr);
}
+ /*
+ * Delay sending the changes if required. For streaming transactions,
+ * this means a delay in sending the last stream but that is OK
+ * because on the downstream the changes will be applied only after
+ * receiving the last stream.
+ */
+ if (ctx->min_send_delay > 0 && ctx->delay_send)
+ ctx->delay_send(ctx, xid, commit_time);
+
/*
* Send the final commit record if the transaction data is already
* decoded, otherwise, process the entire transaction.
diff --git a/src/backend/replication/logical/logical.c b/src/backend/replication/logical/logical.c
index c3ec97a0a6..e4dd822cdc 100644
--- a/src/backend/replication/logical/logical.c
+++ b/src/backend/replication/logical/logical.c
@@ -156,7 +156,8 @@ StartupDecodingContext(List *output_plugin_options,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress)
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send)
{
ReplicationSlot *slot;
MemoryContext context,
@@ -293,6 +294,7 @@ StartupDecodingContext(List *output_plugin_options,
ctx->prepare_write = prepare_write;
ctx->write = do_write;
ctx->update_progress = update_progress;
+ ctx->delay_send = delay_send;
ctx->output_plugin_options = output_plugin_options;
@@ -316,7 +318,7 @@ StartupDecodingContext(List *output_plugin_options,
* marking WAL reserved beforehand. In that scenario, it's up to the
* caller to guarantee that WAL remains available.
* xl_routine -- XLogReaderRoutine for underlying XLogReader
- * prepare_write, do_write, update_progress --
+ * prepare_write, do_write, update_progress, delay_send --
* callbacks that perform the use-case dependent, actual, work.
*
* Needs to be called while in a memory context that's at least as long lived
@@ -334,7 +336,8 @@ CreateInitDecodingContext(const char *plugin,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress)
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send)
{
TransactionId xmin_horizon = InvalidTransactionId;
ReplicationSlot *slot;
@@ -435,7 +438,7 @@ CreateInitDecodingContext(const char *plugin,
ctx = StartupDecodingContext(NIL, restart_lsn, xmin_horizon,
need_full_snapshot, false,
xl_routine, prepare_write, do_write,
- update_progress);
+ update_progress, delay_send);
/* call output plugin initialization callback */
old_context = MemoryContextSwitchTo(ctx->context);
@@ -475,7 +478,7 @@ CreateInitDecodingContext(const char *plugin,
* xl_routine
* XLogReaderRoutine used by underlying xlogreader
*
- * prepare_write, do_write, update_progress
+ * prepare_write, do_write, update_progress, delay_send
* callbacks that have to be filled to perform the use-case dependent,
* actual work.
*
@@ -493,7 +496,8 @@ CreateDecodingContext(XLogRecPtr start_lsn,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress)
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send)
{
LogicalDecodingContext *ctx;
ReplicationSlot *slot;
@@ -547,7 +551,7 @@ CreateDecodingContext(XLogRecPtr start_lsn,
ctx = StartupDecodingContext(output_plugin_options,
start_lsn, InvalidTransactionId, false,
fast_forward, xl_routine, prepare_write,
- do_write, update_progress);
+ do_write, update_progress, delay_send);
/* call output plugin initialization callback */
old_context = MemoryContextSwitchTo(ctx->context);
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index fa1b641a2b..960025197f 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -212,7 +212,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
.segment_open = wal_segment_open,
.segment_close = wal_segment_close),
LogicalOutputPrepareWrite,
- LogicalOutputWrite, NULL);
+ LogicalOutputWrite, NULL, NULL);
/*
* After the sanity checks in CreateDecodingContext, make sure the
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index cfb2ab6248..3cde6d37fa 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3898,7 +3898,8 @@ maybe_reread_subscription(void)
newsub->stream != MySubscription->stream ||
strcmp(newsub->origin, MySubscription->origin) != 0 ||
newsub->owner != MySubscription->owner ||
- !equal(newsub->publications, MySubscription->publications))
+ !equal(newsub->publications, MySubscription->publications) ||
+ newsub->minsenddelay != MySubscription->minsenddelay)
{
if (am_parallel_apply_worker())
ereport(LOG,
@@ -4617,9 +4618,20 @@ ApplyWorkerMain(Datum main_arg)
options.proto.logical.twophase = false;
options.proto.logical.origin = pstrdup(MySubscription->origin);
+ options.proto.logical.min_send_delay = 0;
if (!am_tablesync_worker())
{
+ /*
+ * We support time-delayed logical replication only for the apply
+ * worker. This is because if we support delay during the initial sync
+ * then once we reach the limit of tablesync workers it would impose a
+ * delay for each subsequent worker. That would cause initial table
+ * synchronization completion to take a long time.
+ */
+ if (server_version >= 160000 && MySubscription->minsenddelay > 0)
+ options.proto.logical.min_send_delay = MySubscription->minsenddelay;
+
/*
* Even when the two_phase mode is requested by the user, it remains
* as the tri-state PENDING until all tablesyncs have reached READY
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 0df1acbb7a..90bbea941b 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -285,6 +285,7 @@ parse_output_parameters(List *options, PGOutputData *data)
bool streaming_given = false;
bool two_phase_option_given = false;
bool origin_option_given = false;
+ bool min_send_delay_option_given = false;
data->binary = false;
data->streaming = LOGICALREP_STREAM_OFF;
@@ -396,6 +397,32 @@ parse_output_parameters(List *options, PGOutputData *data)
errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("unrecognized origin value: \"%s\"", data->origin));
}
+ else if (strcmp(defel->defname, "min_send_delay") == 0)
+ {
+ unsigned long delay_val;
+ char *endptr;
+
+ if (min_send_delay_option_given)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("conflicting or redundant options"));
+ min_send_delay_option_given = true;
+
+ errno = 0;
+ delay_val = strtoul(strVal(defel->arg), &endptr, 10);
+ if (errno != 0 || *endptr != '\0')
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid min_send_delay")));
+
+ if (delay_val > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("min_send_delay \"%s\" out of range",
+ strVal(defel->arg))));
+
+ data->min_send_delay = (int32) delay_val;
+ }
else
elog(ERROR, "unrecognized pgoutput option: %s", defel->defname);
}
@@ -502,6 +529,12 @@ pgoutput_startup(LogicalDecodingContext *ctx, OutputPluginOptions *opt,
else
ctx->twophase_opt_given = true;
+ /*
+ * Remember the delay time period to be used later before sending the
+ * changes.
+ */
+ ctx->min_send_delay = data->min_send_delay;
+
/* Init publication state. */
data->publications = NIL;
publications_valid = false;
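For readers following the option parsing above, here is a minimal standalone sketch of strict parsing of a millisecond value into an int32. It is not part of the patch. The helper name parse_delay_ms is hypothetical, and the sketch assumes a plain decimal string with no unit suffix, as sent by the subscriber. One point it illustrates: strtoul() accepts a leading minus sign and negates the result as an unsigned value, so the hunk above relies on its range check to reject input such as "-1". The sketch rejects signs explicitly instead.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not in the patch): parse a decimal string of
 * milliseconds into *out.  Returns false for empty input, signs, leading
 * whitespace, trailing garbage, or values above INT32_MAX.
 */
static bool
parse_delay_ms(const char *str, int32_t *out)
{
	char	   *endptr;
	long long	val;

	/* Require a leading digit: rejects "", "+5", "-1" and " 5". */
	if (str[0] < '0' || str[0] > '9')
		return false;

	errno = 0;
	val = strtoll(str, &endptr, 10);

	/* errno catches overflow of long long; *endptr catches "12ms". */
	if (errno != 0 || *endptr != '\0')
		return false;

	/* The value is destined for an int32 field. */
	if (val > INT32_MAX)
		return false;

	*out = (int32_t) val;
	return true;
}
```

With this shape, "500" and "2147483647" are accepted, while "2147483648", "-1", "" and "12ms" are rejected.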
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 2f3c964824..522f7600a1 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -148,7 +148,7 @@ create_logical_replication_slot(char *name, char *plugin,
XL_ROUTINE(.page_read = read_local_xlog_page,
.segment_open = wal_segment_open,
.segment_close = wal_segment_close),
- NULL, NULL, NULL);
+ NULL, NULL, NULL, NULL);
/*
* If caller needs us to determine the decoding start point, do so now.
@@ -481,7 +481,7 @@ pg_logical_replication_slot_advance(XLogRecPtr moveto)
XL_ROUTINE(.page_read = read_local_xlog_page,
.segment_open = wal_segment_open,
.segment_close = wal_segment_close),
- NULL, NULL, NULL);
+ NULL, NULL, NULL, NULL);
/*
* Start reading at the slot's restart_lsn, which we know to point to
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 75e8363e24..ad3515821e 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -252,6 +252,7 @@ static void WalSndPrepareWrite(LogicalDecodingContext *ctx, XLogRecPtr lsn, Tran
static void WalSndWriteData(LogicalDecodingContext *ctx, XLogRecPtr lsn, TransactionId xid, bool last_write);
static void WalSndUpdateProgress(LogicalDecodingContext *ctx, XLogRecPtr lsn, TransactionId xid,
bool skipped_xact);
+static void WalSndDelay(LogicalDecodingContext *ctx, TransactionId xid, TimestampTz delay_start);
static XLogRecPtr WalSndWaitForWal(XLogRecPtr loc);
static void LagTrackerWrite(XLogRecPtr lsn, TimestampTz local_flush_time);
static TimeOffset LagTrackerRead(int head, XLogRecPtr lsn, TimestampTz now);
@@ -1126,7 +1127,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
.segment_open = WalSndSegmentOpen,
.segment_close = wal_segment_close),
WalSndPrepareWrite, WalSndWriteData,
- WalSndUpdateProgress);
+ WalSndUpdateProgress, WalSndDelay);
/*
* Signal that we don't need the timeout mechanism. We're just
@@ -1285,7 +1286,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
.segment_open = WalSndSegmentOpen,
.segment_close = wal_segment_close),
WalSndPrepareWrite, WalSndWriteData,
- WalSndUpdateProgress);
+ WalSndUpdateProgress, WalSndDelay);
xlogreader = logical_decoding_ctx->reader;
WalSndSetState(WALSNDSTATE_CATCHUP);
@@ -3849,3 +3850,88 @@ LagTrackerRead(int head, XLogRecPtr lsn, TimestampTz now)
Assert(time != 0);
return now - time;
}
+
+/*
+ * LogicalDecodingContext 'delay' callback.
+ *
+ * Wait until at least min_send_delay has elapsed since the transaction
+ * ended on the publisher, so that it is not sent (or applied) any earlier.
+ *
+ * delay_start is the transaction end time.
+ */
+static void
+WalSndDelay(LogicalDecodingContext *ctx, TransactionId xid, TimestampTz delay_start)
+{
+ /* Apply the delay using the latch mechanism */
+ while (true)
+ {
+ TimestampTz now;
+ TimestampTz delayUntil;
+ long remaining_wait_time_ms;
+ long timeout_sleeptime_ms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* This might change wal_sender_timeout */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /* Check for input from the client */
+ ProcessRepliesIfAny();
+
+ /* die if timeout was reached */
+ WalSndCheckTimeOut();
+
+ /* Send keepalive if the time has come */
+ WalSndKeepaliveIfNecessary();
+
+ /* Try to flush pending output to the client */
+ if (pq_flush_if_writable() != 0)
+ WalSndShutdown();
+
+ /*
+ * If we've requested to shut down, exit the process.
+ *
+ * Note that WalSndDone() cannot be used here because it would send
+ * the delayed changes.
+ */
+ if (got_STOPPING)
+ {
+ QueryCompletion qc;
+
+ /* Inform the standby that XLOG streaming is done */
+ SetQueryCompletion(&qc, CMDTAG_COPY, 0);
+ EndCommand(&qc, DestRemote, false);
+ pq_flush();
+
+ proc_exit(0);
+ }
+
+ now = GetCurrentTimestamp();
+ delayUntil = TimestampTzPlusMilliseconds(delay_start, ctx->min_send_delay);
+ remaining_wait_time_ms = TimestampDifferenceMilliseconds(now, delayUntil);
+
+ /*
+ * Exit without arming the latch if it's already past time to send
+ * this transaction.
+ */
+ if (remaining_wait_time_ms <= 0)
+ break;
+
+ /* Sleep until appropriate time. */
+ timeout_sleeptime_ms = WalSndComputeSleeptime(now);
+
+ elog(DEBUG2, "time-delayed replication for txid %u, delay_time = %d ms, remaining wait time: %ld ms",
+ xid, (int) ctx->min_send_delay, remaining_wait_time_ms);
+
+ /* Sleep until we get a reply from the worker or we time out */
+ WalSndWait(WL_SOCKET_READABLE,
+ Min(timeout_sleeptime_ms, remaining_wait_time_ms),
+ WAIT_EVENT_WALSENDER_SEND_DELAY);
+ }
+}
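The timing arithmetic in the WalSndDelay() loop above can be sketched in isolation as follows. This is an illustration under stated assumptions, not the patch's code: ts_usec is a hypothetical stand-in for TimestampTz (microseconds since some epoch), both function names are invented for the example, and rounding up to whole milliseconds is a choice made here so the caller never wakes before the deadline.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for TimestampTz: microseconds since some epoch. */
typedef int64_t ts_usec;

/*
 * Milliseconds left until delay_start + delay_ms, clamped at zero once
 * the deadline has passed.
 */
static long
remaining_wait_ms(ts_usec now, ts_usec delay_start, int32_t delay_ms)
{
	ts_usec		delay_until = delay_start + (ts_usec) delay_ms * 1000;
	ts_usec		diff = delay_until - now;

	if (diff <= 0)
		return 0;

	/* Round up so a partial millisecond still counts as time to wait. */
	return (long) ((diff + 999) / 1000);
}

/*
 * The loop sleeps for the shorter of the two intervals, so keepalive and
 * timeout handling still run on schedule during a long delay.
 */
static long
next_sleep_ms(long timeout_sleeptime_ms, long remaining_ms)
{
	return timeout_sleeptime_ms < remaining_ms ? timeout_sleeptime_ms : remaining_ms;
}
```

For a transaction that ended at time 0 with a 500 ms delay, the remaining wait is 500 ms at time 0, 100 ms at 400000 microseconds, and 0 from 500000 microseconds onward, at which point the loop exits and the changes are sent.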
diff --git a/src/backend/utils/activity/wait_event.c b/src/backend/utils/activity/wait_event.c
index cb99cc6339..76c19fe11d 100644
--- a/src/backend/utils/activity/wait_event.c
+++ b/src/backend/utils/activity/wait_event.c
@@ -515,6 +515,9 @@ pgstat_get_wait_timeout(WaitEventTimeout w)
case WAIT_EVENT_VACUUM_TRUNCATE:
event_name = "VacuumTruncate";
break;
+ case WAIT_EVENT_WALSENDER_SEND_DELAY:
+ event_name = "WalSenderSendDelay";
+ break;
/* no default case, so that compiler will warn */
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 24ba936332..e60293dc0e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4493,6 +4493,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminsenddelay;
int i,
ntups;
@@ -4545,9 +4546,13 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminsenddelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n"
+ " 0 AS subminsenddelay\n",
+ LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4575,6 +4580,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminsenddelay = PQfnumber(res, "subminsenddelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4605,6 +4611,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminsenddelay =
+ atoi(PQgetvalue(res, i, i_subminsenddelay));
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4686,6 +4694,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminsenddelay > 0)
+ appendPQExpBuffer(query, ", min_send_delay = '%d ms'", subinfo->subminsenddelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index cdca0b993d..4c55f8efc4 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -660,6 +660,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int subminsenddelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..c7d303a168 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_send_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminsenddelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min send delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e1882eaea..6643db6f55 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_send_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_send_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..69ae4314b4 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -74,6 +74,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
+ int32 subminsenddelay; /* Replication send delay (ms) */
+
bool subenabled; /* True if the subscription is enabled (the
* worker should be running) */
@@ -122,6 +124,7 @@ typedef struct Subscription
* skipped */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
+ int32 minsenddelay; /* Replication send delay (ms) */
bool enabled; /* Indicates if the subscription is enabled */
bool binary; /* Indicates if the subscription wants data in
* binary format */
diff --git a/src/include/replication/logical.h b/src/include/replication/logical.h
index 5f49554ea0..c389dc17e5 100644
--- a/src/include/replication/logical.h
+++ b/src/include/replication/logical.h
@@ -30,6 +30,11 @@ typedef void (*LogicalOutputPluginWriterUpdateProgress) (struct LogicalDecodingC
bool skipped_xact
);
+typedef void (*LogicalOutputPluginWriterDelay) (struct LogicalDecodingContext *lr,
+ TransactionId xid,
+ TimestampTz start_time
+);
+
typedef struct LogicalDecodingContext
{
/* memory context this is all allocated in */
@@ -64,6 +69,7 @@ typedef struct LogicalDecodingContext
LogicalOutputPluginWriterPrepareWrite prepare_write;
LogicalOutputPluginWriterWrite write;
LogicalOutputPluginWriterUpdateProgress update_progress;
+ LogicalOutputPluginWriterDelay delay_send;
/*
* Output buffer.
@@ -100,6 +106,12 @@ typedef struct LogicalDecodingContext
*/
bool twophase_opt_given;
+ /*
+ * The minimum delay, in milliseconds, that the publisher waits before
+ * sending all the changes.
+ */
+ int32 min_send_delay;
+
/*
* State for writing output.
*/
@@ -121,14 +133,16 @@ extern LogicalDecodingContext *CreateInitDecodingContext(const char *plugin,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress);
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send);
extern LogicalDecodingContext *CreateDecodingContext(XLogRecPtr start_lsn,
List *output_plugin_options,
bool fast_forward,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress);
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send);
extern void DecodingContextFindStartpoint(LogicalDecodingContext *ctx);
extern bool DecodingContextReady(LogicalDecodingContext *ctx);
extern void FreeDecodingContext(LogicalDecodingContext *ctx);
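
Review note: the new `delay_send` callback receives the transaction's `start_time`, and the context carries `min_send_delay` in milliseconds. As a minimal sketch of the arithmetic such a callback presumably performs (the helper name `remaining_delay_ms` is hypothetical and does not appear in the patch; timestamps are assumed to be microsecond counts, as `TimestampTz` is), the remaining wait could be computed like this:

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: return how many milliseconds
 * still have to elapse before a transaction that started at start_time_us
 * may be sent, given the current time now_us (both in microseconds) and the
 * configured minimum send delay in milliseconds.
 */
int64_t
remaining_delay_ms(int64_t start_time_us, int64_t now_us,
				   int32_t min_send_delay_ms)
{
	int64_t		elapsed_ms;

	/* A delay of zero (or less) means the feature is disabled. */
	if (min_send_delay_ms <= 0)
		return 0;

	elapsed_ms = (now_us - start_time_us) / 1000;

	/* Guard against clock skew: never wait longer than the full delay. */
	if (elapsed_ms < 0)
		elapsed_ms = 0;

	if (elapsed_ms >= min_send_delay_ms)
		return 0;

	return min_send_delay_ms - elapsed_ms;
}
```

A caller would sleep for the returned value (re-checking after wakeups) and send once it reaches zero. How the actual walsender waits, and which wait event it reports, is defined by the patch itself and not by this sketch.
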
diff --git a/src/include/replication/pgoutput.h b/src/include/replication/pgoutput.h
index b4a8015403..d2fde09e00 100644
--- a/src/include/replication/pgoutput.h
+++ b/src/include/replication/pgoutput.h
@@ -30,6 +30,7 @@ typedef struct PGOutputData
bool messages;
bool two_phase;
char *origin;
+ int32 min_send_delay;
} PGOutputData;
#endif /* PGOUTPUT_H */
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index decffe352d..c20969aed7 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -187,6 +187,7 @@ typedef struct
* prepare time */
char *origin; /* Only publish data originating from the
* specified origin */
+ int32 min_send_delay; /* The minimum send delay (ms) */
} logical;
} proto;
} WalRcvStreamOptions;
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index 9ab23e1c4a..cc3a234eba 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -150,7 +150,8 @@ typedef enum
WAIT_EVENT_REGISTER_SYNC_REQUEST,
WAIT_EVENT_SPIN_DELAY,
WAIT_EVENT_VACUUM_DELAY,
- WAIT_EVENT_VACUUM_TRUNCATE
+ WAIT_EVENT_VACUUM_TRUNCATE,
+ WAIT_EVENT_WALSENDER_SEND_DELAY
} WaitEventTimeout;
/* ----------
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 3f99b14394..2027316233 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -143,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -163,10 +163,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -175,10 +175,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -210,10 +210,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -247,19 +247,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -271,27 +271,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -306,10 +306,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -324,10 +324,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -363,10 +363,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -375,10 +375,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,10 +388,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -404,20 +404,61 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_send_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = foo);
+ERROR: invalid value for parameter "min_send_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_send_delay" (0 .. 2147483647)
+-- fail - specifying streaming = parallel with time-delayed replication is not
+-- supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_send_delay = 123);
+ERROR: min_send_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_send_delay value without units is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexit | 0/0
+(1 row)
+
+-- success -- min_send_delay value with units other than ms is converted to ms
+-- and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexit | 0/0
(1 row)
+-- fail - alter subscription with streaming = parallel should fail when
+-- time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot set parallel streaming mode for subscription with min_send_delay
+-- fail - alter subscription with min_send_delay should fail when
+-- streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = 123);
+ERROR: cannot set min_send_delay for subscription in parallel streaming mode
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
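
Review note: the expected output above shows `min_send_delay = '1 d'` stored as `86400000` and `-1` rejected against the range `0 .. 2147483647`. The following is only a sketch of that conversion and range check under stated assumptions: the function name `delay_to_ms` and the unit table are hypothetical, and the patch may well parse the value differently (for example through interval parsing).

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: convert a delay given as a
 * number plus a unit into milliseconds. Returns -1 if the unit is unknown,
 * the value is negative, or the result would not fit into an int32.
 */
int64_t
delay_to_ms(int64_t value, const char *unit)
{
	int64_t		factor;

	/* Assumed unit table; a bare number is treated as "ms" by the caller. */
	if (strcmp(unit, "ms") == 0)
		factor = 1;
	else if (strcmp(unit, "s") == 0)
		factor = 1000;
	else if (strcmp(unit, "min") == 0)
		factor = 60 * 1000;
	else if (strcmp(unit, "h") == 0)
		factor = 60 * 60 * 1000;
	else if (strcmp(unit, "d") == 0)
		factor = 24 * 60 * 60 * 1000;
	else
		return -1;

	/* Enforce the documented range 0 .. 2147483647 without overflowing. */
	if (value < 0 || value > INT32_MAX / factor)
		return -1;

	return value * factor;
}
```

Because the stored column is an `int32` number of milliseconds, the largest whole-day delay this sketch accepts is 24 days; 25 days would exceed 2147483647 ms.
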
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7281f5fee2..46bf4a27d9 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -286,6 +286,35 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_send_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = -1);
+
+-- fail - specifying streaming = parallel with time-delayed replication is not
+-- supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_send_delay = 123);
+
+-- success -- min_send_delay value without units is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = 123);
+\dRs+
+
+-- success -- min_send_delay value with units other than ms is converted to ms
+-- and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = '1 d');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when
+-- time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+
+-- fail - alter subscription with min_send_delay should fail when
+-- streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = 123);
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index 91aa068c95..a273e7df68 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -515,6 +515,33 @@ $node_publisher->poll_query_until('postgres',
or die
"Timed out while waiting for apply to restart after renaming SUBSCRIPTION";
+# Test time-delayed logical replication
+#
+# If the subscription sets min_send_delay parameter, the walsender will delay
+# the transaction for min_send_delay milliseconds. We verify this by looking
+# at the time difference between a) when tuples are inserted on the publisher,
+# and b) when those changes are replicated on the subscriber. Even on slow
+# machines, this strategy will give predictable behavior.
+
+# Set min_send_delay parameter to 3 seconds
+my $delay = 3;
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_renamed SET (min_send_delay = '${delay}s')");
+
+# Before doing the insertion, get the current timestamp that will be
+# used as a comparison base.
+my $publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_ins VALUES (generate_series(1101, 1120))");
+
+# The publisher waits for the replication to complete
+$node_publisher->wait_for_catchup('tap_sub_renamed');
+
+# This test is successful only if at least the configured delay has elapsed.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies changes only after replication delay for non-streaming transaction"
+);
+
# check all the cleanup
$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_renamed");
--
2.27.0
Here are some review comments for v9-0001, but these are only very trivial.
======
Commit Message
1.
Nitpick. The new text is jagged-looking. It should wrap at ~80 chars.
~~~
2.
2. Another reason is for that parallel streaming, the transaction will be opened
immediately by the parallel apply worker. Therefore, if the walsender
is delayed in sending the final record of the transaction, the
parallel apply worker must wait to receive it with an open
transaction. This would result in the locks acquired during the
transaction not being released until the min_send_delay has elapsed.
~
The text already said there are "two reasons", and already this is
numbered as reason 2. So it doesn't need to keep saying "Another
reason" here.
"Another reason is for that parallel streaming" --> "For parallel streaming..."
======
src/backend/replication/walsender.c
3. WalSndDelay
+ /* die if timeout was reached */
+ WalSndCheckTimeOut();
Other nearby comments start uppercase, so this should too.
======
src/include/replication/walreceiver.h
4. WalRcvStreamOptions
@@ -187,6 +187,7 @@ typedef struct
* prepare time */
char *origin; /* Only publish data originating from the
* specified origin */
+ int32 min_send_delay; /* The minimum send delay */
} logical;
} proto;
} WalRcvStreamOptions;
~
Should that comment mention that the units are "(ms)"?
------
Kind Regards,
Peter Smith.
Fujitsu Australia
At Tue, 28 Feb 2023 08:35:11 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
On Tue, Feb 28, 2023 at 8:14 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Mon, 27 Feb 2023 14:56:19 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
The one difference w.r.t recovery_min_apply_delay is that here we will
hold locks for the duration of the delay which didn't seem to be a
good idea. This will also probably lead to more bloat as we will keep
transactions open for a long time. Doing it before DecodePrepare won't
I don't have a concrete picture but could we tell reorder buffer to
retain a PREPAREd transaction until a COMMIT PREPARED comes?
Yeah, we could do that and that is what is the behavior unless the
user enables 2PC via 'two_phase' subscription option. But, I don't see
the need to unnecessarily delay the prepare till the commit if a user
has specified 'two_phase' option. It is quite possible that COMMIT
PREPARED happens at a much later time frame than the amount of delay
the user is expecting.
It looks like the user should decide between potential long locks or
extra delays, and this choice ought to be documented.
If
delaying non-prepared transactions until COMMIT is adequate, then the
same thing seems to work for prepared transactions.
have such problems. This is the reason that we decide to perform a
delay at the start of the transaction instead at commit/prepare in the
subscriber-side approach.
It seems that there are no technical obstacles to do that on the
publisher side. The only observable difference would be that
relatively large prepared transactions may experience noticeable
additional delays. IMHO I don't think it's a good practice
protocol-wise to intentionally choke a stream at the receiving end
when it has not been flow-controlled on the transmitting end.
But in this proposal, we are not choking/delaying anything on the receiving end.
I didn't say that to the latest patch. I interpreted the quote of
your description as saying that the subscriber-side solution is
effective in solving the long-lock problems, so I replied that that
can be solved with the publisher-side solution and the subscriber-side
solution could cause some unwanted behavior.
Do you think we have decided to go with the publisher-side solution?
I'm fine if so.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Wed, Mar 1, 2023 at 12:51 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Dear Amit,
Few comments:
Thank you for reviewing! PSA new version.
Note that the starting point of delay for 2PC was not changed,
I think it has been under discussion.
1.
+ /*
+ * If we've requested to shut down, exit the process.
+ *
+ * Note that WalSndDone() cannot be used here because the delaying
+ * changes will be sent in the function.
+ */
+ if (got_STOPPING)
+ {
+ QueryCompletion qc;
+
+ /* Inform the standby that XLOG streaming is done */
+ SetQueryCompletion(&qc, CMDTAG_COPY, 0);
+ EndCommand(&qc, DestRemote, false);
+ pq_flush();
Do we really need to do anything except for breaking the loop and let
the exit handling happen in the main loop when 'got_STOPPING' is set?
AFAICS, this is what we are doing in some other places (See
WalSndWaitForWal). Won't that work? It seems that will help us sending
all the pending WAL.
If we exit the loop after got_STOPPING is set, as you said, the walsender will
send the delayed changes and then exit. The behavior is the same as when WalSndDone()
is called. But I think that is not suitable for the motivation of the feature.
If users notice a mistaken operation like TRUNCATE, they must shut down the publisher
once and then recover from a backup or an old subscriber. If the walsender sends all
pending changes, the mistaken operations will also be propagated to the subscriber and
the data cannot be protected. So currently I want to keep the style.
FYI - In the case of physical replication, received WAL is not applied when the
secondary is shut down.
2.
+ /* Try to flush pending output to the client */
+ if (pq_flush_if_writable() != 0)
+ WalSndShutdown();
Is there a reason to try flushing here?
IIUC if pq_flush_if_writable() returns non-zero (EOF), it means that there is
trouble and the walsender failed to send messages to the subscriber.
In Linux, the stack trace from pq_flush_if_writable() will finally reach the send() system call.
And according to the man page[1], it will be triggered by some unexpected state or when the connection is closed.
Based on the above, I think the returned value should be checked.
Apart from the above, I have made a few changes in the comments in the
attached diff patch. If you agree with those then please include them
in the next version.
Thanks! I checked and I thought all of them should be included.
Moreover, I used a grammar checker and slightly reworded the commit message.
Thinking of side effects of this feature (no matter where we delay
applying the changes), on the publisher, vacuum cannot collect garbage
and WAL cannot be recycled. Is that okay in the first place? The point
is that the subscription setting affects the publisher. That is,
min_send_delay is specified on the subscriber but the symptoms that
could ultimately lead to a server crash appear on the publisher, which
sounds dangerous to me.
Imagine a service or system where there is a publication server
and it's somewhat exposed so that a user (or a subsystem) arbitrarily
can create a subscriber to replicate a subset of the data. A malicious
user can have the publisher crash by creating a subscription with,
say, min_send_delay = 20d. max_slot_wal_keep_size helps this situation
but it's -1 by default.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Wed, Mar 1, 2023 at 8:06 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Tue, 28 Feb 2023 08:35:11 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
On Tue, Feb 28, 2023 at 8:14 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Mon, 27 Feb 2023 14:56:19 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
The one difference w.r.t recovery_min_apply_delay is that here we will
hold locks for the duration of the delay which didn't seem to be a
good idea. This will also probably lead to more bloat as we will keep
transactions open for a long time. Doing it before DecodePrepare won't
I don't have a concrete picture but could we tell reorder buffer to
retain a PREPAREd transaction until a COMMIT PREPARED comes?
Yeah, we could do that and that is what is the behavior unless the
user enables 2PC via 'two_phase' subscription option. But, I don't see
the need to unnecessarily delay the prepare till the commit if a user
has specified 'two_phase' option. It is quite possible that COMMIT
PREPARED happens at a much later time frame than the amount of delay
the user is expecting.
It looks like the user should decide between potential long locks or
extra delays, and this choice ought to be documented.
Sure, we can do that. However, I think the way this feature works is
that we keep standby/subscriber behind the primary/publisher by a
certain time period and if there is any unwanted transaction (say
Delete * .. without where clause), we can recover it from the receiver
side. So, it may not matter much even if we wait at PREPARE to avoid
long locks instead of documenting it.
If
delaying non-prepared transactions until COMMIT is adequate, then the
same thing seems to work for prepared transactions.
have such problems. This is the reason that we decide to perform a
delay at the start of the transaction instead at commit/prepare in the
subscriber-side approach.
It seems that there are no technical obstacles to do that on the
publisher side. The only observable difference would be that
relatively large prepared transactions may experience noticeable
additional delays. IMHO I don't think it's a good practice
protocol-wise to intentionally choke a stream at the receiving end
when it has not been flow-controlled on the transmitting end.
But in this proposal, we are not choking/delaying anything on the receiving end.
I didn't say that to the latest patch. I interpreted the quote of
your description as saying that the subscriber-side solution is
effective in solving the long-lock problems, so I replied that that
can be solved with the publisher-side solution and the subscriber-side
solution could cause some unwanted behavior.
Do you think we have decided to go with the publisher-side solution?
I'm fine if so.
I am fine too unless we discover any major challenges with
publisher-side implementation.
--
With Regards,
Amit Kapila.
On Wed, Mar 1, 2023 at 8:18 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Mar 1, 2023 at 12:51 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Thinking of side effects of this feature (no matter where we delay
applying the changes), on the publisher, vacuum cannot collect garbage
and WAL cannot be recycled. Is that okay in the first place? The point
is that the subscription setting affects the publisher. That is,
min_send_delay is specified on the subscriber but the symptoms that
could ultimately lead to a server crash appear on the publisher, which
sounds dangerous to me.
Imagine a service or system where there is a publication server
and it's somewhat exposed so that a user (or a subsystem) arbitrarily
can create a subscriber to replicate a subset of the data. A malicious
user can have the publisher crash by creating a subscription with,
say, min_send_delay = 20d. max_slot_wal_keep_size helps this situation
but it's -1 by default.
By publisher crash, do you mean due to the disk full situation, it can
lead the publisher to stop/panic? Can't a malicious user block the
replication in other ways as well and let the publisher stall (or
crash the publisher) even without setting min_send_delay? Basically,
one needs to either disable the subscription or create a
constraint-violating row in the table to make that happen. If the
system is exposed for arbitrarily allowing the creation of a
subscription then a malicious user can create a subscription similar
to one existing subscription and block the replication due to
constraint violations. I don't think it would be so easy to bypass the
current system that a malicious user will be allowed to create/alter
subscriptions arbitrarily. Similarly, if there is a network issue
(unreachable or slow), one will see similar symptoms. I think
retention of data and WAL on publisher do rely on acknowledgment from
subscribers and delay in that due to any reason can lead to the
symptoms you describe above. We have documented at least one such case
already where during Drop Subscription, if the network is not
reachable then also, a similar problem can happen and users need to be
careful about it [1].
[1]: https://www.postgresql.org/docs/devel/logical-replication-subscription.html
--
With Regards,
Amit Kapila.
On Tue, Feb 28, 2023 at 9:21 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
1.
+ /*
+ * If we've requested to shut down, exit the process.
+ *
+ * Note that WalSndDone() cannot be used here because the delaying
+ * changes will be sent in the function.
+ */
+ if (got_STOPPING)
+ {
+ QueryCompletion qc;
+
+ /* Inform the standby that XLOG streaming is done */
+ SetQueryCompletion(&qc, CMDTAG_COPY, 0);
+ EndCommand(&qc, DestRemote, false);
+ pq_flush();
Do we really need to do anything except for breaking the loop and let
the exit handling happen in the main loop when 'got_STOPPING' is set?
AFAICS, this is what we are doing in some other places (See
WalSndWaitForWal). Won't that work? It seems that will help us sending
all the pending WAL.
If we exit the loop after got_STOPPING is set, as you said, the walsender will
send the delayed changes and then exit. The behavior is the same as when WalSndDone()
is called. But I think that is not suitable for the motivation of the feature.
If users notice a mistaken operation like TRUNCATE, they must shut down the publisher
once and then recover from a backup or an old subscriber. If the walsender sends all
pending changes, the mistaken operations will also be propagated to the subscriber and
the data cannot be protected. So currently I want to keep the style.
FYI - In the case of physical replication, received WAL is not applied when the
secondary is shut down.
Fair point but I think the current comment should explain why we are
doing something different here. How about extending the existing
comments to something like: "If we've requested to shut down, exit the
process. This is unlike handling at other places where we allow
complete WAL to be sent before shutdown because we don't want the
delayed transactions to be applied downstream. This will allow one to
use the data from downstream in case of some unwanted operations on
the current node."
--
With Regards,
Amit Kapila.
On Wed, Mar 1, 2023 at 1:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Mar 1, 2023 at 8:18 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Mar 1, 2023 at 12:51 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Thinking of side effects of this feature (no matter where we delay
applying the changes), on the publisher, vacuum cannot collect garbage
and WAL cannot be recycled. Is that okay in the first place? The point
is that the subscription setting affects the publisher. That is,
min_send_delay is specified on the subscriber but the symptoms that
could ultimately lead to a server crash appear on the publisher, which
sounds dangerous to me.
Imagine a service or system where there is a publication server
and it's somewhat exposed so that a user (or a subsystem) arbitrarily
can create a subscriber to replicate a subset of the data. A malicious
user can have the publisher crash by creating a subscription with,
say, min_send_delay = 20d. max_slot_wal_keep_size helps this situation
but it's -1 by default.
By publisher crash, do you mean due to the disk full situation, it can
lead the publisher to stop/panic?
Exactly.
Can't a malicious user block the
replication in other ways as well and let the publisher stall (or
crash the publisher) even without setting min_send_delay? Basically,
one needs to either disable the subscription or create a
constraint-violating row in the table to make that happen. If the
system is exposed for arbitrarily allowing the creation of a
subscription then a malicious user can create a subscription similar
to one existing subscription and block the replication due to
constraint violations. I don't think it would be so easy to bypass the
current system that a malicious user will be allowed to create/alter
subscriptions arbitrarily.
Right. But a difference is that with min_send_delay, one only needs to
create a subscription.
Similarly, if there is a network issue
(unreachable or slow), one will see similar symptoms. I think
retention of data and WAL on publisher do rely on acknowledgment from
subscribers and delay in that due to any reason can lead to the
symptoms you describe above.
I think that piling up WAL files due to a slow network is a different
story since it's a problem not only on the subscriber side.
We have documented at least one such case
already where during Drop Subscription, if the network is not
reachable then also, a similar problem can happen and users need to be
careful about it [1].
Apart from a bad-use case example I mentioned, in general, piling up
WAL files due to the replication slot has many bad effects on the
system. I'm concerned that the side effect of this feature (at least
of the current design) is too huge compared to the benefit, and afraid
that users might end up using this feature without understanding the
side effect well. It might be okay if we thoroughly document it but
I'm not sure.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Wed, Mar 1, 2023 at 10:57 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Mar 1, 2023 at 1:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Can't a malicious user block the
replication in other ways as well and let the publisher stall (or
crash the publisher) even without setting min_send_delay? Basically,
one needs to either disable the subscription or create a
constraint-violating row in the table to make that happen. If the
system is exposed for arbitrarily allowing the creation of a
subscription then a malicious user can create a subscription similar
to one existing subscription and block the replication due to
constraint violations. I don't think it would be so easy to bypass the
current system that a malicious user will be allowed to create/alter
subscriptions arbitrarily.
Right. But a difference is that with min_send_delay, one only needs to
create a subscription.
But, currently, only superusers would be allowed to create
subscriptions. Even, if we change it and allow it based on some
pre-defined role, it won't be allowed to create a subscription
arbitrarily. So, not sure, if any malicious user can easily bypass it
as you are envisioning it.
--
With Regards,
Amit Kapila.
Dear Sawada-san,
Thank you for giving your consideration!
We have documented at least one such case
already where during Drop Subscription, if the network is not
reachable then also, a similar problem can happen and users need to be
careful about it [1].
Apart from a bad-use case example I mentioned, in general, piling up
WAL files due to the replication slot has many bad effects on the
system. I'm concerned that the side effect of this feature (at least
of the current design) is too huge compared to the benefit, and afraid
that users might end up using this feature without understanding the
side effect well. It might be okay if we thoroughly document it but
I'm not sure.
One approach is to change max_slot_wal_keep_size forcibly when min_send_delay
is set. But that may lead to disabling the slot, because WAL needed by the time-delayed
replication may also be removed. We cannot pick just the right value because
it depends heavily on min_send_delay and the workload.
How about throwing a WARNING when min_send_delay > 0 but
max_slot_wal_keep_size < 0? Unlike the previous version, the subscription
parameter min_send_delay is now sent to the publisher. Therefore, we can compare
min_send_delay and max_slot_wal_keep_size when the publisher receives the parameter.
Of course we could reject such a setup by using ereport(ERROR), but that may leave
an abandoned replication slot, because we send the parameter at START_REPLICATION
and the slot has already been created by then.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
On Tue, 28 Feb 2023 at 21:21, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Dear Amit,
Few comments:
Thank you for reviewing! PSA new version.
Thanks for the updated patch, few comments:
1) Currently we have added the delay during the decoding of the commit;
while it waits there, the walsender process stops decoding any
further transactions until the delay is completed. It is possible
that a lot of transactions happen in parallel, so
there will be a lot of transactions to be decoded after the delay is
completed.
Would it be possible to keep decoding any WAL that is generated instead
of staying idle in the meantime? I'm not sure if this is feasible; I'm just
throwing out the thought to see if it might be possible.
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -676,6 +676,15 @@ DecodeCommit(LogicalDecodingContext *ctx,
XLogRecordBuffer *buf,
buf->origptr, buf->endptr);
}
+ /*
+ * Delay sending the changes if required. For streaming transactions,
+ * this means a delay in sending the last stream but that is OK
+ * because on the downstream the changes will be applied only after
+ * receiving the last stream.
+ */
+ if (ctx->min_send_delay > 0 && ctx->delay_send)
+ ctx->delay_send(ctx, xid, commit_time);
+
2) Generally, single-line comments are not terminated by "."; the
comment "/* Sleep until appropriate time. */" can be changed
accordingly:
+
+ /* Sleep until appropriate time. */
+ timeout_sleeptime_ms = WalSndComputeSleeptime(now);
+
+ elog(DEBUG2, "time-delayed replication for txid %u, delay_time = %d ms, remaining wait time: %ld ms",
+ xid, (int) ctx->min_send_delay, remaining_wait_time_ms);
+
+ /* Sleep until we get reply from worker or we time out */
+ WalSndWait(WL_SOCKET_READABLE,
3) In some places we refer to it as min_send_delay and in other places
as time-delayed replication; we can keep the comments
consistent by using the same wording.
+-- fail - specifying streaming = parallel with time-delayed replication is not
+-- supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_send_delay = 123);
+-- fail - alter subscription with streaming = parallel should fail when
+-- time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+-- fail - alter subscription with min_send_delay should fail when
+-- streaming = parallel is set
4) Since the value is stored in ms, we need not add ms again as the
default unit is ms:
@@ -4686,6 +4694,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s",
fmtId(subinfo->subsynccommit));
+ if (subinfo->subminsenddelay > 0)
+ appendPQExpBuffer(query, ", min_send_delay = '%d ms'", subinfo->subminsenddelay);
+
5) We can use the new error reporting style:
5.a) brackets around errcode can be removed
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_send_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
5.b) Similarly here too;
+ if (result < 0 || result > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_send_delay",
+ 0, PG_INT32_MAX)));
5.c) Similarly here too;
+ delay_val = strtoul(strVal(defel->arg), &endptr, 10);
+ if (errno != 0 || *endptr != '\0')
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid min_send_delay")));
5.d) Similarly here too;
+ if (delay_val > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("min_send_delay \"%s\" out of range",
+ strVal(defel->arg))));
6) This can be changed to a single line comment:
+ /*
+ * Parse given string as parameter which has millisecond unit
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
7) In the expected message we specifically mention "for non-streaming
transaction". Is the behavior different for streaming transactions? If
not, we can change the message accordingly:
+# The publisher waits for the replication to complete
+$node_publisher->wait_for_catchup('tap_sub_renamed');
+
+# This test is successful only if at least the configured delay has elapsed.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies changes only after replication delay for non-streaming transaction"
+);
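As a rough illustration of the checks quoted in 5.c and 5.d, the strtoul()-based validation can be sketched as a standalone function. This is only a sketch under assumptions: parse_delay_ms is a hypothetical name, not something from the patch, and it returns false where the patch would call ereport(ERROR). Two details are easy to miss: errno has to be cleared before the call, because strtoul() only ever sets it on failure, and strtoul() accepts a leading '-', so negative input has to be rejected separately.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of the patch): parse a plain decimal
 * millisecond count and accept it only if it fits in a non-negative int32.
 * Returns false where the patch would raise an error.
 */
static bool
parse_delay_ms(const char *str, int32_t *delay_ms)
{
	char	   *endptr;
	unsigned long val;

	/* Require a leading digit; this rejects "", signs and whitespace. */
	if (str == NULL || !isdigit((unsigned char) str[0]))
		return false;

	errno = 0;					/* strtoul() never resets errno itself */
	val = strtoul(str, &endptr, 10);

	/* Overflow of unsigned long, or trailing garbage such as "12ms". */
	if (errno != 0 || *endptr != '\0')
		return false;

	/* Same upper bound as the PG_INT32_MAX check in 5.d. */
	if (val > INT32_MAX)
		return false;

	*delay_ms = (int32_t) val;
	return true;
}
```

With this shape, inputs such as "foo" and "-1" from the regression tests are rejected, while "123" is accepted as 123 ms.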
Regards,
Vignesh
On Wed, Mar 1, 2023 at 6:21 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Dear Sawada-san,
Thank you for giving your consideration!
We have documented at least one such case
already where during Drop Subscription, if the network is not
reachable then also, a similar problem can happen and users need to be
careful about it [1].
Apart from a bad-use case example I mentioned, in general, piling up
WAL files due to the replication slot has many bad effects on the
system. I'm concerned that the side effect of this feature (at least
of the current design) is too huge compared to the benefit, and afraid
that users might end up using this feature without understanding the
side effect well. It might be okay if we thoroughly document it but
I'm not sure.
One approach is to change max_slot_wal_keep_size forcibly when min_send_delay
is set. But that may lead to disabling the slot, because WAL needed by the time-delayed
replication may also be removed. We cannot pick just the right value because
it depends heavily on min_send_delay and the workload.
How about throwing a WARNING when min_send_delay > 0 but
max_slot_wal_keep_size < 0? Unlike the previous version, the subscription
parameter min_send_delay is now sent to the publisher. Therefore, we can compare
min_send_delay and max_slot_wal_keep_size when the publisher receives the parameter.
Since max_slot_wal_keep_size can be changed by reloading the config
file, would each walsender also emit the warning at that time? Not sure it's
helpful. I think it's a legitimate use case to set min_send_delay > 0
and max_slot_wal_keep_size = -1, and users might not even notice the
WARNING message.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Thu, Mar 2, 2023 at 7:38 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Mar 1, 2023 at 6:21 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Apart from a bad-use case example I mentioned, in general, piling up
WAL files due to the replication slot has many bad effects on the
system. I'm concerned that the side effect of this feature (at least
of the current design) is too huge compared to the benefit, and afraid
that users might end up using this feature without understanding the
side effect well. It might be okay if we thoroughly document it but
I'm not sure.
One approach is to change max_slot_wal_keep_size forcibly when min_send_delay
is set. But that may lead to disabling the slot, because WAL needed by the time-delayed
replication may also be removed. We cannot pick just the right value because
it depends heavily on min_send_delay and the workload.
How about throwing a WARNING when min_send_delay > 0 but
max_slot_wal_keep_size < 0? Unlike the previous version, the subscription
parameter min_send_delay is now sent to the publisher. Therefore, we can compare
min_send_delay and max_slot_wal_keep_size when the publisher receives the parameter.
Since max_slot_wal_keep_size can be changed by reloading the config
file, would each walsender also emit the warning at that time?
I think Kuroda-San wants to emit a WARNING at the time of CREATE
SUBSCRIPTION. But it won't be possible to emit a WARNING at the time
of ALTER SUBSCRIPTION. Also, as you say if the user later changes the
value of max_slot_wal_keep_size, then even if we issue LOG/WARNING in
walsender, it may go unnoticed. If we really want to give WARNING for
this then we can probably give it as soon as user has set non-default
value of min_send_delay to indicate that this can lead to retaining
WAL on the publisher and they should consider setting
max_slot_wal_keep_size.
Having said that, I think users can always tune max_slot_wal_keep_size
and min_send_delay (as none of these requires restart) if they see any
indication of unexpected WAL size growth. There could be multiple ways
to check it but I think one can refer wal_status in
pg_replication_slots, the extended value can be an indicator of this.
Not sure it's
helpful. I think it's a legitimate use case to set min_send_delay > 0
and max_slot_wal_keep_size = -1, and users might not even notice the
WARNING message.
I think it would be better to tell about this in the docs along with
the 'min_send_delay' description. The key point is whether this would
be an acceptable trade-off for users who want to use this feature. I
think it can harm only if users use this without understanding the
corresponding trade-off. As we kept the default to no delay, users
of this feature are expected to understand the trade-off.
--
With Regards,
Amit Kapila.
Dear Amit, Sawada-san,
I think Kuroda-San wants to emit a WARNING at the time of CREATE
SUBSCRIPTION. But it won't be possible to emit a WARNING at the time
of ALTER SUBSCRIPTION. Also, as you say if the user later changes the
value of max_slot_wal_keep_size, then even if we issue LOG/WARNING in
walsender, it may go unnoticed. If we really want to give WARNING for
this then we can probably give it as soon as user has set non-default
value of min_send_delay to indicate that this can lead to retaining
WAL on the publisher and they should consider setting
max_slot_wal_keep_size.
Yeah, my motivation is to emit a WARNING at CREATE SUBSCRIPTION, but I had not noticed
that the approach does not cover ALTER SUBSCRIPTION.
Having said that, I think users can always tune max_slot_wal_keep_size
and min_send_delay (as none of these requires restart) if they see any
indication of unexpected WAL size growth. There could be multiple ways
to check it but I think one can refer to wal_status in
pg_replication_slots; the extended value can be an indicator of this.
Yeah, min_send_delay and max_slot_wal_keep_size should be easily tunable because
the appropriate values depend on the environment and workload.
However, pg_replication_slots.wal_status cannot show the exact amount of WAL,
so it may not be suitable for tuning. I think users can compare
pg_replication_slots.restart_lsn (or pg_stat_replication.sent_lsn) with
pg_current_wal_lsn() to calculate the amount of WAL being delayed, like
```
postgres=# select pg_current_wal_lsn() - pg_replication_slots.restart_lsn as delayed from pg_replication_slots;
delayed
------------
1689153760
(1 row)
```
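To make the arithmetic behind that query explicit, here is a minimal C sketch of the LSN subtraction. An LSN is printed as two hexadecimal halves of a 64-bit WAL position, so the difference is a byte count. The helper names are hypothetical and this is not how the server implements the pg_lsn minus operator; it only mirrors the result for well-formed input.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Parse an LSN in its textual "XXXXXXXX/XXXXXXXX" hexadecimal form into a
 * 64-bit WAL position. Returns 0 on success, -1 on malformed input.
 */
static int
parse_lsn(const char *text, uint64_t *lsn)
{
    unsigned int hi;
    unsigned int lo;
    char         trailing;

    assert(text != NULL && lsn != NULL);

    /* Exactly two hex fields must match; a third conversion means junk. */
    if (sscanf(text, "%X/%X%c", &hi, &lo, &trailing) != 2)
        return -1;

    *lsn = ((uint64_t) hi << 32) | lo;
    return 0;
}

/*
 * Store in *diff the number of WAL bytes between restart_lsn and
 * current_lsn, as the SQL expression
 *   pg_current_wal_lsn() - restart_lsn
 * would report. Returns 0 on success, -1 if either LSN is malformed.
 */
static int
wal_bytes_behind(const char *current_lsn, const char *restart_lsn, int64_t *diff)
{
    uint64_t cur;
    uint64_t restart;

    if (parse_lsn(current_lsn, &cur) != 0 || parse_lsn(restart_lsn, &restart) != 0)
        return -1;

    *diff = (int64_t) (cur - restart);
    return 0;
}
```

With made-up LSNs chosen to reproduce the figure in the output above, a current position of 0/65AE70E0 and a restart_lsn of 0/1000000 give 1689153760 bytes, roughly 1.6 GB of WAL held back.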
I think it would be better to tell about this in the docs along with
the 'min_send_delay' description. The key point is whether this would
be an acceptable trade-off for users who want to use this feature. I
think it can harm only if users use this without understanding the
corresponding trade-off. As we kept the default to no delay, users of
this feature are expected to understand the trade-off.
Yes, the trade-off should be emphasized.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Dear Peter,
Thank you for reviewing! PSA new version.
1.
Nitpick. The new text is jagged-looking. It should wrap at ~80 chars.
Addressed.
2.
2. Another reason is for that parallel streaming, the transaction will be opened
immediately by the parallel apply worker. Therefore, if the walsender
is delayed in sending the final record of the transaction, the
parallel apply worker must wait to receive it with an open
transaction. This would result in the locks acquired during the
transaction not being released until the min_send_delay has elapsed.~
The text already said there are "two reasons", and already this is
numbered as reason 2. So it doesn't need to keep saying "Another
reason" here.
"Another reason is for that parallel streaming" --> "For parallel streaming..."
Changed.
======
src/backend/replication/walsender.c
3. WalSndDelay
+ /* die if timeout was reached */
+ WalSndCheckTimeOut();
Other nearby comments start uppercase, so this should too.
I just picked it from another part of the file, where the comments are lowercase, but fixed.
======
src/include/replication/walreceiver.h
4. WalRcvStreamOptions
@@ -187,6 +187,7 @@ typedef struct
* prepare time */
char *origin; /* Only publish data originating from the
* specified origin */
+ int32 min_send_delay; /* The minimum send delay */
} logical;
} proto;
} WalRcvStreamOptions;
~
Should that comment mention that the units are "(ms)"?
Added.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Attachments:
v10-0001-Time-delayed-logical-replication-on-publisher-si.patch (application/octet-stream)
From 2e21bc932b683e44f2a914fa8691a7b17f2fbf31 Mon Sep 17 00:00:00 2001
From: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Date: Thu, 16 Feb 2023 07:52:23 +0000
Subject: [PATCH v10] Time-delayed logical replication on publisher side
Similar to physical replication, a time-delayed copy of the data for logical
replication can be useful in certain scenarios, particularly to fix errors that
may cause data loss.
This patch introduces a new subscription parameter called 'min_send_delay'. If
this parameter is set, the apply worker (via walrcv_startstreaming) passes the
value to the publisher as an output plugin option. The walsender will then delay
sending the transaction for the given number of milliseconds. It is important to
note that this delay does not take into account the time spent transferring the
transaction, which means that the arrival time at the subscriber may be delayed
further.
The delay occurs before we start sending the transaction on the publisher, and
both regular and prepared transactions are covered. Streamed transactions are
also covered.
The combination of parallel streaming mode and min_send_delay is not allowed.
There are two reasons for this:
1. In parallel streaming mode, we start applying the transaction stream as soon
as the first change arrives, without knowing the transaction's prepare/commit
time. Always waiting for the full 'min_send_delay' period may result in
unnecessary delay.
2. For parallel streaming, the transaction will be opened immediately by the
parallel apply worker. Therefore, if the walsender is delayed in sending the
final record of the transaction, the parallel apply worker must wait to receive
it with an open transaction. This would result in the locks acquired during the
transaction not being released until the min_send_delay has elapsed.
Earlier versions were written by Euler Taveira, Takamichi Osumi, and Kuroda Hayato
Author: Kuroda Hayato, Takamichi Osumi
Reviewed-by: Amit Kapila, Peter Smith, Vignesh C, Shveta Malik,
Kyotaro Horiguchi, Shi Yu, Wang Wei, Dilip Kumar, Melih Mutlu,
Andres Freund
---
doc/src/sgml/catalogs.sgml | 10 +
doc/src/sgml/glossary.sgml | 15 ++
doc/src/sgml/logical-replication.sgml | 6 +
doc/src/sgml/monitoring.sgml | 5 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 68 ++++++-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 116 ++++++++++-
.../libpqwalreceiver/libpqwalreceiver.c | 5 +
src/backend/replication/logical/decode.c | 9 +
src/backend/replication/logical/logical.c | 18 +-
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/worker.c | 14 +-
src/backend/replication/pgoutput/pgoutput.c | 33 ++++
src/backend/replication/slotfuncs.c | 4 +-
src/backend/replication/walsender.c | 93 ++++++++-
src/backend/utils/activity/wait_event.c | 3 +
src/bin/pg_dump/pg_dump.c | 15 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/replication/logical.h | 18 +-
src/include/replication/pgoutput.h | 1 +
src/include/replication/walreceiver.h | 1 +
src/include/utils/wait_event.h | 3 +-
src/test/regress/expected/subscription.out | 185 +++++++++++-------
src/test/regress/sql/subscription.sql | 29 +++
src/test/subscription/t/001_rep_changes.pl | 27 +++
30 files changed, 606 insertions(+), 104 deletions(-)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..ce6521fe6c 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7892,6 +7892,16 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminsenddelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum delay, in milliseconds, by the publisher before sending all
+ the changes
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subenabled</structfield> <type>bool</type>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 7c01a541fe..9ede9d05f6 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1729,6 +1729,21 @@
</glossdef>
</glossentry>
+ <glossentry id="glossary-time-delayed-replication">
+ <glossterm>Time-delayed replication</glossterm>
+ <glossdef>
+ <para>
+ Replication setup that delays the application of changes by a specified
+ minimum time-delay period.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="guc-recovery-min-apply-delay"/> for physical replication
+ and <xref linkend="sql-createsubscription"/> for logical replication.
+ </para>
+ </glossdef>
+ </glossentry>
+
<glossentry id="glossary-toast">
<glossterm>TOAST</glossterm>
<glossdef>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 1bd5660c87..fe9e7f7b26 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -247,6 +247,12 @@
target table.
</para>
+ <para>
+ A subscription can delay the receipt of changes by specifying the
+ <literal>min_send_delay</literal> subscription parameter. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 6249bb50d0..7c47723153 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -2349,6 +2349,11 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
<entry>Waiting to acquire an exclusive lock to truncate off any
empty pages at the end of a table vacuumed.</entry>
</row>
+ <row>
+ <entry><literal>WalSenderSendDelay</literal></entry>
+ <entry>Waiting while sending changes for time-delayed logical replication
+ in the WAL sender process.</entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 964fcbb8ff..3f238b958b 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_send_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 51c45f17c7..bfb73d9990 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,68 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_send_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the publisher sends changes as soon as possible. This
+ parameter allows the user to delay changes by the given time period.
+ If the value is specified without units, it is taken as milliseconds.
+ The default is zero (no delay). See <xref linkend="config-setting-names-values"/>
+ for details on the available valid time units.
+ </para>
+ <para>
+ The delay is effective only when the initial table synchronization
+ has been finished. However, there is a possibility that the table
+ status updated in <link linkend="catalog-pg-subscription-rel"><structname>pg_subscription_rel</structname></link>
+ could be delayed in getting to the "ready" state, and also two-phase
+ (if specified) could be delayed in getting to "enabled".
+ </para>
+ <para>
+ The delay does not take into account the overhead of time spent
+ transferring the transaction. Therefore, the arrival time at the
+ subscriber may be delayed more than the specified
+ <literal>min_send_delay</literal> time.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication means there is a much longer time between
+ making a change on the publisher, and that change being committed
+ on the subscriber. This can impact the performance of synchronous
+ replication. See <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
+ <caution>
+ <para>
+ While delaying the sending of changes, there is a possibility that
+ WAL segment files cannot be recycled, which can eventually lead to
+ exceeding the available disk space on the publisher. This is
+ because the replication slot related to the subscription regards
+ these WALs as needed until they are sent to the subscriber and
+ flushed. To avoid this, the <varname>max_slot_wal_keep_size</varname>
+ on the publisher and <literal>min_send_delay</literal> must be
+ tuned, and these values depend on the machine environment and
+ workload. There are several methods to check the amount of delayed
+ WALs, and a typical way is to calculate it from the
+ <link linkend="view-pg-replication-slots"><structname>pg_replication_slots</structname></link>
+ system view and <function>pg_current_wal_lsn</function>. Following
+ shows the amounts of WALs that are delayed the sending in bytes.
+<programlisting>
+SELECT pg_current_wal_lsn() - pg_replication_slots.restart_lsn
+ AS "amount of delaying WALs" FROM pg_replication_slots;
+ amount of delaying WALs
+-------------------------
+ 1688964992
+(1 row)
+</programlisting>
+ </para>
+ </caution>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -420,6 +481,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_send_delay</literal> parameter is not allowed when
+ streaming in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..63a10b06d1 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -66,6 +66,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->skiplsn = subform->subskiplsn;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
+ sub->minsenddelay = subform->subminsenddelay;
sub->enabled = subform->subenabled;
sub->binary = subform->subbinary;
sub->stream = subform->substream;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 34ca0e739f..54a705d71b 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1314,9 +1314,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subminsenddelay,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 464db6d247..875709dcda 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_SEND_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int32 min_send_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -100,7 +102,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int32 defGetMinSendDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_SEND_DELAY))
+ opts->min_send_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_SEND_DELAY) &&
+ strcmp(defel->defname, "min_send_delay") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_SEND_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_SEND_DELAY;
+ opts->min_send_delay = defGetMinSendDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +417,30 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_send_delay is not
+ * allowed. This is because in parallel streaming mode, the walsender
+ * starts sending the transaction stream without knowing the
+ * prepare/commit time of the transaction. Always waiting for the full
+ * 'min_send_delay' time to send may introduce unnecessary delay.
+ *
+ * The other possibility was to wait sending COMMIT record of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_SEND_DELAY) &&
+ opts->min_send_delay > 0 &&
+ opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+
+ /*
+ * translator: the first %s is a string of the form "parameter > 0"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_send_delay > 0", "streaming = parallel"));
}
/*
@@ -560,7 +597,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_SEND_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -628,6 +666,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
+ values[Anum_pg_subscription_subminsenddelay - 1] = Int32GetDatum(opts.min_send_delay);
values[Anum_pg_subscription_subenabled - 1] = BoolGetDatum(opts.enabled);
values[Anum_pg_subscription_subbinary - 1] = BoolGetDatum(opts.binary);
values[Anum_pg_subscription_substream - 1] = CharGetDatum(opts.streaming);
@@ -1054,7 +1093,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_SEND_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1137,19 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_send_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
+ !IsSet(opts.specified_opts, SUBOPT_MIN_SEND_DELAY) &&
+ sub->minsenddelay > 0)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set parallel streaming mode for subscription with %s",
+ "min_send_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1163,26 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_SEND_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_send_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.min_send_delay > 0 &&
+ !IsSet(opts.specified_opts, SUBOPT_STREAMING) &&
+ sub->stream == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set %s for subscription in parallel streaming mode",
+ "min_send_delay"));
+
+ values[Anum_pg_subscription_subminsenddelay - 1] =
+ Int32GetDatum(opts.min_send_delay);
+ replaces[Anum_pg_subscription_subminsenddelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2195,3 +2267,41 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+/*
+ * Extract the min_send_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_send_delay
+ * accepts the same parameter format as recovery_min_apply_delay.
+ */
+static int32
+defGetMinSendDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /* Parse given string as parameter which has millisecond unit */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_send_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0);
+
+ /*
+ * Check both the lower boundary for the valid min_send_delay range and
+ * the upper boundary as the safeguard for some platforms where INT_MAX is
+ * wider than int32 respectively. Although parse_int() has confirmed that
+ * the result is less than or equal to INT_MAX, the value will be stored
+ * in a catalog column of int32.
+ */
+ if (result < 0 || result > PG_INT32_MAX)
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result, "min_send_delay", 0, PG_INT32_MAX));
+
+ return result;
+}
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 560ec974fa..89a72c1abe 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -443,6 +443,11 @@ libpqrcv_startstreaming(WalReceiverConn *conn,
PQserverVersion(conn->streamConn) >= 140000)
appendStringInfoString(&cmd, ", binary 'true'");
+ if (options->proto.logical.min_send_delay > 0 &&
+ PQserverVersion(conn->streamConn) >= 160000)
+ appendStringInfo(&cmd, ", min_send_delay '%d'",
+ options->proto.logical.min_send_delay);
+
appendStringInfoChar(&cmd, ')');
}
else
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 8fe7bb65f1..60f8f7a892 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -676,6 +676,15 @@ DecodeCommit(LogicalDecodingContext *ctx, XLogRecordBuffer *buf,
buf->origptr, buf->endptr);
}
+ /*
+ * Delay sending the changes if required. For streaming transactions, this
+ * means a delay in sending the last stream but that is OK because on the
+ * downstream the changes will be applied only after receiving the last
+ * stream.
+ */
+ if (ctx->min_send_delay > 0 && ctx->delay_send)
+ ctx->delay_send(ctx, xid, commit_time);
+
/*
* Send the final commit record if the transaction data is already
* decoded, otherwise, process the entire transaction.
diff --git a/src/backend/replication/logical/logical.c b/src/backend/replication/logical/logical.c
index c3ec97a0a6..e4dd822cdc 100644
--- a/src/backend/replication/logical/logical.c
+++ b/src/backend/replication/logical/logical.c
@@ -156,7 +156,8 @@ StartupDecodingContext(List *output_plugin_options,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress)
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send)
{
ReplicationSlot *slot;
MemoryContext context,
@@ -293,6 +294,7 @@ StartupDecodingContext(List *output_plugin_options,
ctx->prepare_write = prepare_write;
ctx->write = do_write;
ctx->update_progress = update_progress;
+ ctx->delay_send = delay_send;
ctx->output_plugin_options = output_plugin_options;
@@ -316,7 +318,7 @@ StartupDecodingContext(List *output_plugin_options,
* marking WAL reserved beforehand. In that scenario, it's up to the
* caller to guarantee that WAL remains available.
* xl_routine -- XLogReaderRoutine for underlying XLogReader
- * prepare_write, do_write, update_progress --
+ * prepare_write, do_write, update_progress, delay_send --
* callbacks that perform the use-case dependent, actual, work.
*
* Needs to be called while in a memory context that's at least as long lived
@@ -334,7 +336,8 @@ CreateInitDecodingContext(const char *plugin,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress)
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send)
{
TransactionId xmin_horizon = InvalidTransactionId;
ReplicationSlot *slot;
@@ -435,7 +438,7 @@ CreateInitDecodingContext(const char *plugin,
ctx = StartupDecodingContext(NIL, restart_lsn, xmin_horizon,
need_full_snapshot, false,
xl_routine, prepare_write, do_write,
- update_progress);
+ update_progress, delay_send);
/* call output plugin initialization callback */
old_context = MemoryContextSwitchTo(ctx->context);
@@ -475,7 +478,7 @@ CreateInitDecodingContext(const char *plugin,
* xl_routine
* XLogReaderRoutine used by underlying xlogreader
*
- * prepare_write, do_write, update_progress
+ * prepare_write, do_write, update_progress, delay_send
* callbacks that have to be filled to perform the use-case dependent,
* actual work.
*
@@ -493,7 +496,8 @@ CreateDecodingContext(XLogRecPtr start_lsn,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress)
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send)
{
LogicalDecodingContext *ctx;
ReplicationSlot *slot;
@@ -547,7 +551,7 @@ CreateDecodingContext(XLogRecPtr start_lsn,
ctx = StartupDecodingContext(output_plugin_options,
start_lsn, InvalidTransactionId, false,
fast_forward, xl_routine, prepare_write,
- do_write, update_progress);
+ do_write, update_progress, delay_send);
/* call output plugin initialization callback */
old_context = MemoryContextSwitchTo(ctx->context);
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index fa1b641a2b..960025197f 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -212,7 +212,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
.segment_open = wal_segment_open,
.segment_close = wal_segment_close),
LogicalOutputPrepareWrite,
- LogicalOutputWrite, NULL);
+ LogicalOutputWrite, NULL, NULL);
/*
* After the sanity checks in CreateDecodingContext, make sure the
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index cfb2ab6248..3cde6d37fa 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3898,7 +3898,8 @@ maybe_reread_subscription(void)
newsub->stream != MySubscription->stream ||
strcmp(newsub->origin, MySubscription->origin) != 0 ||
newsub->owner != MySubscription->owner ||
- !equal(newsub->publications, MySubscription->publications))
+ !equal(newsub->publications, MySubscription->publications) ||
+ newsub->minsenddelay != MySubscription->minsenddelay)
{
if (am_parallel_apply_worker())
ereport(LOG,
@@ -4617,9 +4618,20 @@ ApplyWorkerMain(Datum main_arg)
options.proto.logical.twophase = false;
options.proto.logical.origin = pstrdup(MySubscription->origin);
+ options.proto.logical.min_send_delay = 0;
if (!am_tablesync_worker())
{
+ /*
+ * We support time-delayed logical replication only for the apply
+ * worker. This is because if we support delay during the initial sync
+ * then once we reach the limit of tablesync workers it would impose a
+ * delay for each subsequent worker. That would cause initial table
+ * synchronization completion to take a long time.
+ */
+ if (server_version >= 160000 && MySubscription->minsenddelay > 0)
+ options.proto.logical.min_send_delay = MySubscription->minsenddelay;
+
/*
* Even when the two_phase mode is requested by the user, it remains
* as the tri-state PENDING until all tablesyncs have reached READY
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 0df1acbb7a..02bca8d380 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -285,6 +285,7 @@ parse_output_parameters(List *options, PGOutputData *data)
bool streaming_given = false;
bool two_phase_option_given = false;
bool origin_option_given = false;
+ bool min_send_delay_option_given = false;
data->binary = false;
data->streaming = LOGICALREP_STREAM_OFF;
@@ -396,6 +397,32 @@ parse_output_parameters(List *options, PGOutputData *data)
errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("unrecognized origin value: \"%s\"", data->origin));
}
+ else if (strcmp(defel->defname, "min_send_delay") == 0)
+ {
+ unsigned long delay_val;
+ char *endptr;
+
+ if (min_send_delay_option_given)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("conflicting or redundant options"));
+ min_send_delay_option_given = true;
+
+ errno = 0;
+ delay_val = strtoul(strVal(defel->arg), &endptr, 10);
+ if (errno != 0 || *endptr != '\0')
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid min_send_delay"));
+
+ if (delay_val > PG_INT32_MAX)
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("min_send_delay \"%s\" out of range",
+ strVal(defel->arg)));
+
+ data->min_send_delay = (int32) delay_val;
+ }
else
elog(ERROR, "unrecognized pgoutput option: %s", defel->defname);
}
@@ -502,6 +529,12 @@ pgoutput_startup(LogicalDecodingContext *ctx, OutputPluginOptions *opt,
else
ctx->twophase_opt_given = true;
+ /*
+ * Remember the delay time period to be used later before sending the
+ * changes.
+ */
+ ctx->min_send_delay = data->min_send_delay;
+
/* Init publication state. */
data->publications = NIL;
publications_valid = false;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 2f3c964824..522f7600a1 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -148,7 +148,7 @@ create_logical_replication_slot(char *name, char *plugin,
XL_ROUTINE(.page_read = read_local_xlog_page,
.segment_open = wal_segment_open,
.segment_close = wal_segment_close),
- NULL, NULL, NULL);
+ NULL, NULL, NULL, NULL);
/*
* If caller needs us to determine the decoding start point, do so now.
@@ -481,7 +481,7 @@ pg_logical_replication_slot_advance(XLogRecPtr moveto)
XL_ROUTINE(.page_read = read_local_xlog_page,
.segment_open = wal_segment_open,
.segment_close = wal_segment_close),
- NULL, NULL, NULL);
+ NULL, NULL, NULL, NULL);
/*
* Start reading at the slot's restart_lsn, which we know to point to
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 75e8363e24..b6f4cbfd4c 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -252,6 +252,7 @@ static void WalSndPrepareWrite(LogicalDecodingContext *ctx, XLogRecPtr lsn, Tran
static void WalSndWriteData(LogicalDecodingContext *ctx, XLogRecPtr lsn, TransactionId xid, bool last_write);
static void WalSndUpdateProgress(LogicalDecodingContext *ctx, XLogRecPtr lsn, TransactionId xid,
bool skipped_xact);
+static void WalSndDelay(LogicalDecodingContext *ctx, TransactionId xid, TimestampTz delay_start);
static XLogRecPtr WalSndWaitForWal(XLogRecPtr loc);
static void LagTrackerWrite(XLogRecPtr lsn, TimestampTz local_flush_time);
static TimeOffset LagTrackerRead(int head, XLogRecPtr lsn, TimestampTz now);
@@ -1126,7 +1127,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
.segment_open = WalSndSegmentOpen,
.segment_close = wal_segment_close),
WalSndPrepareWrite, WalSndWriteData,
- WalSndUpdateProgress);
+ WalSndUpdateProgress, WalSndDelay);
/*
* Signal that we don't need the timeout mechanism. We're just
@@ -1285,7 +1286,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
.segment_open = WalSndSegmentOpen,
.segment_close = wal_segment_close),
WalSndPrepareWrite, WalSndWriteData,
- WalSndUpdateProgress);
+ WalSndUpdateProgress, WalSndDelay);
xlogreader = logical_decoding_ctx->reader;
WalSndSetState(WALSNDSTATE_CATCHUP);
@@ -3849,3 +3850,91 @@ LagTrackerRead(int head, XLogRecPtr lsn, TimestampTz now)
Assert(time != 0);
return now - time;
}
+
+/*
+ * LogicalDecodingContext 'delay' callback.
+ *
+ * Wait long enough to make sure a transaction is sent at least
+ * min_send_delay milliseconds after it finished on the publisher.
+ *
+ * delay_start is the transaction end time.
+ */
+static void
+WalSndDelay(LogicalDecodingContext *ctx, TransactionId xid, TimestampTz delay_start)
+{
+ /* Apply the delay using the latch mechanism */
+ while (true)
+ {
+ TimestampTz now;
+ TimestampTz delayUntil;
+ long remaining_wait_time_ms;
+ long timeout_sleeptime_ms;
+
+ ResetLatch(MyLatch);
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* This might change wal_sender_timeout */
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ /* Check for input from the client */
+ ProcessRepliesIfAny();
+
+ /* Die if timeout was reached */
+ WalSndCheckTimeOut();
+
+ /* Send keepalive if the time has come */
+ WalSndKeepaliveIfNecessary();
+
+ /* Try to flush pending output to the client */
+ if (pq_flush_if_writable() != 0)
+ WalSndShutdown();
+
+ /*
+ * If we've been requested to shut down, exit the process.
+ *
+ * This is unlike handling at other places where we allow complete WAL
+ * to be sent before shutdown because we don't want the delayed
+ * transactions to be applied downstream. This will allow one to use
+ * the data from downstream in case of some unwanted operations on the
+ * current node.
+ */
+ if (got_STOPPING)
+ {
+ QueryCompletion qc;
+
+ /* Inform the subscriber that streaming is done */
+ SetQueryCompletion(&qc, CMDTAG_COPY, 0);
+ EndCommand(&qc, DestRemote, false);
+ pq_flush();
+
+ proc_exit(0);
+ }
+
+ now = GetCurrentTimestamp();
+ delayUntil = TimestampTzPlusMilliseconds(delay_start, ctx->min_send_delay);
+ remaining_wait_time_ms = TimestampDifferenceMilliseconds(now, delayUntil);
+
+ /*
+ * Exit without arming the latch if it's already past time to send
+ * this transaction.
+ */
+ if (remaining_wait_time_ms <= 0)
+ break;
+
+ /* Compute how long we may sleep before the next timeout check */
+ timeout_sleeptime_ms = WalSndComputeSleeptime(now);
+
+ elog(DEBUG2, "time-delayed replication for txid %u, delay_time = %d ms, remaining wait time: %ld ms",
+ xid, (int) ctx->min_send_delay, remaining_wait_time_ms);
+
+ /* Sleep until we get reply from worker or we time out */
+ WalSndWait(WL_SOCKET_READABLE,
+ Min(timeout_sleeptime_ms, remaining_wait_time_ms),
+ WAIT_EVENT_WALSENDER_SEND_DELAY);
+ }
+}
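
Reviewer note on the arithmetic in WalSndDelay(): each loop iteration computes the time left until delay_start + min_send_delay and sleeps for the smaller of that value and the walsender's own timeout sleep time. The following is a minimal sketch of just that calculation, assuming timestamps are plain microsecond counters; the names remaining_delay_ms and next_sleep_ms are hypothetical, and the sketch does not reproduce the PostgreSQL timestamp helpers the patch actually calls.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: milliseconds left until the transaction may be sent.
 * Timestamps are microseconds; the result is rounded up so that the caller
 * never wakes early, and it is 0 once the deadline has passed.
 */
static long
remaining_delay_ms(int64_t now_us, int64_t delay_start_us,
				   int32_t min_send_delay_ms)
{
	int64_t		delay_until_us;
	int64_t		diff_us;
	int64_t		diff_ms;

	assert(min_send_delay_ms >= 0);

	delay_until_us = delay_start_us + (int64_t) min_send_delay_ms * 1000;
	diff_us = delay_until_us - now_us;
	if (diff_us <= 0)
		return 0;

	diff_ms = (diff_us + 999) / 1000;
	/* Clamp so the value fits into a 32-bit long as well. */
	return (long) (diff_ms > INT32_MAX ? INT32_MAX : diff_ms);
}

/*
 * Hypothetical helper: never sleep longer than the keepalive/timeout
 * bookkeeping allows, mirroring the Min() in the WalSndWait() call.
 */
static long
next_sleep_ms(long remaining_ms, long timeout_sleeptime_ms)
{
	return remaining_ms < timeout_sleeptime_ms ? remaining_ms : timeout_sleeptime_ms;
}
```

With a 5000 ms delay and a transaction that ended one second ago, the first helper reports 4000 ms remaining; if the timeout sleep time is 1000 ms, the loop would wake up about four times before breaking out.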
diff --git a/src/backend/utils/activity/wait_event.c b/src/backend/utils/activity/wait_event.c
index cb99cc6339..76c19fe11d 100644
--- a/src/backend/utils/activity/wait_event.c
+++ b/src/backend/utils/activity/wait_event.c
@@ -515,6 +515,9 @@ pgstat_get_wait_timeout(WaitEventTimeout w)
case WAIT_EVENT_VACUUM_TRUNCATE:
event_name = "VacuumTruncate";
break;
+ case WAIT_EVENT_WALSENDER_SEND_DELAY:
+ event_name = "WalSenderSendDelay";
+ break;
/* no default case, so that compiler will warn */
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 24ba936332..9aea3a6103 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4493,6 +4493,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminsenddelay;
int i,
ntups;
@@ -4545,9 +4546,13 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminsenddelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n"
+ " 0 AS subminsenddelay\n",
+ LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4575,6 +4580,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminsenddelay = PQfnumber(res, "subminsenddelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4605,6 +4611,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminsenddelay =
+ atoi(PQgetvalue(res, i, i_subminsenddelay));
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4686,6 +4694,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminsenddelay > 0)
+ appendPQExpBuffer(query, ", min_send_delay = '%d'", subinfo->subminsenddelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index cdca0b993d..4c55f8efc4 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -660,6 +660,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int subminsenddelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c8a0bb7b3a..c7d303a168 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6472,7 +6472,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6527,10 +6527,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_send_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminsenddelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min send delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5e1882eaea..6643db6f55 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_send_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_send_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..69ae4314b4 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -74,6 +74,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
+ int32 subminsenddelay; /* Replication send delay (ms) */
+
bool subenabled; /* True if the subscription is enabled (the
* worker should be running) */
@@ -122,6 +124,7 @@ typedef struct Subscription
* skipped */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
+ int32 minsenddelay; /* Replication send delay (ms) */
bool enabled; /* Indicates if the subscription is enabled */
bool binary; /* Indicates if the subscription wants data in
* binary format */
diff --git a/src/include/replication/logical.h b/src/include/replication/logical.h
index 5f49554ea0..c389dc17e5 100644
--- a/src/include/replication/logical.h
+++ b/src/include/replication/logical.h
@@ -30,6 +30,11 @@ typedef void (*LogicalOutputPluginWriterUpdateProgress) (struct LogicalDecodingC
bool skipped_xact
);
+typedef void (*LogicalOutputPluginWriterDelay) (struct LogicalDecodingContext *lr,
+ TransactionId xid,
+ TimestampTz start_time
+);
+
typedef struct LogicalDecodingContext
{
/* memory context this is all allocated in */
@@ -64,6 +69,7 @@ typedef struct LogicalDecodingContext
LogicalOutputPluginWriterPrepareWrite prepare_write;
LogicalOutputPluginWriterWrite write;
LogicalOutputPluginWriterUpdateProgress update_progress;
+ LogicalOutputPluginWriterDelay delay_send;
/*
* Output buffer.
@@ -100,6 +106,12 @@ typedef struct LogicalDecodingContext
*/
bool twophase_opt_given;
+ /*
+ * The minimum delay, in milliseconds, that the publisher waits before
+ * sending the changes.
+ */
+ int32 min_send_delay;
+
/*
* State for writing output.
*/
@@ -121,14 +133,16 @@ extern LogicalDecodingContext *CreateInitDecodingContext(const char *plugin,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress);
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send);
extern LogicalDecodingContext *CreateDecodingContext(XLogRecPtr start_lsn,
List *output_plugin_options,
bool fast_forward,
XLogReaderRoutine *xl_routine,
LogicalOutputPluginWriterPrepareWrite prepare_write,
LogicalOutputPluginWriterWrite do_write,
- LogicalOutputPluginWriterUpdateProgress update_progress);
+ LogicalOutputPluginWriterUpdateProgress update_progress,
+ LogicalOutputPluginWriterDelay delay_send);
extern void DecodingContextFindStartpoint(LogicalDecodingContext *ctx);
extern bool DecodingContextReady(LogicalDecodingContext *ctx);
extern void FreeDecodingContext(LogicalDecodingContext *ctx);
diff --git a/src/include/replication/pgoutput.h b/src/include/replication/pgoutput.h
index b4a8015403..d2fde09e00 100644
--- a/src/include/replication/pgoutput.h
+++ b/src/include/replication/pgoutput.h
@@ -30,6 +30,7 @@ typedef struct PGOutputData
bool messages;
bool two_phase;
char *origin;
+ int32 min_send_delay;
} PGOutputData;
#endif /* PGOUTPUT_H */
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index decffe352d..271797c7b1 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -187,6 +187,7 @@ typedef struct
* prepare time */
char *origin; /* Only publish data originating from the
* specified origin */
+ int32 min_send_delay; /* The minimum send delay (ms) */
} logical;
} proto;
} WalRcvStreamOptions;
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index 9ab23e1c4a..cc3a234eba 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -150,7 +150,8 @@ typedef enum
WAIT_EVENT_REGISTER_SYNC_REQUEST,
WAIT_EVENT_SPIN_DELAY,
WAIT_EVENT_VACUUM_DELAY,
- WAIT_EVENT_VACUUM_TRUNCATE
+ WAIT_EVENT_VACUUM_TRUNCATE,
+ WAIT_EVENT_WALSENDER_SEND_DELAY
} WaitEventTimeout;
/* ----------
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 3f99b14394..121c335f48 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -143,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -163,10 +163,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -175,10 +175,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -210,10 +210,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -247,19 +247,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -271,27 +271,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -306,10 +306,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -324,10 +324,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -363,10 +363,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -375,10 +375,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,10 +388,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -404,20 +404,61 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_send_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = foo);
+ERROR: invalid value for parameter "min_send_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_send_delay" (0 .. 2147483647)
+-- fail - specifying streaming = parallel with positive min_send_delay is not
+-- supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_send_delay = 123);
+ERROR: min_send_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_send_delay value without units is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexit | 0/0
+(1 row)
+
+-- success -- min_send_delay value with units other than ms is converted to ms
+-- and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min send delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+----------------+--------------------+----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexit | 0/0
(1 row)
+-- fail - alter subscription with streaming = parallel should fail when
+-- min_send_delay is greater than zero
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot set parallel streaming mode for subscription with min_send_delay
+-- fail - alter subscription with min_send_delay should fail when
+-- streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = 123);
+ERROR: cannot set min_send_delay for subscription in parallel streaming mode
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7281f5fee2..b2d49fcfcb 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -286,6 +286,35 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_send_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = -1);
+
+-- fail - specifying streaming = parallel with positive min_send_delay is not
+-- supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_send_delay = 123);
+
+-- success -- min_send_delay value without units is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexit' PUBLICATION testpub WITH (connect = false, min_send_delay = 123);
+\dRs+
+
+-- success -- min_send_delay value with units other than ms is converted to ms
+-- and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = '1 d');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when
+-- min_send_delay is greater than zero
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+
+-- fail - alter subscription with min_send_delay should fail when
+-- streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_send_delay = 123);
+
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index 91aa068c95..c4e6f11b07 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -515,6 +515,33 @@ $node_publisher->poll_query_until('postgres',
or die
"Timed out while waiting for apply to restart after renaming SUBSCRIPTION";
+# Test time-delayed logical replication
+#
+# If the subscription sets min_send_delay parameter, the walsender will delay
+# the transaction for min_send_delay milliseconds. We verify this by looking
+# at the time difference between a) when tuples are inserted on the publisher,
+# and b) when those changes are replicated on the subscriber. Even on slow
+# machines, this strategy will give predictable behavior.
+
+# Set min_send_delay parameter to 3 seconds
+my $delay = 3;
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_renamed SET (min_send_delay = '${delay}s')");
+
+# Before doing the insertion, get the current timestamp that will be
+# used as a comparison base.
+my $publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_ins VALUES (generate_series(1101, 1120))");
+
+# The publisher waits for the replication to complete
+$node_publisher->wait_for_catchup('tap_sub_renamed');
+
+# This test is successful only if at least the configured delay has elapsed.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies changes only after replication delay"
+);
+
# check all the cleanup
$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_renamed");
--
2.27.0
Dear Vignesh,
Thank you for reviewing! The new version is available at [1].
1) Currently we have added the delay during the decode of commit; while decoding the commit, the walsender process will stop decoding any further transactions until the delay is completed. There might be a possibility that a lot of transactions will happen in parallel and there will be a lot of transactions to be decoded after the delay is completed. Would it be possible to decode WAL as it is generated instead of staying idle in the meantime? I'm not sure if this is feasible, just throwing out my thought to see if it might be possible.
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -676,6 +676,15 @@ DecodeCommit(LogicalDecodingContext *ctx, XLogRecordBuffer *buf,
buf->origptr, buf->endptr);
}
+ /*
+ * Delay sending the changes if required. For streaming transactions,
+ * this means a delay in sending the last stream but that is OK
+ * because on the downstream the changes will be applied only after
+ * receiving the last stream.
+ */
+ if (ctx->min_send_delay > 0 && ctx->delay_send)
+ ctx->delay_send(ctx, xid, commit_time);
+
I see your point, but I think that extension can be done in a future version if needed,
because it would require changing several parts and would introduce some complexity.
If we have decoded changes but do not want to send them yet, we must keep them in
memory and skip sending. To do that we must add a new data structure,
add another path in DecodeCommit and DecodePrepare that does not send changes,
and add code in WalSndLoop() and other functions to send the pending changes. Even these may not be sufficient.
For now I think the above is not needed; we can revisit it later if the overhead of
decoding turns out to be large and it must be done very efficiently.
2) Generally, single-line comments are not terminated by "."; the comment "/* Sleep until appropriate time. */" can be changed accordingly:
+
+ /* Sleep until appropriate time. */
+ timeout_sleeptime_ms = WalSndComputeSleeptime(now);
+
+ elog(DEBUG2, "time-delayed replication for txid %u, delay_time = %d ms, remaining wait time: %ld ms",
+ xid, (int) ctx->min_send_delay, remaining_wait_time_ms);
+
+ /* Sleep until we get reply from worker or we time out */
+ WalSndWait(WL_SOCKET_READABLE,
Right, removed.
3) In some places we say min_send_delay and in other places we say time-delayed replication; we can keep the comments consistent by using similar wording.
+-- fail - specifying streaming = parallel with time-delayed replication is not
+-- supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_send_delay = 123);
+-- fail - alter subscription with streaming = parallel should fail when
+-- time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+-- fail - alter subscription with min_send_delay should fail when
+-- streaming = parallel is set
"time-delayed replication" was removed.
4) Since the value is stored in ms, we need not add ms again as the
default value is in ms:
@@ -4686,6 +4694,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s",
fmtId(subinfo->subsynccommit));
+ if (subinfo->subminsenddelay > 0)
+ appendPQExpBuffer(query, ", min_send_delay = '%d ms'", subinfo->subminsenddelay);
Right, fixed.
5) We can use the new error reporting style:
5.a) The brackets around errcode can be removed:
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_send_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
5.b) Similarly here too:
+ if (result < 0 || result > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_send_delay",
+ 0, PG_INT32_MAX)));
5.c) Similarly here too:
+ delay_val = strtoul(strVal(defel->arg), &endptr, 10);
+ if (errno != 0 || *endptr != '\0')
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid min_send_delay")));
5.d) Similarly here too:
+ if (delay_val > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("min_send_delay \"%s\" out of range",
+ strVal(defel->arg))));
All of them are fixed.
6) This can be changed to a single line comment:
+ /*
+ * Parse given string as parameter which has millisecond unit
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
Changed. I grepped for ereport() in the patch and, as far as I could see, there were no other similar cases.
7) In the test message we specifically mention "for non-streaming transaction". Is the behavior different for streaming transactions? If not, we can change the message accordingly.
+# The publisher waits for the replication to complete
+$node_publisher->wait_for_catchup('tap_sub_renamed');
+
+# This test is successful only if at least the configured delay has elapsed.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies changes only after replication delay for non-streaming transaction"
+);
There is no difference; both normal and streamed transactions can be delayed.
So I removed it.
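As an illustration of the unit handling discussed in points 4) to 6), here is a minimal Python sketch of how a min_send_delay string could be normalized to milliseconds. It is not the patch's C code: the helper name min_send_delay_to_ms is hypothetical, only the units ms, s, min, h and d are modeled (an assumption made for brevity), and the range limit is taken from the error message in the regression test output.

```python
import re

# Assumption: only these time units are modeled; a bare number means ms,
# matching the regression test comment "without units is taken as milliseconds".
_UNIT_TO_MS = {"ms": 1, "s": 1000, "min": 60_000, "h": 3_600_000, "d": 86_400_000}
_MAX_DELAY_MS = 2_147_483_647  # upper bound quoted in the test's error message


def min_send_delay_to_ms(value: str) -> int:
    """Hypothetical helper: normalize a min_send_delay string to milliseconds."""
    match = re.fullmatch(r"\s*(-?\d+)\s*([a-z]*)\s*", value)
    if match is None or match.group(2) not in ("", *_UNIT_TO_MS):
        raise ValueError(f'invalid value for parameter "min_send_delay": "{value}"')
    number = int(match.group(1))
    unit = match.group(2) or "ms"
    result = number * _UNIT_TO_MS[unit]
    # Reject negative delays and values that do not fit in a 32-bit integer.
    if not 0 <= result <= _MAX_DELAY_MS:
        raise ValueError(
            f"{result} ms is outside the valid range (0 .. {_MAX_DELAY_MS})"
        )
    return result


if __name__ == "__main__":
    for text in ("123", "1 d", "3s"):
        print(text, "->", min_send_delay_to_ms(text))
```

The inputs mirror the cases in the regression test: a bare 123, '1 d', and the '3s' style used by the TAP test, while foo and -1 are rejected.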
[1]: /messages/by-id/TYAPR01MB586606CF3B585B6F8BE13A9CF5B29@TYAPR01MB5866.jpnprd01.prod.outlook.com
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Dear Amit,
Fair point but I think the current comment should explain why we are
doing something different here. How about extending the existing
comments to something like: "If we've requested to shut down, exit the
process. This is unlike handling at other places where we allow
complete WAL to be sent before shutdown because we don't want the
delayed transactions to be applied downstream. This will allow one to
use the data from downstream in case of some unwanted operations on
the current node."
Thank you for the suggestion. I think it is better, so I changed it.
Please see the new patch at [1].
[1]: /messages/by-id/TYAPR01MB586606CF3B585B6F8BE13A9CF5B29@TYAPR01MB5866.jpnprd01.prod.outlook.com
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Yeah, min_send_delay and max_slot_wal_keep_size should be easily tunable
because the appropriate value depends on the environment and workload.
However, pg_replication_slots cannot show the exact amount of WAL,
so it may not be suitable for tuning. I think the user can compare
pg_replication_slots.restart_lsn (or pg_stat_replication.sent_lsn) with
pg_current_wal_lsn() to calculate the amount of WAL being delayed, like:
```
postgres=# select pg_current_wal_lsn() - pg_replication_slots.restart_lsn as delayed from pg_replication_slots;
delayed
------------
1689153760
(1 row)
```
I think it would be better to tell about this in the docs along with
the 'min_send_delay' description. The key point is whether this would
be an acceptable trade-off for users who want to use this feature. I
think it can harm only if users use this without understanding the
corresponding trade-off. As we kept the default to no delay, it is
expected from users using this have an understanding of the trade-off.
Yes, the trade-off should be emphasized.
Based on that understanding, I added it to the docs in the new version of the patch.
Please see [1].
[1]: /messages/by-id/TYAPR01MB586606CF3B585B6F8BE13A9CF5B29@TYAPR01MB5866.jpnprd01.prod.outlook.com
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
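For readers who want to reproduce the subtraction in the query above outside the server, here is a small Python sketch. It relies only on the textual pg_lsn format (two hexadecimal numbers separated by a slash, the first being the high 32 bits); the function names and the sample LSN values are hypothetical.

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert a textual LSN such as '0/64AE70E0' to an absolute byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wal_lag_bytes(current_lsn: str, restart_lsn: str) -> int:
    """Bytes of WAL between a slot's restart_lsn and the current write position."""
    return lsn_to_bytes(current_lsn) - lsn_to_bytes(restart_lsn)


if __name__ == "__main__":
    # Hypothetical sample values, not taken from a real server.
    print(wal_lag_bytes("1/000000A0", "0/FFFFFF00"))
```

This mirrors what `pg_current_wal_lsn() - restart_lsn` computes in SQL, so it can be useful when LSNs have only been captured as text, for example in monitoring logs.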
On Thu, Mar 2, 2023 at 1:07 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Mar 2, 2023 at 7:38 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Mar 1, 2023 at 6:21 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Apart from a bad-use case example I mentioned, in general, piling up
WAL files due to the replication slot has many bad effects on the
system. I'm concerned that the side effect of this feature (at least
of the current design) is too huge compared to the benefit, and afraid
that users might end up using this feature without understanding the
side effect well. It might be okay if we thoroughly document it but
I'm not sure.
One approach is to forcibly change max_slot_wal_keep_size when min_send_delay
is set. But that may lead to the slot being disabled, because WAL needed by the time-delayed
replication may also be removed. We cannot set just the right value ourselves because
it depends heavily on min_send_delay and the workload.
How about throwing a WARNING when min_send_delay > 0 but
max_slot_wal_keep_size < 0? Unlike in the previous version, the subscription
parameter min_send_delay will be sent to the publisher. Therefore, we can compare
min_send_delay and max_slot_wal_keep_size when the publisher receives the parameter.
Since max_slot_wal_keep_size can be changed by reloading the config
file, would each walsender also warn at that time?
I think Kuroda-San wants to emit a WARNING at the time of CREATE
SUBSCRIPTION. But it won't be possible to emit a WARNING at the time
of ALTER SUBSCRIPTION. Also, as you say if the user later changes the
value of max_slot_wal_keep_size, then even if we issue LOG/WARNING in
walsender, it may go unnoticed. If we really want to give WARNING for
this then we can probably give it as soon as user has set non-default
value of min_send_delay to indicate that this can lead to retaining
WAL on the publisher and they should consider setting
max_slot_wal_keep_size.
Having said that, I think users can always tune max_slot_wal_keep_size
and min_send_delay (as none of these requires restart) if they see any
indication of unexpected WAL size growth. There could be multiple ways
to check it but I think one can refer wal_status in
pg_replication_slots, the extended value can be an indicator of this.
Not sure it's
helpful. I think it's a legitimate use case to set min_send_delay > 0
and max_slot_wal_keep_size = -1, and users might not even notice the
WARNING message.
I think it would be better to tell about this in the docs along with
the 'min_send_delay' description. The key point is whether this would
be an acceptable trade-off for users who want to use this feature. I
think it can harm only if users use this without understanding the
corresponding trade-off. As we kept the default to no delay, it is
expected from users using this have an understanding of the trade-off.
I imagine that a typical use case would be to set min_send_delay to
several hours to days. I'm concerned that it could not be an
acceptable trade-off for many users that the system cannot collect any
garbage during that.
I think we can have the apply process write the decoded changes
somewhere on the disk (as not temporary files) and return the flush
LSN so that the apply worker can apply them later and the publisher
can advance slot's LSN. The feature would be more complex but from the
user perspective it would be better.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Hi all,
Thanks for working on this.
I imagine that a typical use case would be to set min_send_delay to
several hours to days. I'm concerned that it could not be an
acceptable trade-off for many users that the system cannot collect any
garbage during that.
I'm not too worried about the WAL recycling; that mostly looks like
a documentation issue to me. It is not a problem that many PG users
are unfamiliar with. Also, even if creating or altering a subscription
is one day relaxed so that a regular user can do it, one option could be to require
a superuser to change this setting. That would alleviate my concern
regarding WAL recycling. A superuser should be able to monitor the system
and adjust the settings/hardware accordingly.
However, VACUUM being blocked by replication with a configuration
change on the subscription sounds more concerning to me. Blocking
VACUUM for hours could quickly escalate to performance problems.
On the other hand, we already have a similar problem with
recovery_min_apply_delay combined with hot_standby_feedback [1].
So, that probably is an acceptable trade-off for the pgsql-hackers.
If you use this feature, you should be even more careful.
I think we can have the apply process write the decoded changes
somewhere on the disk (as not temporary files) and return the flush
LSN so that the apply worker can apply them later and the publisher
can advance slot's LSN. The feature would be more complex but from the
user perspective it would be better.
Yes, this might probably be one of the ideal solutions to the problem at hand.
But my current guess is that it'd be a non-trivial change with different
concurrency/failure scenarios. So, I'm not sure if that is going to be a
realistic patch to pursue.
Thanks,
Onder KALACI
[1]: PostgreSQL: Documentation: 15: 20.6. Replication <https://www.postgresql.org/docs/current/runtime-config-replication.html>
On Mon, Mar 06, 2023 at 07:27:59PM +0300, Önder Kalacı wrote:
On the other hand, we already have a similar problem with
recovery_min_apply_delay combined with hot_standby_feedback [1].
So, that probably is an acceptable trade-off for the pgsql-hackers.
If you use this feature, you should be even more careful.
Yes, but it's possible to turn off hot_standby_feedback so that you don't
incur bloat on the primary. And you don't need to store hours or days of
WAL on the primary. I'm very late to this thread, but IIUC you cannot
avoid blocking VACUUM with the proposed feature. IMO the current set of
trade-offs (e.g., unavoidable bloat and WAL buildup) would make this
feature virtually unusable for a lot of workloads, so it's probably worth
exploring an alternative approach. In any case, we probably shouldn't rush
this into v16 in its current form.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Wed, Mar 8, 2023 at 3:30 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Mon, Mar 06, 2023 at 07:27:59PM +0300, Önder Kalacı wrote:
On the other hand, we already have a similar problem with
recovery_min_apply_delay combined with hot_standby_feedback [1].
So, that probably is an acceptable trade-off for the pgsql-hackers.
If you use this feature, you should be even more careful.
Yes, but it's possible to turn off hot_standby_feedback so that you don't
incur bloat on the primary. And you don't need to store hours or days of
WAL on the primary.
Right. This side effect belongs to the combination of
recovery_min_apply_delay and hot_standby_feedback/replication slot.
recovery_min_apply_delay itself can be used even without this side
effect if we accept other trade-offs. When it comes to this
time-delayed logical replication feature, there is no choice to avoid
the side effect for users who want to use this feature.
I'm very late to this thread, but IIUC you cannot
avoid blocking VACUUM with the proposed feature.
Right.
IMO the current set of
trade-offs (e.g., unavoidable bloat and WAL buildup) would make this
feature virtually unusable for a lot of workloads, so it's probably worth
exploring an alternative approach.
It might require more engineering effort for alternative approaches
such as one I proposed but the feature could become better from the
user perspective. I also think it would be worth exploring it if we've
not yet.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Wed, Mar 8, 2023 at 9:20 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Mar 8, 2023 at 3:30 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
IMO the current set of
trade-offs (e.g., unavoidable bloat and WAL buildup) would make this
feature virtually unusable for a lot of workloads, so it's probably worth
exploring an alternative approach.
It might require more engineering effort for alternative approaches
such as one I proposed but the feature could become better from the
user perspective. I also think it would be worth exploring it if we've
not yet.
Fair enough. I think as of now most people think that we should
consider alternative approaches for this feature. The two ideas at a
high level are that the apply worker itself first flushes the decoded
WAL (maybe only when time-delay is configured) or have a separate
walreceiver process as we have for standby. I think we need to analyze
the pros and cons of each of those approaches and see if they would be
useful even for other things on the apply side.
--
With Regards,
Amit Kapila.
At Thu, 9 Mar 2023 11:00:46 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
On Wed, Mar 8, 2023 at 9:20 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Mar 8, 2023 at 3:30 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
IMO the current set of
trade-offs (e.g., unavoidable bloat and WAL buildup) would make this
feature virtually unusable for a lot of workloads, so it's probably worth
exploring an alternative approach.
It might require more engineering effort for alternative approaches
such as one I proposed but the feature could become better from the
user perspective. I also think it would be worth exploring it if we've
not yet.
Fair enough. I think as of now most people think that we should
consider alternative approaches for this feature. The two ideas at a
If we can notify subscriber of the transaction start time, will that
solve the current problem? If not, or if it is not possible, +1 to
look for other solutions.
high level are that the apply worker itself first flushes the decoded
WAL (maybe only when time-delay is configured) or have a separate
walreceiver process as we have for standby. I think we need to analyze
the pros and cons of each of those approaches and see if they would be
useful even for other things on the apply side.
My understanding of the requirements here is that the publisher should
not hold changes, the subscriber should not hold data reads, and all
transactions including two-phase ones should be applied at once upon
committing. Both sides need to respond to the requests from the other
side. We expect apply-delay of several hours or more. My thoughts
considering the requirements are as follows:
If we expect delays of several hours or more, I don't think it's
feasible to stack received changes in the process memory. So, if
apply-delay is in effect, I think it would be better to process
transactions through files regardless of process configuration.
I'm not sure whether we should have a separate process for protocol
processing. On one hand, it would simplify the protocol processing
part, but on the other hand, changes would always have to be applied
through files. If we plan to integrate the paths with and without
apply-delay by the file-passing method, this might work. If we want to
maintain the current approach when not applying apply-delay, I think
we would have to implement it in a single process, but I feel the
protocol processing part could become complicated.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Thu, Mar 9, 2023 at 2:56 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Thu, 9 Mar 2023 11:00:46 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
On Wed, Mar 8, 2023 at 9:20 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Mar 8, 2023 at 3:30 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
IMO the current set of
trade-offs (e.g., unavoidable bloat and WAL buildup) would make this
feature virtually unusable for a lot of workloads, so it's probably worth
exploring an alternative approach.
It might require more engineering effort for alternative approaches
such as one I proposed but the feature could become better from the
user perspective. I also think it would be worth exploring it if we've
not yet.
Fair enough. I think as of now most people think that we should
consider alternative approaches for this feature. The two ideas at a
If we can notify subscriber of the transaction start time, will that
solve the current problem?
I don't think that will solve the current problem because the problem
is related to confirming back the flush LSN (commit LSN) to the
publisher which we do only after we commit the delayed transaction.
Due to this, we are not able to advance WAL(restart_lsn)/XMIN on the
publisher which causes an accumulation of WAL and does not allow the
vacuum to remove deleted rows. Do you have something else in mind
which makes you think that it can solve the problem?
--
With Regards,
Amit Kapila.
Dear hackers,
Based on the discussion in which Sawada-san pointed out [1] that the current
approach to time-delayed logical replication prevents WAL from being recycled,
I'm planning to close the CF entry for now. This thread, or a forked one, will
be registered again once we decide on an alternative approach. Thank you very
much for taking the time to join our discussions.
I think that, to solve the issue, logical changes must first be flushed on the
subscriber, and the worker should apply them only after the specified time has
passed. The straightforward approach is to follow physical replication and
introduce a walreceiver process on the subscriber. More research is needed, but
there are at least some benefits:
* The publisher can be shut down even if the apply worker is stuck. The apply
worker is more likely to get stuck than in physical replication, so this may
improve robustness. For more detail, please see another thread [2].
* With synchronous_commit = 'remote_write', the publisher can COMMIT faster,
because the walreceiver flushes changes immediately and replies soon.
Even if the time delay is enabled, the wait time will not increase.
* It may serve as infrastructure for parallel apply of non-streaming transactions.
The basic designs are similar: one process receives changes and others apply them.
I searched old discussions [3] and wiki pages, and found that the initial prototype
had a logical walreceiver, but in a later version [4] the apply worker received
changes directly. I could not find the reason for that decision, but I suspect it was
for the following reasons. Could you please tell me the actual background?
* Performance bottlenecks. If the walreceiver flushes changes and the worker applies
them, fsync() is called for every reception.
* Complexity. In this design the walreceiver and the apply worker must share the
progress of flush/apply, and crash recovery needs more consideration. The related
discussion can be found in [5].
* Extensibility. In-core logical replication should serve as a sample for external
projects. The apply worker is just a background worker that can be launched from an
extension, so it is easy to understand. If it depended deeply on the walreceiver,
other projects could not follow it.
[1]: /messages/by-id/CAD21AoAeG2+RsUYD9+mEwr8-rrt8R1bqpe56T2D=euO-Qs-GAg@mail.gmail.com
[2]: /messages/by-id/TYAPR01MB586668E50FC2447AD7F92491F5E89@TYAPR01MB5866.jpnprd01.prod.outlook.com
[3]: /messages/by-id/201206131327.24092.andres@2ndquadrant.com
[4]: /messages/by-id/37e19ad5-f667-2fe2-b95b-bba69c5b6c68@2ndquadrant.com
[5]: /messages/by-id/1339586927-13156-12-git-send-email-andres@2ndquadrant.com
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Hi hackers,
I have made a rough prototype, based on the v30 patch, that serializes changes to
a permanent file and applies them after the delay has elapsed. I think the 2PC and
restore mechanisms need more analysis, but I can share the code for discussion.
What do you think?
## Interfaces
Not changed from old versions. The subscription parameter "min_apply_delay" is
used to enable the time-delayed logical replication.
## Advantages
Two big problems are solved.
* The apply worker can respond to the walsender's keepalive messages while
delaying the apply, because the process no longer sleeps.
* The publisher can recycle WAL even if a transaction related to that WAL has not
been applied yet, because the apply worker flushes all the changes to a file
and reports that the WAL has been flushed.
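The second advantage can be illustrated with a small sketch. It is not taken from the patch: `WorkerProgress` and `feedback_flush_lsn()` are hypothetical names, and the sketch only assumes that the worker tracks the end LSN of the last applied transaction and the end LSN of the last transaction durably written to a delay file (the patch keeps something similar in `last_flushed`).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t Lsn;           /* stand-in for XLogRecPtr (assumption) */

/* Hypothetical progress state of one apply worker. */
typedef struct WorkerProgress
{
    Lsn applied_end_lsn;        /* end LSN of the last applied transaction */
    Lsn spooled_end_lsn;        /* end LSN of the last transaction fsync'ed to a delay file */
} WorkerProgress;

/*
 * Choose the flush position to report to the publisher.
 *
 * Without spooling, only applied transactions can be confirmed, so the
 * reported position lags behind by the whole delay. With spooling, a
 * transaction that is durable in a delay file can be confirmed right away.
 */
Lsn
feedback_flush_lsn(const WorkerProgress *p, bool spool_to_file)
{
    assert(p != NULL);

    if (spool_to_file && p->spooled_end_lsn > p->applied_end_lsn)
        return p->spooled_end_lsn;

    return p->applied_end_lsn;
}
```

With `applied_end_lsn = 100` and `spooled_end_lsn = 500`, the sleep-based approach would confirm only 100, while the file-based approach could confirm 500 and let the publisher advance restart_lsn accordingly.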
## Disadvantages
Code complexity.
## Basic design
The basic idea is quite simple: create a new file when the apply worker receives
a BEGIN message, write the received changes to it, and flush them when the COMMIT
message arrives. On every iteration of the main loop, the commit time of each delayed
transaction is checked, and the transaction is applied once the elapsed time exceeds
min_apply_delay.
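The per-loop check described above can be sketched as follows. This is a minimal illustration, not the patch's code: the function names are hypothetical, timestamps are assumed to be plain milliseconds, and a transaction is treated as ready as soon as the current time reaches commit time plus the delay.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return how many milliseconds are still left before a delayed transaction
 * may be applied, or 0 if it can be applied now.
 *
 * commit_time_ms is the commit timestamp taken on the publisher and now_ms
 * is the current time on the subscriber, so clock skew between the two
 * servers shifts the result directly.
 */
int64_t
remaining_delay_ms(int64_t commit_time_ms, int64_t now_ms,
                   int32_t min_apply_delay_ms)
{
    int64_t apply_at = commit_time_ms + (int64_t) min_apply_delay_ms;

    assert(min_apply_delay_ms >= 0);

    return (now_ms >= apply_at) ? 0 : apply_at - now_ms;
}

/* True once the requested delay has fully elapsed. */
bool
delay_elapsed(int64_t commit_time_ms, int64_t now_ms,
              int32_t min_apply_delay_ms)
{
    return remaining_delay_ms(commit_time_ms, now_ms, min_apply_delay_ms) == 0;
}
```

A main loop could use the remaining time as the upper bound for its next wait, so that it keeps answering keepalives and wakes up in time to apply the oldest delayed transaction.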
To handle the files, APIs that use plain kernel FDs are used. This approach is
similar to the physical walreceiver process. Unlike the physical one, the worker
does not flush after every message; it flushes only at the end of the transaction.
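The write path described above might look roughly like this outside of PostgreSQL. It is a sketch using plain POSIX calls: the `spool_*` names are hypothetical, the on-disk layout (a native-endian 32-bit length followed by the payload) is an assumption modelled on how the patch reads records back, and error reporting is reduced to return codes.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Create the per-transaction file; fails if it already exists. */
int
spool_create(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_EXCL | O_APPEND, 0600);
}

/* Append one change as [int32 length][payload]. No flush happens here. */
int
spool_write_change(int fd, const void *data, int32_t len)
{
    assert(len > 0);

    if (write(fd, &len, sizeof(len)) != (ssize_t) sizeof(len))
        return -1;
    if (write(fd, data, (size_t) len) != (ssize_t) len)
        return -1;
    return 0;
}

/* At COMMIT: make everything durable with a single fsync, then close. */
int
spool_commit(int fd)
{
    int rc = fsync(fd);

    close(fd);
    return rc;
}

/* Read the next record; returns its length, 0 at end of file, -1 on error. */
int32_t
spool_read_change(int fd, void *buf, int32_t bufsize)
{
    int32_t len = 0;
    ssize_t n = read(fd, &len, sizeof(len));

    if (n == 0)
        return 0;
    if (n != (ssize_t) sizeof(len) || len <= 0 || len > bufsize)
        return -1;
    if (read(fd, buf, (size_t) len) != (ssize_t) len)
        return -1;
    return len;
}
```

Flushing once per transaction rather than once per message keeps the number of fsync calls proportional to the number of commits, which is the difference from the physical walreceiver noted above.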
### For 2PC
The delay starts when COMMIT PREPARED arrives. But to avoid the
long-lock-holding issue, the prepared transaction is just written to a file
without being applied.
When BEGIN PREPARE is received, the worker creates a file and starts writing
changes, the same as for normal transactions. When the PREPARE message arrives,
the worker just writes the message to the file, flushes it, and closes it. This means
that no transactions are prepared on the subscriber. When COMMIT PREPARED is received,
the worker opens the file again and writes the message. After that, the transaction
is treated the same as a normal committed transaction.
### For streamed transaction
Nothing special is done when a streamed transaction arrives. When it is committed
or prepared, all the changes are read and written to a permanent file. To read and
write the changes, apply_spooled_changes() is used, which means the basic workflow
is unchanged.
### Restore from files
To check the time elapsed since the commit, the commit_time of every delayed
transaction must be kept in memory. Normally it can be stored when the worker handles
the COMMIT message, but restarts need special treatment.
When an apply worker receives a COMMIT/PREPARE/COMMIT PREPARED message, it writes
the message, flushes it, and caches the commit_time. When the worker restarts, it opens
the files, checks the final message (by seeking back some bytes from the end of
the file), and then caches the recorded commit_time.
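The restart step, seeking back from the end of the file to recover the commit information, can be sketched like this. The `CommitTrailer` struct and both functions are hypothetical, and the fixed-size trailer is an assumption made for illustration; the patch instead estimates the size of the serialized commit message.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical fixed-size record written as the last bytes of a delay file. */
typedef struct CommitTrailer
{
    uint64_t commit_lsn;
    uint64_t end_lsn;
    int64_t  commit_time;       /* publisher commit timestamp */
} CommitTrailer;

/* Append the trailer at the current end of the file. */
int
append_commit_trailer(int fd, const CommitTrailer *t)
{
    if (lseek(fd, 0, SEEK_END) < 0)
        return -1;
    if (write(fd, t, sizeof(*t)) != (ssize_t) sizeof(*t))
        return -1;
    return fsync(fd);
}

/*
 * After a restart, recover the trailer by seeking backwards from the end.
 * The offset is cast to off_t before negation so that the arithmetic is
 * done on a signed type. Returns 0 on success and -1 if the file is too
 * short or cannot be read.
 */
int
read_commit_trailer(int fd, CommitTrailer *out)
{
    off_t size = lseek(fd, 0, SEEK_END);

    if (size < (off_t) sizeof(CommitTrailer))
        return -1;
    if (lseek(fd, -(off_t) sizeof(CommitTrailer), SEEK_END) < 0)
        return -1;
    if (read(fd, out, sizeof(*out)) != (ssize_t) sizeof(*out))
        return -1;
    return 0;
}
```

A file that is too short to hold a trailer is reported as an error here; in the design above, such a file could belong to a transaction that has not reached COMMIT or that is a prepared transaction still waiting for COMMIT PREPARED.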
## Restrictions
* The combination with ALTER SUBSCRIPTION .. SKIP LSN is not considered.
Thanks to Osumi-san for helping with the implementation.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Attachments:
0001-WIP-Time-delayed-logical-replication-by-serializing-.patch (application/octet-stream)
From d267abea1d7c7fbdaf80bc1b14afa43a4878b262 Mon Sep 17 00:00:00 2001
From: Takamichi Osumi <osumi.takamichi@fujitsu.com>
Date: Fri, 10 Feb 2023 10:43:26 +0000
Subject: [PATCH] (WIP) Time-delayed logical replication by serializing changes
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_apply_delay'.
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction apply for min_apply_delay
milliseconds.
The delay is implemented by serializing changes into a file. The file is
created when the worker receives a BEGIN message. The worker writes received
changes and flushes them at COMMIT. On every iteration of the main loop, the
commit time of each delayed transaction is checked, and the transaction is
applied from the file once the elapsed time exceeds min_apply_delay. The commit
time is stored in memory when the transaction is committed or when the worker restarts.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The combination of parallel streaming mode and min_apply_delay is not
allowed. This is because in parallel streaming mode, we start applying
the transaction stream as soon as the first change arrives without
knowing the transaction's prepare/commit time. This means we cannot
calculate the underlying network/decoding lag between publisher and
subscriber, and so always waiting for the full 'min_apply_delay' period
might include unnecessary delay.
The other possibility was to apply the delay at the end of the parallel
apply transaction but that would cause issues related to resource
bloat and locks being held for a long time.
Currently the combination of skip transaction feature and min_apply_delay
does not work well.
Earlier versions were written by Euler Taveira, Takamichi Osumi, and Kuroda Hayato
Author: Kuroda Hayato, Takamichi Osumi
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/glossary.sgml | 15 +
doc/src/sgml/logical-replication.sgml | 6 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 47 +-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 122 ++-
src/backend/replication/logical/worker.c | 1128 ++++++++++++++++++--
src/bin/pg_dump/pg_dump.c | 15 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/test/regress/expected/subscription.out | 181 ++--
src/test/regress/sql/subscription.sql | 24 +
src/test/subscription/t/001_rep_changes.pl | 31 +
src/test/subscription/t/032_tmp.pl | 141 +++
18 files changed, 1571 insertions(+), 178 deletions(-)
create mode 100644 src/test/subscription/t/032_tmp.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 746baf5053..937935d83b 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7891,6 +7891,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum delay, in milliseconds, for applying changes
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 7c01a541fe..9ede9d05f6 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1729,6 +1729,21 @@
</glossdef>
</glossentry>
+ <glossentry id="glossary-time-delayed-replication">
+ <glossterm>Time-delayed replication</glossterm>
+ <glossdef>
+ <para>
+ Replication setup that delays the application of changes by a specified
+ minimum time-delay period.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="guc-recovery-min-apply-delay"/> for physical replication
+ and <xref linkend="sql-createsubscription"/> for logical replication.
+ </para>
+ </glossdef>
+ </glossentry>
+
<glossentry id="glossary-toast">
<glossterm>TOAST</glossterm>
<glossdef>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 6b0e300adc..ebf19da406 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -254,6 +254,12 @@
target table.
</para>
+ <para>
+ A subscription can delay the application of changes by specifying the
+ <literal>min_apply_delay</literal> subscription parameter. See
+ <xref linkend="sql-createsubscription"/> for details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 964fcbb8ff..8b7eb28e54 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -213,8 +213,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
are <literal>slot_name</literal>,
<literal>synchronous_commit</literal>,
<literal>binary</literal>, <literal>streaming</literal>,
- <literal>disable_on_error</literal>, and
- <literal>origin</literal>.
+ <literal>disable_on_error</literal>,
+ <literal>origin</literal>, and
+ <literal>min_apply_delay</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 51c45f17c7..5146aa2c88 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -349,7 +349,47 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry>
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. This
+ parameter allows the user to delay the application of changes by a
+ given time period. This is done by first writing all the changes to a
+ file and applying its contents after the delay has elapsed. If the value
+ is specified without units, it is taken as milliseconds. The default
+ is zero (no delay). See <xref linkend="config-setting-names-values"/>
+ for details on the available valid time units.
+ </para>
+ <para>
+ Any delay becomes effective only after all initial table
+ synchronization has finished and occurs before each transaction starts
+ to get applied on the subscriber. The delay is calculated as the
+ difference between the WAL timestamp as written on the publisher and
+ the current time on the subscriber. Any overhead of time spent in
+ logical decoding and in transferring the transaction may reduce the
+ actual wait time. Even if the overhead already exceeds the requested
+ <literal>min_apply_delay</literal> value, all the changes are written
+ into file and applied immediately. If the system clocks on publisher
+ and subscriber are not synchronized, changes may be applied
+ earlier than expected, but this is not a major issue because this
+ parameter is typically much larger than the time deviations between
+ servers.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication means there is a much longer time between
+ making a change on the publisher, and that change being committed
+ on the subscriber. This can impact the performance of synchronous
+ replication. See <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -420,6 +460,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_apply_delay</literal> parameter is not allowed when
+ streaming in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a56ae311c3..e19e5cbca2 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -66,6 +66,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->skiplsn = subform->subskiplsn;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
+ sub->minapplydelay = subform->subminapplydelay;
sub->enabled = subform->subenabled;
sub->binary = subform->subbinary;
sub->stream = subform->substream;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 34ca0e739f..7578b80c07 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1314,9 +1314,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 464db6d247..82e16fd0f9 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int32 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -100,7 +102,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int32 defGetMinApplyDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ opts->min_apply_delay = defGetMinApplyDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +417,32 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed. This is because in parallel streaming mode, we start applying
+ * the transaction stream as soon as the first change arrives without
+ * knowing the transaction's prepare/commit time. This means we cannot
+ * calculate the underlying network/decoding lag between publisher and
+ * subscriber, and so always waiting for the full 'min_apply_delay' period
+ * might include unnecessary delay.
+ *
+ * The other possibility was to apply the delay at the end of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0 &&
+ opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+
+ /*
+ * translator: the first %s is a string of the form "parameter > 0"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
}
/*
@@ -560,7 +599,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -625,6 +665,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int32GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1054,7 +1095,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1139,19 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
+ !IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY)
+ && sub->minapplydelay > 0)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set parallel streaming mode for subscription with %s",
+ "min_apply_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1165,26 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.min_apply_delay > 0 &&
+ !IsSet(opts.specified_opts, SUBOPT_STREAMING)
+ && sub->stream == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set %s for subscription in parallel streaming mode",
+ "min_apply_delay"));
+
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int32GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2195,3 +2269,45 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+/*
+ * Extract the min_apply_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_apply_delay
+ * accepts the same parameter format as recovery_min_apply_delay.
+ */
+static int32
+defGetMinApplyDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse given string as parameter which has millisecond unit
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_apply_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check both the lower boundary for the valid min_apply_delay range and
+ * the upper boundary as the safeguard for some platforms where INT_MAX is
+ * wider than int32 respectively. Although parse_int() has confirmed that
+ * the result is less than or equal to INT_MAX, the value will be stored
+ * in a catalog column of int32.
+ */
+ if (result < 0 || result > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_apply_delay",
+ 0, PG_INT32_MAX)));
+
+ return result;
+}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 10f9711972..82f7e789ec 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -431,6 +431,670 @@ static inline void reset_apply_error_context_info(void);
static TransApplyAction get_transaction_apply_action(TransactionId xid,
ParallelApplyWorkerInfo **winfo);
+static void begin_replication_step(void);
+static void end_replication_step(void);
+
+/* XXX definitions for time-delayed logical replication */
+#include "common/file_utils.h"
+
+/* DELAYEDDIR stores files that contains changes of delayed transactions. */
+#define DELAYEDDIR "pg_logical/delayed_txns"
+#define DELAYEDSUFFIX ".delayed_changes"
+
+/* List entry to map xid and commit time */
+typedef struct DelayedTxnListEntry
+{
+ TransactionId xid;
+ LogicalRepCommitData commit_data;
+} DelayedTxnListEntry;
+
+/*
+ * An entry is appended when we receive a commit message and time-delayed
+ * logical replication is requested. The entry is deleted after its contents
+ * are applied.
+ */
+static List *DelayedTxnList = NIL;
+
+/* fields valid only when time-delayed logical replication is requested */
+static bool in_delayed_transaction = false;
+
+static TransactionId delayed_xid = InvalidTransactionId;
+
+/*
+ * Store flushed lsn for time-delayed logical replication. This is used when
+ * we send a feedback message to the publisher.
+ */
+static XLogRecPtr last_flushed = InvalidXLogRecPtr;
+
+/*
+ * FIXME: a global file descriptor may not be sufficient. There is a possibility
+ * that non-streaming transactions arrive concurrently. In that case
+ * create_delay_file() for the second transaction will fail...
+ */
+static int delayed_fd = -1;
+
+/*
+ * Cache commit_data into the list
+ */
+static void
+cache_commit_data(LogicalRepCommitData *commit_data, TransactionId xid)
+{
+ MemoryContext old;
+ DelayedTxnListEntry *entry;
+
+ old = MemoryContextSwitchTo(ApplyContext);
+
+ entry = palloc0(sizeof(DelayedTxnListEntry));
+
+ /* Construct an entry and append it */
+ entry->xid = xid;
+ memcpy(&entry->commit_data, commit_data, sizeof(LogicalRepCommitData));
+ DelayedTxnList = lappend(DelayedTxnList, entry);
+
+ MemoryContextSwitchTo(old);
+}
+
+/*
+ * Flush given changes and close the file. This will be called at the end of the txn
+ */
+static void
+flush_delayed_changes(LogicalRepCommitData *commit_data)
+{
+ Assert(delayed_fd > 0);
+ Assert(TransactionIdIsValid(delayed_xid));
+
+ /* Cache given commit_data into the list */
+ cache_commit_data(commit_data, delayed_xid);
+
+ /* Flush previously written changes */
+ if (pg_fdatasync(delayed_fd) != 0)
+ {
+ int save_errno = errno;
+ close(delayed_fd);
+ errno = save_errno;
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not fsync file"));
+ }
+
+ /* Store flushed lsn */
+ last_flushed = commit_data->end_lsn;
+
+ /* Cleanup */
+ close(delayed_fd);
+ delayed_fd = -1;
+ delayed_xid = InvalidTransactionId;
+ in_delayed_transaction = false;
+}
+
+/*
+ * Get formal filename from subid and xid
+ */
+static void
+delay_file_name(char *path, Oid subid, TransactionId xid)
+{
+ snprintf(path, MAXPGPATH, DELAYEDDIR "/%u-%u" DELAYEDSUFFIX, subid, xid);
+}
+
+/*
+ * Extract subid and xid from the given pathname
+ */
+static void
+extract_info_from_delay_file(char *path, Oid *subid, TransactionId *xid)
+{
+ sscanf(path, DELAYEDDIR "/%u-%u", subid, xid);
+}
+
+/*
+ * Check whether the given transaction is delayed. This is done by checking the
+ * delay file.
+ */
+static bool
+is_given_transaction_delayed(Oid subid, TransactionId xid)
+{
+ struct stat st;
+ char path[MAXPGPATH];
+
+ delay_file_name(path, subid, xid);
+
+ return stat(path, &st) == 0;
+}
+
+/*
+ * Apply the delayed transaction. In the function a delayed file is opened and
+ * read. Apply worker applies written changes.
+ */
+static void
+apply_delayed_transaction(TransactionId xid, XLogRecPtr lsn)
+{
+ StringInfoData s2;
+ int nchanges;
+ char path[MAXPGPATH];
+ char *buffer = NULL;
+ MemoryContext oldcxt;
+ ResourceOwner oldowner;
+
+ /* Make sure we have an open transaction */
+ begin_replication_step();
+
+ /*
+ * Allocate file handle and memory required to process all the messages in
+ * TopTransactionContext to avoid them getting reset after each message is
+ * processed.
+ */
+ oldcxt = MemoryContextSwitchTo(TopTransactionContext);
+
+ /* Open the spool file for the committed transaction */
+ delay_file_name(path, MyLogicalRepWorker->subid, xid);
+ elog(DEBUG1, "replaying changes from file \"%s\"", path);
+
+ /*
+ * Make sure the file is owned by the toplevel transaction so that the
+ * file will not be accidentally closed when aborting a subtransaction.
+ */
+ oldowner = CurrentResourceOwner;
+ CurrentResourceOwner = TopTransactionResourceOwner;
+
+ /* Open the specified file */
+ delayed_fd = BasicOpenFile(path, O_RDONLY | PG_BINARY);
+
+ Assert(delayed_fd > 0);
+
+ CurrentResourceOwner = oldowner;
+
+ buffer = palloc(BLCKSZ);
+ initStringInfo(&s2);
+
+ MemoryContextSwitchTo(oldcxt);
+
+ remote_final_lsn = lsn;
+
+ /*
+ * Make sure the handle apply_dispatch methods are aware we're in a remote
+ * transaction.
+ */
+ in_remote_transaction = true;
+ pgstat_report_activity(STATE_RUNNING, NULL);
+
+ end_replication_step();
+
+ /*
+ * Read the entries one by one and pass them through the same logic as in
+ * apply_dispatch.
+ */
+ nchanges = 0;
+ while (true)
+ {
+ size_t nbytes;
+ int len;
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* read length of the on-disk record */
+ nbytes = read(delayed_fd, &len, sizeof(len));
+
+ /* have we reached end of the file? */
+ if (nbytes == 0)
+ break;
+
+ /* do we have a correct length? */
+ if (len <= 0)
+ elog(ERROR, "incorrect length %d in delayed transaction's changes file \"%s\"",
+ len, path);
+
+ /* make sure we have sufficiently large buffer */
+ buffer = repalloc(buffer, len);
+
+ /* and finally read the data into the buffer */
+ read(delayed_fd, buffer, len);
+
+ /* copy the buffer to the stringinfo and call apply_dispatch */
+ resetStringInfo(&s2);
+ appendBinaryStringInfo(&s2, buffer, len);
+
+ /* Ensure we are reading the data into our memory context. */
+ oldcxt = MemoryContextSwitchTo(ApplyMessageContext);
+
+ apply_dispatch(&s2);
+
+ MemoryContextReset(ApplyMessageContext);
+
+ MemoryContextSwitchTo(oldcxt);
+
+ nchanges++;
+
+ if (nchanges % 1000 == 0)
+ elog(DEBUG1, "replayed %d changes from file \"%s\"",
+ nchanges, path);
+ }
+
+ if (delayed_fd > 0)
+ {
+ close(delayed_fd);
+ delayed_fd = -1;
+ durable_unlink(path, LOG);
+ }
+
+ elog(DEBUG1, "replayed %d (all) changes from file \"%s\"",
+ nchanges, path);
+
+ return;
+}
+
+/*
+ * Create a file to which changes will be written.
+ */
+static void
+create_delay_file(TransactionId xid)
+{
+ char path[MAXPGPATH];
+ int fd;
+
+ Assert(TransactionIdIsValid(xid));
+ Assert(delayed_fd < 0);
+
+ delay_file_name(path, MyLogicalRepWorker->subid, xid);
+
+ elog(DEBUG1, "creating a file \"%s\" for time-delayed logical replication",
+ path);
+
+ fd = BasicOpenFile(path, O_WRONLY | O_CREAT | O_EXCL | O_APPEND | PG_BINARY);
+
+ if (fd < 0)
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not create file \"%s\": %m",
+ path));
+
+ delayed_fd = fd;
+}
+
+/*
+ * Create a directory that holds delayed files
+ */
+static void
+initialize_delay_directory(void)
+{
+ char path[MAXPGPATH];
+
+ snprintf(path, MAXPGPATH, DELAYEDDIR);
+ if (MakePGDirectory(path) < 0 && errno != EEXIST)
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not create directory \"%s\": %m",
+ path));
+
+ START_CRIT_SECTION();
+ fsync_fname(path, true);
+ END_CRIT_SECTION();
+}
+
+/*
+ * Read the delayed file and cache information of transaction e.g. committime
+ */
+static bool
+ReadCommitRecord(int fd, TransactionId xid)
+{
+ int len = 0;
+ char action = 0;
+ char *buffer;
+ StringInfoData commit_message;
+ LogicalRepCommitData commit_data = {0};
+
+ /*
+ * If the transaction is not 2PC, we can assume that the decoded commit
+ * record is at the end of the file. Therefore, read from the end.
+ */
+
+ /* FIXME: size of the messages is estimated from the document */
+#define COMMIT_MESSAGE_SIZE (sizeof(int) + sizeof(char) + sizeof(int8) + sizeof(LogicalRepCommitData))
+
+ /* seek to the start of the commit message at the end of the file */
+ lseek(fd, -COMMIT_MESSAGE_SIZE, SEEK_END);
+
+ read(fd, &len, sizeof(int));
+ read(fd, &action, sizeof(char));
+
+ /*
+ * If the action is not 'C' or the length read is not valid, the
+ * transaction may be 2PC. So stop reading.
+ */
+ if (len != (COMMIT_MESSAGE_SIZE - sizeof(int)) ||
+ action != LOGICAL_REP_MSG_COMMIT)
+ return false;
+
+ /*
+ * If we reach here, the file appears to hold a valid, non-2PC transaction.
+ * Read the rest of the record and cache it in memory to start delaying.
+ */
+
+ /* Prepare buffer and read from file */
+ buffer = palloc0(len - sizeof(char));
+ read(fd, buffer, len - sizeof(char));
+
+ /* Append to a StringInfo so that logicalrep_read_commit() can be reused */
+ initStringInfo(&commit_message);
+ appendBinaryStringInfo(&commit_message, buffer, len - sizeof(char));
+
+ /* Finally, read the decoded commit record */
+ logicalrep_read_commit(&commit_message, &commit_data);
+
+ /* ..and cache into the list */
+ cache_commit_data(&commit_data, xid);
+
+ pfree(buffer);
+ pfree(commit_message.data);
+
+#undef COMMIT_MESSAGE_SIZE
+
+ return true;
+}
+
+/*
+ * Read the delayed file and cache transaction information, e.g. the commit time.
+ *
+ * Note that, unlike the above, the native PREPARE/COMMIT PREPARED message is
+ * not written directly into the file. This is because the gid can have
+ * arbitrary length, so we cannot compute the offset of these records from the
+ * end of the file. Instead, the important information - prepare/commit_lsn,
+ * end_lsn, prepare/commit_time, and the transaction id - is serialized.
+ * The functions for PREPARE and COMMIT PREPARED were combined because both
+ * records have the same attributes.
+ */
+static bool
+ReadPreparedCommonRecord(int fd)
+{
+ int len = 0;
+ char action = 0;
+
+ /*
+ * If the transaction is 2PC, we can assume that the final record is either
+ * a decoded PREPARE or a decoded COMMIT PREPARED.
+ */
+
+ /*
+ * XXX: Modified message contains
+ * - length
+ * - message type
+ * - prepare/commit_lsn, end_lsn
+ * - prepare/commit_time
+ * - xid
+ */
+#define PREPARE_MESSAGE_SIZE (sizeof(int) + sizeof(char) + sizeof(XLogRecPtr) + sizeof(XLogRecPtr) + sizeof(TimestampTz) + sizeof(TransactionId))
+ lseek(fd, -PREPARE_MESSAGE_SIZE, SEEK_END);
+ read(fd, &len, sizeof(int));
+ read(fd, &action, sizeof(char));
+
+ /*
+ * Handle the record if it appears to be PREPARE or COMMIT PREPARED
+ */
+ if (len == (PREPARE_MESSAGE_SIZE - sizeof(int)) &&
+ action == LOGICAL_REP_MSG_PREPARE)
+ {
+ /* For PREPARE, do nothing */
+ return true;
+ }
+ else if (len == (PREPARE_MESSAGE_SIZE - sizeof(int)) &&
+ action == LOGICAL_REP_MSG_COMMIT_PREPARED)
+ {
+ /* For COMMIT PREPARED, cache into memory and start to delay */
+
+ LogicalRepCommitData commit_data = {0};
+ TransactionId xid = InvalidTransactionId;
+
+ /* Read the serialized fields directly, in the order they were written */
+ read(fd, &commit_data.commit_lsn, sizeof(XLogRecPtr));
+ read(fd, &commit_data.end_lsn, sizeof(XLogRecPtr));
+ read(fd, &commit_data.committime, sizeof(TimestampTz));
+ read(fd, &xid, sizeof(TransactionId));
+
+ cache_commit_data(&commit_data, xid);
+
+ return true;
+ }
+ else
+ return false;
+}
+
+/*
+ * Transform information from commit_prepared style to commit style.
+ */
+static void
+ConstructCommitFromCommitPrepared(LogicalRepCommitData *commit,
+ LogicalRepCommitPreparedTxnData *prepare_data)
+{
+ commit->commit_lsn = prepare_data->commit_lsn;
+ commit->committime = prepare_data->commit_time;
+ commit->end_lsn = prepare_data->end_lsn;
+}
+
+/*
+ * Restore the delayed transaction from the given file.
+ */
+static void
+RestoreDelayedTxn(char *path)
+{
+ Oid subid = InvalidOid;
+ TransactionId xid = InvalidTransactionId;
+ int fd;
+
+ /* Check filename to extract subid and xid */
+ extract_info_from_delay_file(path, &subid, &xid);
+
+ /*
+ * If the subid does not belong to this apply worker, the transaction is
+ * out of scope for us.
+ */
+ if (MyLogicalRepWorker->subid != subid)
+ return;
+
+ /* OK, the transaction must be maintained by the worker. Open file */
+ fd = BasicOpenFile(path, O_RDONLY | PG_BINARY);
+
+ /* And restore from the end of the file */
+ if (ReadCommitRecord(fd, xid))
+ goto cleanup;
+
+ if (ReadPreparedCommonRecord(fd))
+ goto cleanup;
+
+ /*
+ * If we reach here, the file seems to be corrupted. Remove it so that the
+ * changes are received again.
+ */
+ close(fd);
+ durable_unlink(path, LOG);
+ return;
+
+cleanup:
+ close(fd);
+}
+
+/*
+ * Restore all the delayed transactions to memory.
+ */
+static void
+RestoreDelayedTxns(void)
+{
+ DIR *delayed_dir;
+ struct dirent *delayed_de;
+
+ /* Read all the files one by one */
+ delayed_dir = AllocateDir(DELAYEDDIR);
+ while ((delayed_de = ReadDir(delayed_dir, DELAYEDDIR)) != NULL)
+ {
+ char path[MAXPGPATH];
+ PGFileType de_type;
+
+ if (strcmp(delayed_de->d_name, ".") == 0 ||
+ strcmp(delayed_de->d_name, "..") == 0)
+ continue;
+
+ /* Check the filename and status */
+ snprintf(path, sizeof(path), DELAYEDDIR "/%s", delayed_de->d_name);
+ de_type = get_dirent_type(path, delayed_de, false, DEBUG1);
+
+ if (de_type != PGFILETYPE_REG)
+ continue;
+
+ /* Found a delayed transaction. Restore it. */
+ RestoreDelayedTxn(path);
+ }
+ FreeDir(delayed_dir);
+}
+
+/*
+ * Restore delayed transactions, or initialize the directory
+ */
+static void
+InitializeDelayedTxn(void)
+{
+ struct stat st;
+ char path[MAXPGPATH];
+
+ snprintf(path, MAXPGPATH, DELAYEDDIR);
+
+ /*
+ * If the given directory does not exist, create one. Otherwise start to
+ * restore.
+ */
+ if (stat(path, &st) != 0)
+ {
+ initialize_delay_directory();
+ return;
+ }
+
+ RestoreDelayedTxns();
+}
+
+/*
+ * Write the given message to the delayed file. This is called for every
+ * message and returns true only when the change was written to the file.
+ *
+ * The format of the serialized changes is the same as the streamed one: a
+ * length (not including the length field itself), an action code (identifying
+ * the message type), and the message contents (without the subxact
+ * TransactionId value).
+ */
+static bool
+handle_delayed_transaction(char action, StringInfo s)
+{
+ int len;
+
+ /* Return if we are not in a delayed transaction */
+ if (!in_delayed_transaction)
+ return false;
+
+ Assert(delayed_fd > 0);
+ Assert(TransactionIdIsValid(delayed_xid));
+
+ len = (s->len - s->cursor) + sizeof(char);
+
+ if (write(delayed_fd, &len, sizeof(len)) != sizeof(len))
+ abort();
+ if (write(delayed_fd, &action, sizeof(action)) != sizeof(action))
+ abort();
+
+ len = (s->len - s->cursor);
+
+ if (write(delayed_fd, &s->data[s->cursor], len) != len)
+ abort();
+
+ return true;
+}
+
+/*
+ * Write the given information from PREPARE/COMMIT PREPARED to a file. This is
+ * called when we receive a PREPARE or COMMIT PREPARED message. Any write
+ * failure aborts the process.
+ *
+ * For why this function is needed, see the comments atop
+ * ReadPreparedCommonRecord().
+ */
+static void
+handle_delayed_prepared(char action, XLogRecPtr prepare_lsn,
+ XLogRecPtr end_lsn, TimestampTz prepare_time,
+ TransactionId xid)
+{
+ int len;
+
+ Assert(delayed_fd > 0);
+
+#define MESSAGE_SIZE (sizeof(char) + sizeof(XLogRecPtr) + sizeof(XLogRecPtr) + sizeof(TimestampTz) + sizeof(TransactionId))
+ len = MESSAGE_SIZE;
+
+ /*
+ * XXX: Modified message contains
+ * - length
+ * - message type
+ * - prepare/commit_lsn, end_lsn
+ * - prepare/commit_time
+ * - xid
+ */
+ if (write(delayed_fd, &len, sizeof(len)) != sizeof(len))
+ abort();
+ if (write(delayed_fd, &action, sizeof(action)) != sizeof(action))
+ abort();
+ if (write(delayed_fd, &prepare_lsn, sizeof(prepare_lsn)) != sizeof(prepare_lsn))
+ abort();
+ if (write(delayed_fd, &end_lsn, sizeof(end_lsn)) != sizeof(end_lsn))
+ abort();
+ if (write(delayed_fd, &prepare_time, sizeof(prepare_time)) != sizeof(prepare_time))
+ abort();
+ if (write(delayed_fd, &xid, sizeof(xid)) != sizeof(xid))
+ abort();
+#undef MESSAGE_SIZE
+}
+
+/*
+ * Check the delayed transactions and apply those whose delay has elapsed
+ */
+static void
+check_delayed_transaction(void)
+{
+ TimestampTz now;
+ ListCell *lc;
+ int n = 0;
+
+ if (in_streamed_transaction)
+ return;
+
+ now = GetCurrentTimestamp();
+
+ /* Read the cache one by one */
+ foreach(lc, DelayedTxnList)
+ {
+ DelayedTxnListEntry *entry = (DelayedTxnListEntry *) lfirst(lc);
+ LogicalRepCommitData *commit_data = &entry->commit_data;
+ TimestampTz delayUntil;
+ long diffms;
+
+ delayUntil = TimestampTzPlusMilliseconds(commit_data->committime,
+ MySubscription->minapplydelay);
+
+ diffms = TimestampDifferenceMilliseconds(now, delayUntil);
+
+ /*
+ * The cache follows commit order, so we need not check later entries
+ * once we find a transaction that is not yet due to be applied.
+ */
+ if (diffms > 0)
+ break;
+
+ elog(DEBUG1, "started to apply transaction %u", entry->xid);
+
+ apply_delayed_transaction(entry->xid, commit_data->end_lsn);
+ apply_handle_commit_internal(commit_data);
+
+ delayed_fd = -1;
+ delayed_xid = InvalidTransactionId;
+ in_delayed_transaction = false;
+ n++;
+ }
+ /* Discard applied entries */
+ DelayedTxnList = list_delete_first_n(DelayedTxnList, n);
+}
+
/*
* Return the name of the logical replication worker.
*/
@@ -1018,13 +1682,28 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
- remote_final_lsn = begin_data.final_lsn;
+ /*
+ * Prepare to write changes into file if time-delayed replication is
+ * requested.
+ */
+ if (MySubscription->minapplydelay && AllTablesyncsReady())
+ {
+ in_delayed_transaction = true;
- maybe_start_skipping_changes(begin_data.final_lsn);
+ create_delay_file(begin_data.xid);
- in_remote_transaction = true;
+ delayed_xid = begin_data.xid;
+ }
+ else
+ {
+ remote_final_lsn = begin_data.final_lsn;
- pgstat_report_activity(STATE_RUNNING, NULL);
+ maybe_start_skipping_changes(begin_data.final_lsn);
+
+ in_remote_transaction = true;
+
+ pgstat_report_activity(STATE_RUNNING, NULL);
+ }
}
/*
@@ -1036,20 +1715,40 @@ static void
apply_handle_commit(StringInfo s)
{
LogicalRepCommitData commit_data;
+ /* Save the message before it is consumed. */
+ StringInfoData original_msg = *s;
+
+ /*
+ * If we are applying a delayed transaction, skip this message.
+ * The actual COMMIT is done outside apply_delayed_transaction().
+ */
+ if (delayed_fd > 0 && !in_delayed_transaction)
+ return;
logicalrep_read_commit(s, &commit_data);
- if (commit_data.commit_lsn != remote_final_lsn)
- ereport(ERROR,
- (errcode(ERRCODE_PROTOCOL_VIOLATION),
- errmsg_internal("incorrect commit LSN %X/%X in commit message (expected %X/%X)",
- LSN_FORMAT_ARGS(commit_data.commit_lsn),
- LSN_FORMAT_ARGS(remote_final_lsn))));
+ /* If we are spooling a delayed transaction, write to the file instead. */
+
+ if (in_delayed_transaction)
+ {
+ /* Write the commit message to the file and flush all messages */
+ handle_delayed_transaction(LOGICAL_REP_MSG_COMMIT, &original_msg);
+ flush_delayed_changes(&commit_data);
+ }
+ else
+ {
+ if (commit_data.commit_lsn != remote_final_lsn)
+ ereport(ERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg_internal("incorrect commit LSN %X/%X in commit message (expected %X/%X)",
+ LSN_FORMAT_ARGS(commit_data.commit_lsn),
+ LSN_FORMAT_ARGS(remote_final_lsn))));
- apply_handle_commit_internal(&commit_data);
+ apply_handle_commit_internal(&commit_data);
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(commit_data.end_lsn);
+ /* Process any tables that are being synchronized in parallel. */
+ process_syncing_tables(commit_data.end_lsn);
+ }
pgstat_report_activity(STATE_IDLE, NULL);
reset_apply_error_context_info();
@@ -1075,13 +1774,28 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
- remote_final_lsn = begin_data.prepare_lsn;
+ /*
+ * Prepare to write changes into file if time-delayed replication is
+ * requested.
+ */
+ if (MySubscription->minapplydelay && AllTablesyncsReady())
+ {
+ in_delayed_transaction = true;
+
+ create_delay_file(begin_data.xid);
+
+ delayed_xid = begin_data.xid;
+ }
+ else
+ {
+ remote_final_lsn = begin_data.prepare_lsn;
- maybe_start_skipping_changes(begin_data.prepare_lsn);
+ maybe_start_skipping_changes(begin_data.prepare_lsn);
- in_remote_transaction = true;
+ in_remote_transaction = true;
- pgstat_report_activity(STATE_RUNNING, NULL);
+ pgstat_report_activity(STATE_RUNNING, NULL);
+ }
}
/*
@@ -1123,57 +1837,115 @@ apply_handle_prepare_internal(LogicalRepPreparedTxnData *prepare_data)
/*
* Handle PREPARE message.
+ *
+ * When time-delayed logical replication is requested, we just write a message
+ * into the file and return. This means that no transaction is prepared on the
+ * subscriber. This prevents the apply worker from holding locks for a long
+ * time because of a long min_apply_delay.
+ *
+ * Even when the transaction is applied from the delayed file, it is not
+ * prepared. We just skip the PREPARE message.
*/
static void
apply_handle_prepare(StringInfo s)
{
LogicalRepPreparedTxnData prepare_data;
- logicalrep_read_prepare(s, &prepare_data);
+ /*
+ * If we are applying the delayed transaction, just consume the PREPARE
+ * message and return.
+ */
+ if (delayed_fd > 0 && !in_delayed_transaction)
+ {
+ /* Consume the unneeded fields */
+ (void) pq_getmsgint64(s);
+ (void) pq_getmsgint64(s);
+ (void) pq_getmsgint64(s);
+ (void) pq_getmsgint(s, 4);
- if (prepare_data.prepare_lsn != remote_final_lsn)
- ereport(ERROR,
- (errcode(ERRCODE_PROTOCOL_VIOLATION),
- errmsg_internal("incorrect prepare LSN %X/%X in prepare message (expected %X/%X)",
- LSN_FORMAT_ARGS(prepare_data.prepare_lsn),
- LSN_FORMAT_ARGS(remote_final_lsn))));
+ return;
+ }
/*
- * Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction or all changes are skipped. It
- * is done this way because at commit prepared time, we won't know whether
- * we have skipped preparing a transaction because of those reasons.
- *
- * XXX, We can optimize such that at commit prepared time, we first check
- * whether we have prepared the transaction or not but that doesn't seem
- * worthwhile because such cases shouldn't be common.
+ * If we are writing changes into the delayed file, construct a modified
+ * message and write it. This avoids writing the gid into the file.
+ * For details, see the comments atop ReadPreparedCommonRecord().
*/
- begin_replication_step();
+ if (in_delayed_transaction)
+ {
+ /* Read the message, then write the modified form */
+ logicalrep_read_prepare(s, &prepare_data);
+ handle_delayed_prepared(LOGICAL_REP_MSG_PREPARE,
+ prepare_data.prepare_lsn,
+ prepare_data.end_lsn,
+ prepare_data.prepare_time,
+ prepare_data.xid);
+
+ /* Flush changes */
+ if (pg_fdatasync(delayed_fd) != 0)
+ {
+ int save_errno = errno;
+ close(delayed_fd);
+ errno = save_errno;
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not fsync file"));
+ }
- apply_handle_prepare_internal(&prepare_data);
+ /* Store flushed lsn */
+ last_flushed = prepare_data.end_lsn;
- end_replication_step();
- CommitTransactionCommand();
- pgstat_report_stat(false);
+ /* Cleanup */
+ close(delayed_fd);
- store_flush_position(prepare_data.end_lsn, XactLastCommitEnd);
+ delayed_fd = -1;
+ delayed_xid = InvalidTransactionId;
+ in_delayed_transaction = false;
+ }
+ else
+ {
+ logicalrep_read_prepare(s, &prepare_data);
- in_remote_transaction = false;
+ if (prepare_data.prepare_lsn != remote_final_lsn)
+ ereport(ERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg_internal("incorrect prepare LSN %X/%X in prepare message (expected %X/%X)",
+ LSN_FORMAT_ARGS(prepare_data.prepare_lsn),
+ LSN_FORMAT_ARGS(remote_final_lsn))));
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(prepare_data.end_lsn);
+ /*
+ * Unlike commit, here, we always prepare the transaction even though no
+ * change has happened in this transaction or all changes are skipped. It
+ * is done this way because at commit prepared time, we won't know whether
+ * we have skipped preparing a transaction because of those reasons.
+ *
+ * XXX, We can optimize such that at commit prepared time, we first check
+ * whether we have prepared the transaction or not but that doesn't seem
+ * worthwhile because such cases shouldn't be common.
+ */
+ begin_replication_step();
- /*
- * Since we have already prepared the transaction, in a case where the
- * server crashes before clearing the subskiplsn, it will be left but the
- * transaction won't be resent. But that's okay because it's a rare case
- * and the subskiplsn will be cleared when finishing the next transaction.
- */
- stop_skipping_changes();
- clear_subscription_skip_lsn(prepare_data.prepare_lsn);
+ apply_handle_prepare_internal(&prepare_data);
- pgstat_report_activity(STATE_IDLE, NULL);
- reset_apply_error_context_info();
+ end_replication_step();
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
+
+ store_flush_position(prepare_data.end_lsn, XactLastCommitEnd);
+
+ in_remote_transaction = false;
+
+ /* Process any tables that are being synchronized in parallel. */
+ process_syncing_tables(prepare_data.end_lsn);
+
+ /*
+ * Since we have already prepared the transaction, in a case where the
+ * server crashes before clearing the subskiplsn, it will be left but the
+ * transaction won't be resent. But that's okay because it's a rare case
+ * and the subskiplsn will be cleared when finishing the next transaction.
+ */
+ stop_skipping_changes();
+ clear_subscription_skip_lsn(prepare_data.prepare_lsn);
+ }
}
/*
@@ -1191,38 +1963,95 @@ apply_handle_commit_prepared(StringInfo s)
LogicalRepCommitPreparedTxnData prepare_data;
char gid[GIDSIZE];
- logicalrep_read_commit_prepared(s, &prepare_data);
- set_apply_error_context_xact(prepare_data.xid, prepare_data.commit_lsn);
-
- /* Compute GID for two_phase transactions. */
- TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
- gid, sizeof(gid));
+ if (delayed_fd > 0 && !in_delayed_transaction)
+ return;
- /* There is no transaction when COMMIT PREPARED is called */
- begin_replication_step();
+ logicalrep_read_commit_prepared(s, &prepare_data);
/*
- * Update origin state so we can restart streaming from correct position
- * in case of crash.
+ * Check whether a delayed file exists for this transaction. If one exists
+ * and is not currently open, time-delayed logical replication was requested
+ * when the transaction was prepared, so write the modified message to it.
+ * Otherwise, the transaction will be committed normally.
*/
- replorigin_session_origin_lsn = prepare_data.end_lsn;
- replorigin_session_origin_timestamp = prepare_data.commit_time;
+ if (delayed_fd < 0 &&
+ is_given_transaction_delayed(MyLogicalRepWorker->subid, prepare_data.xid))
+ {
+ char path[MAXPGPATH];
+ LogicalRepCommitData commit_data = {0};
- FinishPreparedTransaction(gid, true);
- end_replication_step();
- CommitTransactionCommand();
- pgstat_report_stat(false);
+ /* Open file again */
+ delay_file_name(path, MyLogicalRepWorker->subid, prepare_data.xid);
+ delayed_fd = BasicOpenFile(path, O_WRONLY | O_APPEND | PG_BINARY);
+ if (delayed_fd < 0)
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not open file \"%s\": %m",
+ path));
+
+ /* Write modified message to file */
+ handle_delayed_prepared(LOGICAL_REP_MSG_COMMIT_PREPARED,
+ prepare_data.commit_lsn,
+ prepare_data.end_lsn,
+ prepare_data.commit_time,
+ prepare_data.xid);
+ /* Flush it */
+ if (pg_fdatasync(delayed_fd) != 0)
+ {
+ int save_errno = errno;
+ close(delayed_fd);
+ errno = save_errno;
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not fsync file"));
+ }
- store_flush_position(prepare_data.end_lsn, XactLastCommitEnd);
- in_remote_transaction = false;
+ /* Store flushed lsn */
+ last_flushed = prepare_data.end_lsn;
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(prepare_data.end_lsn);
+ /* clean up */
+ close(delayed_fd);
- clear_subscription_skip_lsn(prepare_data.end_lsn);
+ delayed_fd = -1;
+ delayed_xid = InvalidTransactionId;
+ in_delayed_transaction = false;
- pgstat_report_activity(STATE_IDLE, NULL);
- reset_apply_error_context_info();
+ ConstructCommitFromCommitPrepared(&commit_data, &prepare_data);
+
+ /* Cache the committed transaction */
+ cache_commit_data(&commit_data, prepare_data.xid);
+ }
+ else
+ {
+ set_apply_error_context_xact(prepare_data.xid, prepare_data.commit_lsn);
+
+ /* Compute GID for two_phase transactions. */
+ TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
+ gid, sizeof(gid));
+
+ /* There is no transaction when COMMIT PREPARED is called */
+ begin_replication_step();
+
+ /*
+ * Update origin state so we can restart streaming from correct position
+ * in case of crash.
+ */
+ replorigin_session_origin_lsn = prepare_data.end_lsn;
+ replorigin_session_origin_timestamp = prepare_data.commit_time;
+
+ FinishPreparedTransaction(gid, true);
+ end_replication_step();
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
+
+ store_flush_position(prepare_data.end_lsn, XactLastCommitEnd);
+ in_remote_transaction = false;
+
+ /* Process any tables that are being synchronized in parallel. */
+ process_syncing_tables(prepare_data.end_lsn);
+
+ clear_subscription_skip_lsn(prepare_data.end_lsn);
+ }
}
/*
@@ -1241,6 +2070,20 @@ apply_handle_rollback_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_rollback_prepared(s, &rollback_data);
+
+ /*
+ * If the delayed file exists, just remove it. The delayed transaction has
+ * never been prepared, so it's OK not to call FinishPreparedTransaction().
+ */
+ if (is_given_transaction_delayed(MyLogicalRepWorker->subid, rollback_data.xid))
+ {
+ char path[MAXPGPATH];
+ delay_file_name(path, MyLogicalRepWorker->subid, rollback_data.xid);
+ durable_unlink(path, LOG);
+
+ return;
+ }
+
set_apply_error_context_xact(rollback_data.xid, rollback_data.rollback_end_lsn);
/* Compute GID for two_phase transactions. */
@@ -1316,16 +2159,68 @@ apply_handle_stream_prepare(StringInfo s)
switch (apply_action)
{
case TRANS_LEADER_APPLY:
+ /*
+ * If time-delayed replication is requested, write changes to a
+ * permanent file instead of a temporary one.
+ */
+ if (MySubscription->minapplydelay)
+ {
+ in_delayed_transaction = true;
+
+ create_delay_file(prepare_data.xid);
+
+ delayed_xid = prepare_data.xid;
+ }
/*
* The transaction has been serialized to file, so replay all the
* spooled operations.
+ * Note that if time-delayed replication is requested, changes are
+ * written into the permanent file here.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset,
prepare_data.xid, prepare_data.prepare_lsn);
- /* Mark the transaction as prepared. */
- apply_handle_prepare_internal(&prepare_data);
+
+ /*
+ * If time-delayed replication is requested, construct a modified
+ * message and write it. This avoids writing the gid into the file.
+ * For details, see the comments atop ReadPreparedCommonRecord().
+ */
+ if (MySubscription->minapplydelay)
+ {
+ /* Write the modified message */
+ handle_delayed_prepared(LOGICAL_REP_MSG_PREPARE,
+ prepare_data.prepare_lsn,
+ prepare_data.end_lsn,
+ prepare_data.prepare_time,
+ prepare_data.xid);
+
+ /* Flush changes */
+ if (pg_fdatasync(delayed_fd) != 0)
+ {
+ int save_errno = errno;
+ close(delayed_fd);
+ errno = save_errno;
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not fsync file"));
+ }
+
+ /* Store flushed lsn */
+ last_flushed = prepare_data.end_lsn;
+
+ close(delayed_fd);
+
+ delayed_fd = -1;
+ delayed_xid = InvalidTransactionId;
+ in_delayed_transaction = false;
+ }
+ else
+ {
+ /* Mark the transaction as prepared. */
+ apply_handle_prepare_internal(&prepare_data);
+ }
CommitTransactionCommand();
@@ -1404,8 +2299,11 @@ apply_handle_stream_prepare(StringInfo s)
pgstat_report_stat(false);
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(prepare_data.end_lsn);
+ if (list_length(DelayedTxnList) == 0)
+ {
+ /* Process any tables that are being synchronized in parallel. */
+ process_syncing_tables(prepare_data.end_lsn);
+ }
/*
* Similar to prepare case, the subskiplsn could be left in a case of
@@ -2174,19 +3072,43 @@ apply_handle_stream_commit(StringInfo s)
{
case TRANS_LEADER_APPLY:
+ /*
+ * If time-delayed replication is requested, write changes to a
+ * permanent file instead of a temporary one.
+ */
+ if (MySubscription->minapplydelay)
+ {
+ in_delayed_transaction = true;
+
+ create_delay_file(xid);
+
+ delayed_xid = xid;
+ }
+
/*
* The transaction has been serialized to file, so replay all the
* spooled operations.
+ * Note that if time-delayed replication is requested, changes are
+ * written into the permanent file here.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset, xid,
- commit_data.commit_lsn);
+ commit_data.commit_lsn);
- apply_handle_commit_internal(&commit_data);
+
+ /* Flush changes if time-delayed replication is requested */
+ if (MySubscription->minapplydelay)
+ {
+ handle_delayed_transaction(LOGICAL_REP_MSG_COMMIT, &original_msg);
+ flush_delayed_changes(&commit_data);
+ }
+ else
+ apply_handle_commit_internal(&commit_data);
/* Unlink the files with serialized changes and subxact info. */
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
elog(DEBUG1, "finished processing the STREAM COMMIT command");
+
break;
case TRANS_LEADER_SEND_TO_PARALLEL:
@@ -2248,8 +3170,11 @@ apply_handle_stream_commit(StringInfo s)
break;
}
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(commit_data.end_lsn);
+ if (list_length(DelayedTxnList) == 0)
+ {
+ /* Process any tables that are being synchronized in parallel. */
+ process_syncing_tables(commit_data.end_lsn);
+ }
pgstat_report_activity(STATE_IDLE, NULL);
@@ -2324,7 +3249,8 @@ apply_handle_relation(StringInfo s)
{
LogicalRepRelation *rel;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_RELATION, s))
+ if (handle_streamed_transaction(LOGICAL_REP_MSG_RELATION, s) ||
+ handle_delayed_transaction(LOGICAL_REP_MSG_RELATION, s))
return;
rel = logicalrep_read_rel(s);
@@ -2347,7 +3273,8 @@ apply_handle_type(StringInfo s)
{
LogicalRepTyp typ;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_TYPE, s))
+ if (handle_streamed_transaction(LOGICAL_REP_MSG_TYPE, s) ||
+ handle_delayed_transaction(LOGICAL_REP_MSG_TYPE, s))
return;
logicalrep_read_typ(s, &typ);
@@ -2405,7 +3332,8 @@ apply_handle_insert(StringInfo s)
* streamed transactions.
*/
if (is_skipping_changes() ||
- handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
+ handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s) ||
+ handle_delayed_transaction(LOGICAL_REP_MSG_INSERT, s))
return;
begin_replication_step();
@@ -2544,7 +3472,8 @@ apply_handle_update(StringInfo s)
* streamed transactions.
*/
if (is_skipping_changes() ||
- handle_streamed_transaction(LOGICAL_REP_MSG_UPDATE, s))
+ handle_streamed_transaction(LOGICAL_REP_MSG_UPDATE, s) ||
+ handle_delayed_transaction(LOGICAL_REP_MSG_UPDATE, s))
return;
begin_replication_step();
@@ -2712,7 +3641,8 @@ apply_handle_delete(StringInfo s)
* streamed transactions.
*/
if (is_skipping_changes() ||
- handle_streamed_transaction(LOGICAL_REP_MSG_DELETE, s))
+ handle_streamed_transaction(LOGICAL_REP_MSG_DELETE, s) ||
+ handle_delayed_transaction(LOGICAL_REP_MSG_DELETE, s))
return;
begin_replication_step();
@@ -3129,7 +4059,8 @@ apply_handle_truncate(StringInfo s)
* streamed transactions.
*/
if (is_skipping_changes() ||
- handle_streamed_transaction(LOGICAL_REP_MSG_TRUNCATE, s))
+ handle_streamed_transaction(LOGICAL_REP_MSG_TRUNCATE, s) ||
+ handle_delayed_transaction(LOGICAL_REP_MSG_TRUNCATE, s))
return;
begin_replication_step();
@@ -3390,6 +4321,10 @@ get_flush_position(XLogRecPtr *write, XLogRecPtr *flush,
}
}
+ /* If changes have been written to a file, report that LSN instead */
+ if (last_flushed > *flush)
+ *flush = last_flushed;
+
*have_pending_txes = !dlist_is_empty(&lsn_mapping);
}
@@ -3586,9 +4521,13 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
maybe_reread_subscription();
/* Process any table synchronization changes. */
- process_syncing_tables(last_received);
+ if (list_length(DelayedTxnList) == 0)
+ process_syncing_tables(last_received);
}
+ /* Check delayed transactions and apply them */
+ check_delayed_transaction();
+
/* Cleanup the memory. */
MemoryContextResetAndDeleteChildren(ApplyMessageContext);
MemoryContextSwitchTo(TopMemoryContext);
@@ -3730,8 +4669,14 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * If the logical replication subscription has unprocessed changes then do
+ * not inform the publisher that the received latest LSN is already
+ * applied and flushed, otherwise, the publisher will make a wrong
+ * assumption about the logical replication progress. Instead, just send a
+ * feedback message to avoid a replication timeout during the delay.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && (list_length(DelayedTxnList) == 0))
flushpos = writepos = recvpos;
if (writepos < last_writepos)
@@ -4534,6 +5479,9 @@ ApplyWorkerMain(Datum main_arg)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("subscription has no replication slot set")));
+ /* Check delayed files or initialize directory */
+ InitializeDelayedTxn();
+
/* Setup replication origin tracking. */
StartTransactionCommand();
ReplicationOriginNameForLogicalRep(MySubscription->oid, InvalidOid,
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 2e068c6620..e953bed76d 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4569,6 +4569,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4621,9 +4622,13 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n"
+ " 0 AS subminapplydelay\n",
+ LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4651,6 +4656,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4681,6 +4687,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminapplydelay =
+ atoi(PQgetvalue(res, i, i_subminapplydelay));
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4762,6 +4770,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '%d ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index cdca0b993d..aba4421f11 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -660,6 +660,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int subminapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 99e28f607e..80f9f27aef 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6493,7 +6493,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6548,10 +6548,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min apply delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 42e87b9e49..d24b7220b2 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..d1cfefc6d6 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -74,6 +74,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
+ int32 subminapplydelay; /* Replication apply delay (ms) */
+
bool subenabled; /* True if the subscription is enabled (the
* worker should be running) */
@@ -122,6 +124,7 @@ typedef struct Subscription
* skipped */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
+ int32 minapplydelay; /* Replication apply delay (ms) */
bool enabled; /* Indicates if the subscription is enabled */
bool binary; /* Indicates if the subscription wants data in
* binary format */
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 3f99b14394..cf8e727ee9 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -143,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -163,10 +163,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -175,10 +175,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -210,10 +210,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -247,19 +247,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -271,27 +271,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -306,10 +306,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -324,10 +324,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -363,10 +363,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -375,10 +375,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,10 +388,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -404,20 +404,57 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid value for parameter "min_apply_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_apply_delay" (0 .. 2147483647)
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+ERROR: min_apply_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot set parallel streaming mode for subscription with min_apply_delay
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ERROR: cannot set min_apply_delay for subscription in parallel streaming mode
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7281f5fee2..7317b140f5 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -286,6 +286,30 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+\dRs+
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index 91aa068c95..01f2c4284d 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -515,6 +515,37 @@ $node_publisher->poll_query_until('postgres',
or die
"Timed out while waiting for apply to restart after renaming SUBSCRIPTION";
+# Test time-delayed logical replication
+#
+# If the subscription sets min_apply_delay parameter, the logical replication
+# worker will delay the transaction apply for min_apply_delay milliseconds. We
+# verify this by looking at the time difference between a) when tuples are
+# inserted on the publisher, and b) when those changes are replicated on the
+# subscriber. Even on slow machines, this strategy will give predictable behavior.
+
+# Set min_apply_delay parameter to 3 seconds
+my $delay = 3;
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_renamed SET (min_apply_delay = '${delay}s')");
+
+# Before doing the insertion, get the current timestamp that will be
+# used as a comparison base.
+my $publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_ins VALUES (generate_series(1101, 1120))");
+
+$node_subscriber->poll_query_until('postgres',
+ "SELECT count(*) = 1 FROM tab_ins WHERE a = 1120;"
+ )
+ or die
+ "failed to replicate changes";
+
+# This test is successful if and only if the LSN has been applied with at least
+# the configured apply delay.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for non-streaming transaction"
+);
+
# check all the cleanup
$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_renamed");
diff --git a/src/test/subscription/t/032_tmp.pl b/src/test/subscription/t/032_tmp.pl
new file mode 100644
index 0000000000..b298b5ed6e
--- /dev/null
+++ b/src/test/subscription/t/032_tmp.pl
@@ -0,0 +1,141 @@
+
+# Copyright (c) 2021-2023, PostgreSQL Global Development Group
+
+# Basic logical replication test
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Initialize publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf', "max_prepared_transactions = 10");
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->append_conf('postgresql.conf', "log_min_messages = debug1");
+$node_subscriber->append_conf('postgresql.conf', "max_prepared_transactions = 10");
+$node_subscriber->start;
+
+# Create some preexisting content on publisher
+$node_publisher->safe_psql('postgres', "CREATE TABLE tab_ins (a int)");
+$node_subscriber->safe_psql('postgres', "CREATE TABLE tab_ins (a int)");
+
+# Setup logical replication
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres', "CREATE PUBLICATION tap_pub FOR ALL TABLES");
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr' PUBLICATION tap_pub WITH (two_phase = on)"
+);
+
+# Wait for initial table sync to finish
+$node_subscriber->wait_for_subscription_sync($node_publisher, 'tap_sub');
+
+# Test time-delayed logical replication
+#
+# If the subscription sets min_apply_delay parameter, the logical replication
+# worker will delay the transaction apply for min_apply_delay milliseconds. We
+# verify this by looking at the time difference between a) when tuples are
+# inserted on the publisher, and b) when those changes are replicated on the
+# subscriber. Even on slow machines, this strategy will give predictable behavior.
+
+# Set min_apply_delay parameter to 3 seconds
+my $delay = 3;
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = '${delay}s')");
+
+#
+# non-streaming
+#
+
+# Before doing the insertion, get the current timestamp that will be
+# used as a comparison base.
+my $publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_ins VALUES (1)");
+
+# The publisher waits for the replication to complete
+$node_publisher->wait_for_catchup('tap_sub');
+
+$node_subscriber->poll_query_until('postgres',
+ "SELECT count(*) = 1 FROM tab_ins;"
+ )
+ or die
+ "failed to replicate changes";
+
+# This test is successful if and only if the LSN has been applied with at least
+# the configured apply delay.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for non-streaming transaction"
+);
+
+#
+# non-streaming, 2PC
+#
+
+# $publisher_insert_time = time();
+# $node_publisher->safe_psql('postgres',"
+# BEGIN;
+# INSERT INTO tab_ins VALUES (2);
+# PREPARE TRANSACTION 'test_non_stream';");
+
+# # The publisher waits for the replication to complete
+# $node_publisher->wait_for_catchup('tap_sub');
+
+# $result = $node_subscriber->safe_psql('postgres',
+# "SELECT count(*) FROM pg_prepared_xacts;");
+# is($result, qq(1), 'transaction is prepared on subscriber');
+
+# # check that 2PC gets committed on subscriber
+# $node_publisher->safe_psql('postgres',
+# "COMMIT PREPARED 'test_non_stream';");
+
+# $node_subscriber->poll_query_until('postgres',
+# "SELECT count(*) = 1 FROM tab_ins;"
+# )
+# or die
+# "failed to replicate changes";
+
+# # This test is successful if and only if the LSN has been applied with at least
+# # the configured apply delay.
+# ok( time() - $publisher_insert_time >= $delay,
+# "subscriber applies WAL only after replication delay for non-streaming transaction"
+# );
+
+
+#
+# streaming
+#
+
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_replication_mode = immediate');
+$node_publisher->reload;
+$node_publisher->safe_psql('postgres', q{SELECT 1});
+
+$publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_ins VALUES (generate_series(2, 30))");
+
+# The publisher waits for the replication to complete
+$node_publisher->wait_for_catchup('tap_sub');
+
+$node_subscriber->poll_query_until('postgres',
+ "SELECT count(*) = 30 FROM tab_ins;"
+ )
+ or die
+ "failed to replicate changes";
+
+# This test is successful if and only if the LSN has been applied with at least
+# the configured apply delay.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for streaming transaction"
+);
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
+
+done_testing();
--
2.27.0
Dear hackers,
Starting from the v30 patch, I have made a rough prototype that serializes
changes to a permanent file and applies them after the delay has elapsed. I
think the 2PC and restore mechanisms need more analysis, but I can share the
code for discussion. What do you think?
I noticed that the patch no longer applied because of a recent commit.
Here is a rebased version.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Attachments:
v2-0001-WIP-Time-delayed-logical-replication-by-serializi.patch (application/octet-stream)
From 4388847dc19342596f5d723095d1ea24ea1f8eed Mon Sep 17 00:00:00 2001
From: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Date: Thu, 30 Mar 2023 05:25:53 +0000
Subject: [PATCH v2] (WIP) Time-delayed logical replication by serializing
changes
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_apply_delay'.
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction apply for min_apply_delay
milliseconds.
The delay is implemented by serializing changes into a file. The file is
created when the worker receives a BEGIN message. The worker writes the
received changes to it and flushes them at COMMIT. On every iteration of the
main loop, the commit time of each delayed transaction is checked, and the
transaction is applied from the file once min_apply_delay has elapsed. The
commit time is stored in memory when the transaction is committed or when the
worker restarts.
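The on-disk layout can be illustrated with a minimal standalone sketch. It
assumes each change is stored as a length-prefixed record (an int length
followed by that many bytes), which is what the read loop in
apply_delayed_transaction() expects. The names write_change and read_change
are hypothetical and are not part of the patch, which uses raw file
descriptors rather than stdio.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Append one change as a length-prefixed record: [int len][len bytes]. */
int
write_change(FILE *fp, const char *data, int len)
{
	assert(len > 0);

	if (fwrite(&len, sizeof(len), 1, fp) != 1)
		return -1;
	if (fwrite(data, 1, (size_t) len, fp) != (size_t) len)
		return -1;
	return 0;
}

/*
 * Read the next record into *buf, growing it with realloc() as needed.
 * Returns the record length, 0 at end of file, or -1 on a short read,
 * an invalid length, or an allocation failure.
 */
int
read_change(FILE *fp, char **buf)
{
	int			len;
	size_t		n = fread(&len, 1, sizeof(len), fp);
	char	   *tmp;

	if (n == 0)
		return feof(fp) ? 0 : -1;
	if (n != sizeof(len) || len <= 0)
		return -1;

	tmp = realloc(*buf, (size_t) len);
	if (tmp == NULL)
		return -1;
	*buf = tmp;

	if (fread(*buf, 1, (size_t) len, fp) != (size_t) len)
		return -1;
	return len;
}
```

Unlike the prototype's loop, this sketch treats a short read as an error
instead of ignoring the return value of the second read.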
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
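As a rough illustration of that calculation, the remaining wait can be
sketched as below. This is a simplified standalone example, not the code in
worker.c: the function name and the use of plain millisecond counters instead
of TimestampTz are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return how many milliseconds are left before a transaction may be applied,
 * or 0 if min_apply_delay has already elapsed.
 *
 * commit_time_ms is the commit timestamp taken from the publisher's WAL,
 * now_ms is the current time on the subscriber.
 */
int64_t
remaining_apply_delay_ms(int64_t commit_time_ms, int64_t now_ms,
						 int32_t min_apply_delay_ms)
{
	int64_t		apply_at;

	assert(min_apply_delay_ms >= 0);

	apply_at = commit_time_ms + min_apply_delay_ms;
	return (apply_at > now_ms) ? apply_at - now_ms : 0;
}
```

Time already spent in decoding and transfer shortens the wait, and a
subscriber clock that runs behind the publisher lengthens it, because only the
two timestamps are compared.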
The combination of parallel streaming mode and min_apply_delay is not
allowed. This is because in parallel streaming mode, we start applying
the transaction stream as soon as the first change arrives without
knowing the transaction's prepare/commit time. This means we cannot
calculate the underlying network/decoding lag between publisher and
subscriber, and so always waiting for the full 'min_apply_delay' period
might include unnecessary delay.
The other possibility was to apply the delay at the end of the parallel
apply transaction but that would cause issues related to resource
bloat and locks being held for a long time.
Currently, the combination of the skip-transaction feature and min_apply_delay
does not work well.
Earlier versions were written by Euler Taveira, Takamichi Osumi, and Kuroda Hayato
Author: Kuroda Hayato, Takamichi Osumi
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/glossary.sgml | 15 +
doc/src/sgml/logical-replication.sgml | 7 +
doc/src/sgml/ref/alter_subscription.sgml | 3 +-
doc/src/sgml/ref/create_subscription.sgml | 47 +-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 7 +-
src/backend/commands/subscriptioncmds.c | 122 ++-
src/backend/replication/logical/worker.c | 1128 ++++++++++++++++++--
src/bin/pg_dump/pg_dump.c | 15 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/test/regress/expected/subscription.out | 181 ++--
src/test/regress/sql/subscription.sql | 24 +
src/test/subscription/t/001_rep_changes.pl | 31 +
src/test/subscription/t/032_tmp.pl | 141 +++
18 files changed, 1571 insertions(+), 177 deletions(-)
create mode 100644 src/test/subscription/t/032_tmp.pl
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 7c09ab3000..6f2e348351 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7891,6 +7891,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum delay, in milliseconds, for applying changes
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 7c01a541fe..9ede9d05f6 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1729,6 +1729,21 @@
</glossdef>
</glossentry>
+ <glossentry id="glossary-time-delayed-replication">
+ <glossterm>Time-delayed replication</glossterm>
+ <glossdef>
+ <para>
+ Replication setup that delays the application of changes by a specified
+ minimum time-delay period.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="guc-recovery-min-apply-delay"/> for physical replication
+ and <xref linkend="sql-createsubscription"/> for logical replication.
+ </para>
+ </glossdef>
+ </glossentry>
+
<glossentry id="glossary-toast">
<glossterm>TOAST</glossterm>
<glossdef>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 6358c5da05..135bf93835 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -257,6 +257,13 @@
option of <command>CREATE SUBSCRIPTION</command> for details.
</para>
+ <para>
+ A subscription can delay the application of changes by specifying the
+ <link linkend="sql-createsubscription-with-min-apply-delay"><literal>min_apply_delay</literal></link>
+ subscription parameter. See <xref linkend="sql-createsubscription"/> for
+ details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 9735a82206..5e24d8992d 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -223,7 +223,8 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<link linkend="sql-createsubscription-with-binary"><literal>binary</literal></link>,
<link linkend="sql-createsubscription-with-streaming"><literal>streaming</literal></link>,
<link linkend="sql-createsubscription-with-disable-on-error"><literal>disable_on_error</literal></link>,
- and <link linkend="sql-createsubscription-with-origin"><literal>origin</literal></link>.
+ <link linkend="sql-createsubscription-with-origin"><literal>origin</literal></link>,
+ and <link linkend="sql-createsubscription-with-min-apply-delay"><literal>min_apply_delay</literal></link>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index a66b8025f3..070189f3f2 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -365,7 +365,47 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry id="sql-createsubscription-with-min-apply-delay">
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. This
+ parameter allows the user to delay the application of changes by a
+ given time period. This is done by first writing all the changes to a
+ file and then applying them once the delay has elapsed. If the value is
+ specified without units, it is taken as milliseconds. The default
+ is zero (no delay). See <xref linkend="config-setting-names-values"/>
+ for details on the available valid time units.
+ </para>
+ <para>
+ Any delay becomes effective only after all initial table
+ synchronization has finished and occurs before each transaction starts
+ to get applied on the subscriber. The delay is calculated as the
+ difference between the WAL timestamp as written on the publisher and
+ the current time on the subscriber. Any overhead of time spent in
+ logical decoding and in transferring the transaction may reduce the
+ actual wait time. Even if the overhead already exceeds the requested
+ <literal>min_apply_delay</literal> value, all the changes are written
+ into a file and applied immediately. If the system clocks on publisher
+ and subscriber are not synchronized, changes may be applied
+ earlier than expected, but this is not a major issue because this
+ parameter is typically much larger than the time deviations between
+ servers.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication means there is a much longer time between
+ making a change on the publisher, and that change being committed
+ on the subscriber. This can impact the performance of synchronous
+ replication. See <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -436,6 +476,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_apply_delay</literal> parameter is not allowed when
+ streaming in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index d322b9482c..dd35a36a80 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -66,6 +66,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->skiplsn = subform->subskiplsn;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
+ sub->minapplydelay = subform->subminapplydelay;
sub->enabled = subform->subenabled;
sub->binary = subform->subbinary;
sub->stream = subform->substream;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 8ea159dbde..00420e1f78 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1316,9 +1316,10 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subslotname, subsynccommit, subpublications, suborigin)
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 93a238412a..2e56b3b956 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -66,6 +66,7 @@
#define SUBOPT_DISABLE_ON_ERR 0x00000400
#define SUBOPT_LSN 0x00000800
#define SUBOPT_ORIGIN 0x00001000
+#define SUBOPT_MIN_APPLY_DELAY 0x00002000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -90,6 +91,7 @@ typedef struct SubOpts
bool disableonerr;
char *origin;
XLogRecPtr lsn;
+ int32 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -100,7 +102,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int32 defGetMinApplyDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -146,6 +148,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->disableonerr = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -324,6 +328,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ opts->min_apply_delay = defGetMinApplyDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -404,6 +417,32 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed. This is because in parallel streaming mode, we start applying
+ * the transaction stream as soon as the first change arrives without
+ * knowing the transaction's prepare/commit time. This means we cannot
+ * calculate the underlying network/decoding lag between publisher and
+ * subscriber, and so always waiting for the full 'min_apply_delay' period
+ * might include unnecessary delay.
+ *
+ * The other possibility was to apply the delay at the end of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0 &&
+ opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+
+ /*
+ * translator: the first %s is a string of the form "parameter > 0"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
}
/*
@@ -560,7 +599,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SLOT_NAME | SUBOPT_COPY_DATA |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
- SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN);
+ SUBOPT_DISABLE_ON_ERR | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -625,6 +665,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int32GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1054,7 +1095,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
- SUBOPT_ORIGIN);
+ SUBOPT_ORIGIN | SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1098,6 +1139,19 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
+ !IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY)
+ && sub->minapplydelay > 0)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set parallel streaming mode for subscription with %s",
+ "min_apply_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1111,6 +1165,26 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.min_apply_delay > 0 &&
+ !IsSet(opts.specified_opts, SUBOPT_STREAMING)
+ && sub->stream == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set %s for subscription in parallel streaming mode",
+ "min_apply_delay"));
+
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int32GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2232,3 +2306,45 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+/*
+ * Extract the min_apply_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_apply_delay
+ * accepts the same parameter format as recovery_min_apply_delay.
+ */
+static int32
+defGetMinApplyDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse given string as parameter which has millisecond unit
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_apply_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check both the lower boundary for the valid min_apply_delay range and
+ * the upper boundary as the safeguard for some platforms where INT_MAX is
+ * wider than int32 respectively. Although parse_int() has confirmed that
+ * the result is less than or equal to INT_MAX, the value will be stored
+ * in a catalog column of int32.
+ */
+ if (result < 0 || result > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_apply_delay",
+ 0, PG_INT32_MAX)));
+
+ return result;
+}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 10f9711972..82f7e789ec 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -431,6 +431,670 @@ static inline void reset_apply_error_context_info(void);
static TransApplyAction get_transaction_apply_action(TransactionId xid,
ParallelApplyWorkerInfo **winfo);
+static void begin_replication_step(void);
+static void end_replication_step(void);
+
+/* XXX definitions for time-delayed logical replication */
+#include "common/file_utils.h"
+
+/* DELAYEDDIR stores files that contain changes of delayed transactions. */
+#define DELAYEDDIR "pg_logical/delayed_txns"
+#define DELAYEDSUFFIX ".delayed_changes"
+
+/* List entry to map xid and commit time */
+typedef struct DelayedTxnListEntry
+{
+ TransactionId xid;
+ LogicalRepCommitData commit_data;
+} DelayedTxnListEntry;
+
+/*
+ * An entry is appended when we receive a commit message and time-delayed
+ * logical replication is requested. The entry is deleted after its contents
+ * are applied.
+ */
+static List *DelayedTxnList = NIL;
+
+/* fields valid only when time-delayed logical replication is requested */
+static bool in_delayed_transaction = false;
+
+static TransactionId delayed_xid = InvalidTransactionId;
+
+/*
+ * Store flushed lsn for time-delayed logical replication. This is used when
+ * we send a feedback message to the publisher.
+ */
+static XLogRecPtr last_flushed = InvalidXLogRecPtr;
+
+/*
+ * FIXME: a global file descriptor may not be sufficient. Non-streaming
+ * transactions may arrive concurrently, in which case create_delay_file()
+ * for the second transaction will fail...
+ */
+static int delayed_fd = -1;
+
+/*
+ * Cache commit_data into the list
+ */
+static void
+cache_commit_data(LogicalRepCommitData *commit_data, TransactionId xid)
+{
+ MemoryContext old;
+ DelayedTxnListEntry *entry;
+
+ old = MemoryContextSwitchTo(ApplyContext);
+
+ entry = palloc0(sizeof(DelayedTxnListEntry));
+
+ /* Construct an entry and append it */
+ entry->xid = xid;
+ memcpy(&entry->commit_data, commit_data, sizeof(LogicalRepCommitData));
+ DelayedTxnList = lappend(DelayedTxnList, entry);
+
+ MemoryContextSwitchTo(old);
+}
+
+/*
+ * Flush given changes and close the file. This will be called at the end of the txn
+ */
+static void
+flush_delayed_changes(LogicalRepCommitData *commit_data)
+{
+ Assert(delayed_fd > 0);
+ Assert(TransactionIdIsValid(delayed_xid));
+
+ /* Cache given commit_data into the list */
+ cache_commit_data(commit_data, delayed_xid);
+
+ /* Flush previously written changes */
+ if (pg_fdatasync(delayed_fd) != 0)
+ {
+ int save_errno = errno;
+ close(delayed_fd);
+ errno = save_errno;
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not fsync file"));
+ }
+
+ /* Store flushed lsn */
+ last_flushed = commit_data->end_lsn;
+
+ /* Cleanup */
+ close(delayed_fd);
+ delayed_fd = -1;
+ delayed_xid = InvalidTransactionId;
+ in_delayed_transaction = false;
+}
+
+/*
+ * Get formal filename from subid and xid
+ */
+static void
+delay_file_name(char *path, Oid subid, TransactionId xid)
+{
+ snprintf(path, MAXPGPATH, DELAYEDDIR "/%u-%u" DELAYEDSUFFIX, subid, xid);
+}
+
+/*
+ * Extract subid and xid from the given pathname
+ */
+static void
+extract_info_from_delay_file(char *path, Oid *subid, TransactionId *xid)
+{
+ sscanf(path, DELAYEDDIR "/%u-%u", subid, xid);
+}
+
+/*
+ * Check whether the given transaction is delayed. This is done by checking the
+ * delay file.
+ */
+static bool
+is_given_transaction_delayed(Oid subid, TransactionId xid)
+{
+ struct stat st;
+ char path[MAXPGPATH];
+
+ delay_file_name(path, subid, xid);
+
+ return stat(path, &st) == 0;
+}
+
+/*
+ * Apply the delayed transaction. This function opens the delay file, reads
+ * it, and has the apply worker apply the changes written there.
+ */
+static void
+apply_delayed_transaction(TransactionId xid, XLogRecPtr lsn)
+{
+ StringInfoData s2;
+ int nchanges;
+ char path[MAXPGPATH];
+ char *buffer = NULL;
+ MemoryContext oldcxt;
+ ResourceOwner oldowner;
+
+ /* Make sure we have an open transaction */
+ begin_replication_step();
+
+ /*
+ * Allocate file handle and memory required to process all the messages in
+ * TopTransactionContext to avoid them getting reset after each message is
+ * processed.
+ */
+ oldcxt = MemoryContextSwitchTo(TopTransactionContext);
+
+ /* Open the spool file for the committed transaction */
+ delay_file_name(path, MyLogicalRepWorker->subid, xid);
+ elog(DEBUG1, "replaying changes from file \"%s\"", path);
+
+ /*
+ * Make sure the file is owned by the toplevel transaction so that the
+ * file will not be accidentally closed when aborting a subtransaction.
+ */
+ oldowner = CurrentResourceOwner;
+ CurrentResourceOwner = TopTransactionResourceOwner;
+
+ /* Open the specified file */
+ delayed_fd = BasicOpenFile(path, O_RDONLY | PG_BINARY);
+
+ Assert(delayed_fd > 0);
+
+ CurrentResourceOwner = oldowner;
+
+ buffer = palloc(BLCKSZ);
+ initStringInfo(&s2);
+
+ MemoryContextSwitchTo(oldcxt);
+
+ remote_final_lsn = lsn;
+
+ /*
+ * Make sure the handle apply_dispatch methods are aware we're in a remote
+ * transaction.
+ */
+ in_remote_transaction = true;
+ pgstat_report_activity(STATE_RUNNING, NULL);
+
+ end_replication_step();
+
+ /*
+ * Read the entries one by one and pass them through the same logic as in
+ * apply_dispatch.
+ */
+ nchanges = 0;
+ while (true)
+ {
+ size_t nbytes;
+ int len;
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* read length of the on-disk record */
+ nbytes = read(delayed_fd, &len, sizeof(len));
+
+ /* have we reached end of the file? */
+ if (nbytes == 0)
+ break;
+
+ /* do we have a correct length? */
+ if (len <= 0)
+ elog(ERROR, "incorrect length %d in delayed transaction's changes file \"%s\"",
+ len, path);
+
+ /* make sure we have sufficiently large buffer */
+ buffer = repalloc(buffer, len);
+
+ /* and finally read the data into the buffer */
+ read(delayed_fd, buffer, len);
+
+ /* copy the buffer to the stringinfo and call apply_dispatch */
+ resetStringInfo(&s2);
+ appendBinaryStringInfo(&s2, buffer, len);
+
+ /* Ensure we are reading the data into our memory context. */
+ oldcxt = MemoryContextSwitchTo(ApplyMessageContext);
+
+ apply_dispatch(&s2);
+
+ MemoryContextReset(ApplyMessageContext);
+
+ MemoryContextSwitchTo(oldcxt);
+
+ nchanges++;
+
+ if (nchanges % 1000 == 0)
+ elog(DEBUG1, "replayed %d changes from file \"%s\"",
+ nchanges, path);
+ }
+
+ if (delayed_fd > 0)
+ {
+ close(delayed_fd);
+ delayed_fd = -1;
+ durable_unlink(path, LOG);
+ }
+
+ elog(DEBUG1, "replayed %d (all) changes from file \"%s\"",
+ nchanges, path);
+
+ return;
+}
+
+/*
+ * Create a file to which changes will be written.
+ */
+static void
+create_delay_file(TransactionId xid)
+{
+ char path[MAXPGPATH];
+ int fd;
+
+ Assert(TransactionIdIsValid(xid));
+ Assert(delayed_fd < 0);
+
+ delay_file_name(path, MyLogicalRepWorker->subid, xid);
+
+ elog(DEBUG1, "creating a file \"%s\" for time-delayed logical replication",
+ path);
+
+ fd = BasicOpenFile(path, O_WRONLY | O_CREAT | O_EXCL | O_APPEND | PG_BINARY);
+
+ if (fd < 0)
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not create file \"%s\": %m",
+ path));
+
+ delayed_fd = fd;
+}
+
+/*
+ * Create a directory that holds delayed files
+ */
+static void
+initialize_delay_directory(void)
+{
+ char path[MAXPGPATH];
+
+ snprintf(path, MAXPGPATH, DELAYEDDIR);
+ if (MakePGDirectory(path) < 0 && errno != EEXIST)
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not create directory \"%s\": %m",
+ path));
+
+ START_CRIT_SECTION();
+ fsync_fname(path, true);
+ END_CRIT_SECTION();
+}
+
+/*
+ * Read the delayed file and cache transaction information, e.g. committime.
+ */
+static bool
+ReadCommitRecord(int fd, TransactionId xid)
+{
+ int len = 0;
+ char action = 0;
+ char *buffer;
+ StringInfoData commit_message;
+ LogicalRepCommitData commit_data = {0};
+
+ /*
+ * If the transaction is not 2PC, we can assume that decoded commit record
+ * is at the end of the file. Therefore, read from the end.
+ */
+
+ /* FIXME: size of the messages is estimated from the document */
+#define COMMIT_MESSAGE_SIZE (sizeof(int) + sizeof(char) + sizeof(int8) + sizeof(LogicalRepCommitData))
+
+ /* seek to the start of the final message; cast so the offset is signed */
+ lseek(fd, -((off_t) COMMIT_MESSAGE_SIZE), SEEK_END);
+
+ read(fd, &len, sizeof(int));
+ read(fd, &action, sizeof(char));
+
+ /*
+ * If the action is not 'C' or the length is not valid, the transaction
+ * may be 2PC, so stop reading.
+ */
+ if (len != (COMMIT_MESSAGE_SIZE - sizeof(int)) ||
+ action != LOGICAL_REP_MSG_COMMIT)
+ return false;
+
+ /*
+ * If we reach here, the file appears to hold a valid, ordinary (non-2PC)
+ * transaction. Read the rest and cache it in memory so the delay can start.
+ */
+
+ /* Prepare buffer and read from file */
+ buffer = palloc0(len - sizeof(char));
+ read(fd, buffer, len - sizeof(char));
+
+ /* Append to StringInfo in order to use same read function */
+ initStringInfo(&commit_message);
+ appendBinaryStringInfo(&commit_message, buffer, len - sizeof(char));
+
+ /* Finally start to read decoded commit record */
+ logicalrep_read_commit(&commit_message, &commit_data);
+
+ /* ..and cache into the list */
+ cache_commit_data(&commit_data, xid);
+
+ pfree(buffer);
+ pfree(commit_message.data);
+
+#undef COMMIT_MESSAGE_SIZE
+
+ return true;
+}
+
+/*
+ * Read the delayed file and cache transaction information, e.g. committime.
+ *
+ * Note that, unlike the case above, the native PREPARE/COMMIT PREPARED message
+ * is not written into the file directly. This is because the gid can have
+ * arbitrary length, so we could not compute the offset of these records from
+ * the end of the file. Instead, only the important information is serialized:
+ * prepare/commit_lsn, end_lsn, prepare/commit_time, and the transaction id.
+ * The functions for PREPARE and COMMIT PREPARED are combined because both
+ * records have the same attributes.
+ */
+static bool
+ReadPreparedCommonRecord(int fd)
+{
+ int len = 0;
+ char action = 0;
+
+ /*
+ * If the transaction is 2PC, we can assume that the final record is either
+ * a serialized PREPARE or a serialized COMMIT PREPARED.
+ */
+
+ /*
+ * XXX: Modified message contains
+ * - length
+ * - message type
+ * - prepare/commit_lsn and end_lsn
+ * - prepare/commit_time
+ * - xid
+ */
+#define PREPARE_MESSAGE_SIZE (sizeof(int) + sizeof(char) + sizeof(XLogRecPtr) + sizeof(XLogRecPtr) + sizeof(TimestampTz) + sizeof(TransactionId))
+ lseek(fd, -((off_t) PREPARE_MESSAGE_SIZE), SEEK_END);
+ read(fd, &len, sizeof(int));
+ read(fd, &action, sizeof(char));
+
+ /*
+ * Do something if the record seems to be PREPARE or COMMIT PREPARED
+ */
+ if (len == (PREPARE_MESSAGE_SIZE - sizeof(int)) &&
+ action == LOGICAL_REP_MSG_PREPARE)
+ {
+ /* For PREPARE, do nothing */
+ return true;
+ }
+ else if (len == (PREPARE_MESSAGE_SIZE - sizeof(int)) &&
+ action == LOGICAL_REP_MSG_COMMIT_PREPARED)
+ {
+ /* For COMMIT PREPARED, cache into memory and start to delay */
+
+ LogicalRepCommitData commit_data = {0};
+ TransactionId xid = InvalidTransactionId;
+
+ /* Read the serialized fields directly, in the order they were written */
+ read(fd, &commit_data.commit_lsn, sizeof(XLogRecPtr));
+ read(fd, &commit_data.end_lsn, sizeof(XLogRecPtr));
+ read(fd, &commit_data.committime, sizeof(TimestampTz));
+ read(fd, &xid, sizeof(TransactionId));
+
+ cache_commit_data(&commit_data, xid);
+
+ return true;
+ }
+ else
+ return false;
+}
+
+/*
+ * Transform information from commit_prepared style to commit style.
+ */
+static void
+ConstructCommitFromCommitPrepared(LogicalRepCommitData *commit,
+ LogicalRepCommitPreparedTxnData *prepare_data)
+{
+ commit->commit_lsn = prepare_data->commit_lsn;
+ commit->committime = prepare_data->commit_time;
+ commit->end_lsn = prepare_data->end_lsn;
+}
+
+/*
+ * Restore the delayed transaction from given files.
+ */
+static void
+RestoreDelayedTxn(char *path)
+{
+ Oid subid = InvalidOid;
+ TransactionId xid = InvalidTransactionId;
+ int fd;
+
+ /* Check filename to extract subid and xid */
+ extract_info_from_delay_file(path, &subid, &xid);
+
+ /*
+ * If the subid does not belong to this apply worker, the transaction is
+ * out of scope for us.
+ */
+ if (MyLogicalRepWorker->subid != subid)
+ return;
+
+ /* OK, the transaction must be maintained by the worker. Open file */
+ fd = BasicOpenFile(path, O_RDONLY | PG_BINARY);
+
+ /* And restore from the end of the file */
+ if (ReadCommitRecord(fd, xid))
+ goto cleanup;
+
+ if (ReadPreparedCommonRecord(fd))
+ goto cleanup;
+
+ /*
+ * If we reach here, the file seems to be corrupted, so remove it and
+ * receive the changes again.
+ */
+ close(fd);
+ durable_unlink(path, LOG);
+ return;
+
+cleanup:
+ close(fd);
+}
+
+/*
+ * Restore all the delayed transactions to memory.
+ */
+static void
+RestoreDelayedTxns(void)
+{
+ DIR *delayed_dir;
+ struct dirent *delayed_de;
+
+ /* Read the files one by one */
+ delayed_dir = AllocateDir(DELAYEDDIR);
+ while ((delayed_de = ReadDir(delayed_dir, DELAYEDDIR)) != NULL)
+ {
+ char path[MAXPGPATH];
+ PGFileType de_type;
+
+ if (strcmp(delayed_de->d_name, ".") == 0 ||
+ strcmp(delayed_de->d_name, "..") == 0)
+ continue;
+
+ /* Check the filename and status */
+ snprintf(path, sizeof(path), DELAYEDDIR "/%s", delayed_de->d_name);
+ de_type = get_dirent_type(path, delayed_de, false, DEBUG1);
+
+ if (de_type != PGFILETYPE_REG)
+ continue;
+
+ /* Found a delayed transaction. Restore it. */
+ RestoreDelayedTxn(path);
+ }
+ FreeDir(delayed_dir);
+}
+
+/*
+ * Restore delayed transactions, or initialize the directory
+ */
+static void
+InitializeDelayedTxn(void)
+{
+ struct stat st;
+ char path[MAXPGPATH];
+
+ snprintf(path, MAXPGPATH, DELAYEDDIR);
+
+ /*
+ * If the given directory does not exist, create one. Otherwise start to
+ * restore.
+ */
+ if (stat(path, &st) != 0)
+ {
+ initialize_delay_directory();
+ return;
+ }
+
+ RestoreDelayedTxns();
+}
+
+/*
+ * Write a given message to a file. This is called for every message.
+ * This returns true only when changes are written into file.
+ *
+ * The format of the serialized changes is same as the streamed one. This
+ * has a length (not including the length), action code (identifying the
+ * message type) and message contents (without the subxact TransactionId
+ * value).
+ */
+static bool
+handle_delayed_transaction(char action, StringInfo s)
+{
+ int len;
+
+ /* Return if we are not in delay */
+ if (!in_delayed_transaction)
+ return false;
+
+ Assert(delayed_fd > 0);
+ Assert(TransactionIdIsValid(delayed_xid));
+
+ len = (s->len - s->cursor) + sizeof(char);
+
+ if (write(delayed_fd, &len, sizeof(len)) != sizeof(len))
+ abort();
+ if (write(delayed_fd, &action, sizeof(action)) != sizeof(action))
+ abort();
+
+ len = (s->len - s->cursor);
+
+ if (write(delayed_fd, &s->data[s->cursor], len) != len)
+ abort();
+
+ return true;
+}
+
+/*
+ * Write the given information from PREPARE/COMMIT PREPARED to a file. This is
+ * called when we receive a PREPARE or COMMIT PREPARED message. The caller must
+ * already have the delayed file open.
+ *
+ * For why this function is needed, see the comments atop
+ * ReadPreparedCommonRecord().
+ */
+static void
+handle_delayed_prepared(char action, XLogRecPtr prepare_lsn,
+ XLogRecPtr end_lsn, TimestampTz prepare_time,
+ TransactionId xid)
+{
+ int len;
+
+ Assert(delayed_fd > 0);
+
+#define MESSAGE_SIZE (sizeof(char) + sizeof(XLogRecPtr) + sizeof(XLogRecPtr) + sizeof(TimestampTz) + sizeof(TransactionId))
+ len = MESSAGE_SIZE;
+
+ /*
+ * XXX: Modified message contains
+ * - length
+ * - message type
+ * - prepare/commit_lsn and end_lsn
+ * - prepare/commit_time
+ * - xid
+ */
+ if (write(delayed_fd, &len, sizeof(len)) != sizeof(len))
+ abort();
+ if (write(delayed_fd, &action, sizeof(action)) != sizeof(action))
+ abort();
+ if (write(delayed_fd, &prepare_lsn, sizeof(prepare_lsn)) != sizeof(prepare_lsn))
+ abort();
+ if (write(delayed_fd, &end_lsn, sizeof(end_lsn)) != sizeof(end_lsn))
+ abort();
+ if (write(delayed_fd, &prepare_time, sizeof(prepare_time)) != sizeof(prepare_time))
+ abort();
+ if (write(delayed_fd, &xid, sizeof(xid)) != sizeof(xid))
+ abort();
+#undef MESSAGE_SIZE
+}
+
+/*
+ * Check the delayed transactions and apply those whose delay has elapsed.
+ */
+static void
+check_delayed_transaction(void)
+{
+ TimestampTz now;
+ ListCell *lc;
+ int n = 0;
+
+ if (in_streamed_transaction)
+ return;
+
+ now = GetCurrentTimestamp();
+
+ /* Read the cache one by one */
+ foreach(lc, DelayedTxnList)
+ {
+ DelayedTxnListEntry *entry = (DelayedTxnListEntry *) lfirst(lc);
+ LogicalRepCommitData *commit_data = &entry->commit_data;
+ TimestampTz delayUntil;
+ long diffms;
+
+ delayUntil = TimestampTzPlusMilliseconds(commit_data->committime,
+ MySubscription->minapplydelay);
+
+ diffms = TimestampDifferenceMilliseconds(now, delayUntil);
+
+ /*
+ * The cache is kept in commit order, so we need not check later entries
+ * once we find a transaction that is not yet due to be applied.
+ */
+ if (diffms > 0)
+ break;
+
+ elog(DEBUG1, "started to apply transaction %u", entry->xid);
+
+ apply_delayed_transaction(entry->xid, commit_data->end_lsn);
+ apply_handle_commit_internal(commit_data);
+
+ delayed_fd = -1;
+ delayed_xid = InvalidTransactionId;
+ in_delayed_transaction = false;
+ n++;
+ }
+ /* Discard applied entries */
+ DelayedTxnList = list_delete_first_n(DelayedTxnList, n);
+}
+
/*
* Return the name of the logical replication worker.
*/
@@ -1018,13 +1682,28 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
- remote_final_lsn = begin_data.final_lsn;
+ /*
+ * Prepare to write changes into file if time-delayed replication is
+ * requested.
+ */
+ if (MySubscription->minapplydelay && AllTablesyncsReady())
+ {
+ in_delayed_transaction = true;
- maybe_start_skipping_changes(begin_data.final_lsn);
+ create_delay_file(begin_data.xid);
- in_remote_transaction = true;
+ delayed_xid = begin_data.xid;
+ }
+ else
+ {
+ remote_final_lsn = begin_data.final_lsn;
- pgstat_report_activity(STATE_RUNNING, NULL);
+ maybe_start_skipping_changes(begin_data.final_lsn);
+
+ in_remote_transaction = true;
+
+ pgstat_report_activity(STATE_RUNNING, NULL);
+ }
}
/*
@@ -1036,20 +1715,40 @@ static void
apply_handle_commit(StringInfo s)
{
LogicalRepCommitData commit_data;
+ /* Save the message before it is consumed. */
+ StringInfoData original_msg = *s;
+
+ /*
+ * If we are applying the delayed transaction, skip here.
+ * Actual COMMIT will be done outside the apply_delayed_transaction()
+ */
+ if (delayed_fd > 0 && !in_delayed_transaction)
+ return;
logicalrep_read_commit(s, &commit_data);
- if (commit_data.commit_lsn != remote_final_lsn)
- ereport(ERROR,
- (errcode(ERRCODE_PROTOCOL_VIOLATION),
- errmsg_internal("incorrect commit LSN %X/%X in commit message (expected %X/%X)",
- LSN_FORMAT_ARGS(commit_data.commit_lsn),
- LSN_FORMAT_ARGS(remote_final_lsn))));
+ /* When delaying, spool the commit to the file; otherwise apply it now. */
+
+ if (in_delayed_transaction)
+ {
+ /* Write the commit message into the file and flush all messages */
+ handle_delayed_transaction(LOGICAL_REP_MSG_COMMIT, &original_msg);
+ flush_delayed_changes(&commit_data);
+ }
+ else
+ {
+ if (commit_data.commit_lsn != remote_final_lsn)
+ ereport(ERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg_internal("incorrect commit LSN %X/%X in commit message (expected %X/%X)",
+ LSN_FORMAT_ARGS(commit_data.commit_lsn),
+ LSN_FORMAT_ARGS(remote_final_lsn))));
- apply_handle_commit_internal(&commit_data);
+ apply_handle_commit_internal(&commit_data);
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(commit_data.end_lsn);
+ /* Process any tables that are being synchronized in parallel. */
+ process_syncing_tables(commit_data.end_lsn);
+ }
pgstat_report_activity(STATE_IDLE, NULL);
reset_apply_error_context_info();
@@ -1075,13 +1774,28 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
- remote_final_lsn = begin_data.prepare_lsn;
+ /*
+ * Prepare to write changes into file if time-delayed replication is
+ * requested.
+ */
+ if (MySubscription->minapplydelay && AllTablesyncsReady())
+ {
+ in_delayed_transaction = true;
+
+ create_delay_file(begin_data.xid);
+
+ delayed_xid = begin_data.xid;
+ }
+ else
+ {
+ remote_final_lsn = begin_data.prepare_lsn;
- maybe_start_skipping_changes(begin_data.prepare_lsn);
+ maybe_start_skipping_changes(begin_data.prepare_lsn);
- in_remote_transaction = true;
+ in_remote_transaction = true;
- pgstat_report_activity(STATE_RUNNING, NULL);
+ pgstat_report_activity(STATE_RUNNING, NULL);
+ }
}
/*
@@ -1123,57 +1837,115 @@ apply_handle_prepare_internal(LogicalRepPreparedTxnData *prepare_data)
/*
* Handle PREPARE message.
+ *
+ * When time-delayed logical replication is requested, we just write a message
+ * into the file and return. This means that no transaction is prepared on the
+ * subscriber, which avoids the apply worker holding locks for a long time
+ * because of a long min_apply_delay.
+ *
+ * Even if the transaction is applied from delayed file, the transaction is not
+ * prepared. We just skip PREPARE message.
*/
static void
apply_handle_prepare(StringInfo s)
{
LogicalRepPreparedTxnData prepare_data;
- logicalrep_read_prepare(s, &prepare_data);
+ /*
+ * If we are applying the delayed transaction, just consume the PREPARE
+ * message and return.
+ */
+ if (delayed_fd > 0 && !in_delayed_transaction)
+ {
+ /* Consume unneeded data */
+ (void) pq_getmsgint64(s);
+ (void) pq_getmsgint64(s);
+ (void) pq_getmsgint64(s);
+ (void) pq_getmsgint(s, 4);
- if (prepare_data.prepare_lsn != remote_final_lsn)
- ereport(ERROR,
- (errcode(ERRCODE_PROTOCOL_VIOLATION),
- errmsg_internal("incorrect prepare LSN %X/%X in prepare message (expected %X/%X)",
- LSN_FORMAT_ARGS(prepare_data.prepare_lsn),
- LSN_FORMAT_ARGS(remote_final_lsn))));
+ return;
+ }
/*
- * Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction or all changes are skipped. It
- * is done this way because at commit prepared time, we won't know whether
- * we have skipped preparing a transaction because of those reasons.
- *
- * XXX, We can optimize such that at commit prepared time, we first check
- * whether we have prepared the transaction or not but that doesn't seem
- * worthwhile because such cases shouldn't be common.
+ * If we are writing changes into the delayed file, construct a modified
+ * message and write it. This avoids writing the gid into the file; for
+ * details, see the comments atop ReadPreparedCommonRecord().
*/
- begin_replication_step();
+ if (in_delayed_transaction)
+ {
+ logicalrep_read_prepare(s, &prepare_data);	/* fill prepare_data before writing */
+ handle_delayed_prepared(LOGICAL_REP_MSG_PREPARE,
+ prepare_data.prepare_lsn,
+ prepare_data.end_lsn,
+ prepare_data.prepare_time,
+ prepare_data.xid);
+
+ /* Flush changes */
+ if (pg_fdatasync(delayed_fd) != 0)
+ {
+ int save_errno = errno;
+ close(delayed_fd);
+ errno = save_errno;
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not fsync delayed transaction file: %m"));
+ }
- apply_handle_prepare_internal(&prepare_data);
+ /* Store flushed lsn */
+ last_flushed = prepare_data.end_lsn;
- end_replication_step();
- CommitTransactionCommand();
- pgstat_report_stat(false);
+ /* Cleanup */
+ close(delayed_fd);
- store_flush_position(prepare_data.end_lsn, XactLastCommitEnd);
+ delayed_fd = -1;
+ delayed_xid = InvalidTransactionId;
+ in_delayed_transaction = false;
+ }
+ else
+ {
+ logicalrep_read_prepare(s, &prepare_data);
- in_remote_transaction = false;
+ if (prepare_data.prepare_lsn != remote_final_lsn)
+ ereport(ERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg_internal("incorrect prepare LSN %X/%X in prepare message (expected %X/%X)",
+ LSN_FORMAT_ARGS(prepare_data.prepare_lsn),
+ LSN_FORMAT_ARGS(remote_final_lsn))));
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(prepare_data.end_lsn);
+ /*
+ * Unlike commit, here, we always prepare the transaction even though no
+ * change has happened in this transaction or all changes are skipped. It
+ * is done this way because at commit prepared time, we won't know whether
+ * we have skipped preparing a transaction because of those reasons.
+ *
+ * XXX, We can optimize such that at commit prepared time, we first check
+ * whether we have prepared the transaction or not but that doesn't seem
+ * worthwhile because such cases shouldn't be common.
+ */
+ begin_replication_step();
- /*
- * Since we have already prepared the transaction, in a case where the
- * server crashes before clearing the subskiplsn, it will be left but the
- * transaction won't be resent. But that's okay because it's a rare case
- * and the subskiplsn will be cleared when finishing the next transaction.
- */
- stop_skipping_changes();
- clear_subscription_skip_lsn(prepare_data.prepare_lsn);
+ apply_handle_prepare_internal(&prepare_data);
- pgstat_report_activity(STATE_IDLE, NULL);
- reset_apply_error_context_info();
+ end_replication_step();
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
+
+ store_flush_position(prepare_data.end_lsn, XactLastCommitEnd);
+
+ in_remote_transaction = false;
+
+ /* Process any tables that are being synchronized in parallel. */
+ process_syncing_tables(prepare_data.end_lsn);
+
+ /*
+ * Since we have already prepared the transaction, in a case where the
+ * server crashes before clearing the subskiplsn, it will be left but the
+ * transaction won't be resent. But that's okay because it's a rare case
+ * and the subskiplsn will be cleared when finishing the next transaction.
+ */
+ stop_skipping_changes();
+ clear_subscription_skip_lsn(prepare_data.prepare_lsn);
+ }
}
/*
@@ -1191,38 +1963,95 @@ apply_handle_commit_prepared(StringInfo s)
LogicalRepCommitPreparedTxnData prepare_data;
char gid[GIDSIZE];
- logicalrep_read_commit_prepared(s, &prepare_data);
- set_apply_error_context_xact(prepare_data.xid, prepare_data.commit_lsn);
-
- /* Compute GID for two_phase transactions. */
- TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
- gid, sizeof(gid));
+ if (delayed_fd > 0 && !in_delayed_transaction)
+ return;
- /* There is no transaction when COMMIT PREPARED is called */
- begin_replication_step();
+ logicalrep_read_commit_prepared(s, &prepare_data);
/*
- * Update origin state so we can restart streaming from correct position
- * in case of crash.
+ * Check whether a delayed file exists. If a file exists and we have not
+ * opened it yet, time-delayed logical replication has been requested for
+ * this transaction, so we write the modified message into the file.
+ * Otherwise, the transaction is committed normally.
*/
- replorigin_session_origin_lsn = prepare_data.end_lsn;
- replorigin_session_origin_timestamp = prepare_data.commit_time;
+ if (delayed_fd < 0 &&
+ is_given_transaction_delayed(MyLogicalRepWorker->subid, prepare_data.xid))
+ {
+ char path[MAXPGPATH];
+ LogicalRepCommitData commit_data = {0};
- FinishPreparedTransaction(gid, true);
- end_replication_step();
- CommitTransactionCommand();
- pgstat_report_stat(false);
+ /* Open file again */
+ delay_file_name(path, MyLogicalRepWorker->subid, prepare_data.xid);
+ delayed_fd = BasicOpenFile(path, O_WRONLY | O_APPEND | PG_BINARY);
+ if (delayed_fd < 0)
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not open file \"%s\": %m",
+ path));
+
+ /* Write modified message to file */
+ handle_delayed_prepared(LOGICAL_REP_MSG_COMMIT_PREPARED,
+ prepare_data.commit_lsn,
+ prepare_data.end_lsn,
+ prepare_data.commit_time,
+ prepare_data.xid);
+ /* Flush it */
+ if (pg_fdatasync(delayed_fd) != 0)
+ {
+ int save_errno = errno;
+ close(delayed_fd);
+ errno = save_errno;
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not fsync file \"%s\": %m", path));
+ }
- store_flush_position(prepare_data.end_lsn, XactLastCommitEnd);
- in_remote_transaction = false;
+ /* Store flushed lsn */
+ last_flushed = prepare_data.end_lsn;
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(prepare_data.end_lsn);
+ /* clean up */
+ close(delayed_fd);
- clear_subscription_skip_lsn(prepare_data.end_lsn);
+ delayed_fd = -1;
+ delayed_xid = InvalidTransactionId;
+ in_delayed_transaction = false;
- pgstat_report_activity(STATE_IDLE, NULL);
- reset_apply_error_context_info();
+ ConstructCommitFromCommitPrepared(&commit_data, &prepare_data);
+
+ /* Cache the committed transaction */
+ cache_commit_data(&commit_data, prepare_data.xid);
+ }
+ else
+ {
+ set_apply_error_context_xact(prepare_data.xid, prepare_data.commit_lsn);
+
+ /* Compute GID for two_phase transactions. */
+ TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
+ gid, sizeof(gid));
+
+ /* There is no transaction when COMMIT PREPARED is called */
+ begin_replication_step();
+
+ /*
+ * Update origin state so we can restart streaming from correct position
+ * in case of crash.
+ */
+ replorigin_session_origin_lsn = prepare_data.end_lsn;
+ replorigin_session_origin_timestamp = prepare_data.commit_time;
+
+ FinishPreparedTransaction(gid, true);
+ end_replication_step();
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
+
+ store_flush_position(prepare_data.end_lsn, XactLastCommitEnd);
+ in_remote_transaction = false;
+
+ /* Process any tables that are being synchronized in parallel. */
+ process_syncing_tables(prepare_data.end_lsn);
+
+ clear_subscription_skip_lsn(prepare_data.end_lsn);
+ }
}
/*
@@ -1241,6 +2070,20 @@ apply_handle_rollback_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_rollback_prepared(s, &rollback_data);
+
+ /*
+ * If the delayed file exists, just remove it. The delayed transaction was
+ * never prepared, so it's OK not to call FinishPreparedTransaction().
+ */
+ if (is_given_transaction_delayed(MyLogicalRepWorker->subid, rollback_data.xid))
+ {
+ char path[MAXPGPATH];
+ delay_file_name(path, MyLogicalRepWorker->subid, rollback_data.xid);
+ durable_unlink(path, LOG);
+
+ return;
+ }
+
set_apply_error_context_xact(rollback_data.xid, rollback_data.rollback_end_lsn);
/* Compute GID for two_phase transactions. */
@@ -1316,16 +2159,68 @@ apply_handle_stream_prepare(StringInfo s)
switch (apply_action)
{
case TRANS_LEADER_APPLY:
+ /*
+ * If time-delayed replication is requested, start writing changes to
+ * a permanent file instead of the temporary one.
+ */
+ if (MySubscription->minapplydelay)
+ {
+ in_delayed_transaction = true;
+
+ create_delay_file(prepare_data.xid);
+
+ delayed_xid = prepare_data.xid;
+ }
/*
* The transaction has been serialized to file, so replay all the
* spooled operations.
+ * Note that if time-delayed replication is requested, changes are
+ * written into permanent file here.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset,
prepare_data.xid, prepare_data.prepare_lsn);
- /* Mark the transaction as prepared. */
- apply_handle_prepare_internal(&prepare_data);
+
+ /*
+ * If time-delayed replication is requested, construct a modified
+ * message and write it. This avoids writing the gid into the file; for
+ * details, see the comments atop ReadPreparedCommonRecord().
+ */
+ if (MySubscription->minapplydelay)
+ {
+ /* Write the modified message */
+ handle_delayed_prepared(LOGICAL_REP_MSG_PREPARE,
+ prepare_data.prepare_lsn,
+ prepare_data.end_lsn,
+ prepare_data.prepare_time,
+ prepare_data.xid);
+
+ /* Flush changes */
+ if (pg_fdatasync(delayed_fd) != 0)
+ {
+ int save_errno = errno;
+ close(delayed_fd);
+ errno = save_errno;
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not fsync delayed transaction file: %m"));
+ }
+
+ /* Store flushed lsn */
+ last_flushed = prepare_data.end_lsn;
+
+ close(delayed_fd);
+
+ delayed_fd = -1;
+ delayed_xid = InvalidTransactionId;
+ in_delayed_transaction = false;
+ }
+ else
+ {
+ /* Mark the transaction as prepared. */
+ apply_handle_prepare_internal(&prepare_data);
+ }
CommitTransactionCommand();
@@ -1404,8 +2299,11 @@ apply_handle_stream_prepare(StringInfo s)
pgstat_report_stat(false);
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(prepare_data.end_lsn);
+ if (list_length(DelayedTxnList) == 0)
+ {
+ /* Process any tables that are being synchronized in parallel. */
+ process_syncing_tables(prepare_data.end_lsn);
+ }
/*
* Similar to prepare case, the subskiplsn could be left in a case of
@@ -2174,19 +3072,43 @@ apply_handle_stream_commit(StringInfo s)
{
case TRANS_LEADER_APPLY:
+ /*
+ * If time-delayed replication is requested, start writing changes to
+ * a permanent file instead of the temporary one.
+ */
+ if (MySubscription->minapplydelay)
+ {
+ in_delayed_transaction = true;
+
+ create_delay_file(xid);
+
+ delayed_xid = xid;
+ }
+
/*
* The transaction has been serialized to file, so replay all the
* spooled operations.
+ * Note that if time-delayed replication is requested, changes are
+ * written into permanent file here.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset, xid,
- commit_data.commit_lsn);
+ commit_data.commit_lsn);
- apply_handle_commit_internal(&commit_data);
+
+ /* Flush changes if time-delayed replication is requested */
+ if (MySubscription->minapplydelay)
+ {
+ handle_delayed_transaction(LOGICAL_REP_MSG_COMMIT, &original_msg);
+ flush_delayed_changes(&commit_data);
+ }
+ else
+ apply_handle_commit_internal(&commit_data);
/* Unlink the files with serialized changes and subxact info. */
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
elog(DEBUG1, "finished processing the STREAM COMMIT command");
+
break;
case TRANS_LEADER_SEND_TO_PARALLEL:
@@ -2248,8 +3170,11 @@ apply_handle_stream_commit(StringInfo s)
break;
}
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(commit_data.end_lsn);
+ if (list_length(DelayedTxnList) == 0)
+ {
+ /* Process any tables that are being synchronized in parallel. */
+ process_syncing_tables(commit_data.end_lsn);
+ }
pgstat_report_activity(STATE_IDLE, NULL);
@@ -2324,7 +3249,8 @@ apply_handle_relation(StringInfo s)
{
LogicalRepRelation *rel;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_RELATION, s))
+ if (handle_streamed_transaction(LOGICAL_REP_MSG_RELATION, s) ||
+ handle_delayed_transaction(LOGICAL_REP_MSG_RELATION, s))
return;
rel = logicalrep_read_rel(s);
@@ -2347,7 +3273,8 @@ apply_handle_type(StringInfo s)
{
LogicalRepTyp typ;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_TYPE, s))
+ if (handle_streamed_transaction(LOGICAL_REP_MSG_TYPE, s) ||
+ handle_delayed_transaction(LOGICAL_REP_MSG_TYPE, s))
return;
logicalrep_read_typ(s, &typ);
@@ -2405,7 +3332,8 @@ apply_handle_insert(StringInfo s)
* streamed transactions.
*/
if (is_skipping_changes() ||
- handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
+ handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s) ||
+ handle_delayed_transaction(LOGICAL_REP_MSG_INSERT, s))
return;
begin_replication_step();
@@ -2544,7 +3472,8 @@ apply_handle_update(StringInfo s)
* streamed transactions.
*/
if (is_skipping_changes() ||
- handle_streamed_transaction(LOGICAL_REP_MSG_UPDATE, s))
+ handle_streamed_transaction(LOGICAL_REP_MSG_UPDATE, s) ||
+ handle_delayed_transaction(LOGICAL_REP_MSG_UPDATE, s))
return;
begin_replication_step();
@@ -2712,7 +3641,8 @@ apply_handle_delete(StringInfo s)
* streamed transactions.
*/
if (is_skipping_changes() ||
- handle_streamed_transaction(LOGICAL_REP_MSG_DELETE, s))
+ handle_streamed_transaction(LOGICAL_REP_MSG_DELETE, s) ||
+ handle_delayed_transaction(LOGICAL_REP_MSG_DELETE, s))
return;
begin_replication_step();
@@ -3129,7 +4059,8 @@ apply_handle_truncate(StringInfo s)
* streamed transactions.
*/
if (is_skipping_changes() ||
- handle_streamed_transaction(LOGICAL_REP_MSG_TRUNCATE, s))
+ handle_streamed_transaction(LOGICAL_REP_MSG_TRUNCATE, s) ||
+ handle_delayed_transaction(LOGICAL_REP_MSG_TRUNCATE, s))
return;
begin_replication_step();
@@ -3390,6 +4321,10 @@ get_flush_position(XLogRecPtr *write, XLogRecPtr *flush,
}
}
+ /* If changes were written into a file, report that LSN instead */
+ if (last_flushed > *flush)
+ *flush = last_flushed;
+
*have_pending_txes = !dlist_is_empty(&lsn_mapping);
}
@@ -3586,9 +4521,13 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
maybe_reread_subscription();
/* Process any table synchronization changes. */
- process_syncing_tables(last_received);
+ if (list_length(DelayedTxnList) == 0)
+ process_syncing_tables(last_received);
}
+ /* Check delayed transactions and apply them */
+ check_delayed_transaction();
+
/* Cleanup the memory. */
MemoryContextResetAndDeleteChildren(ApplyMessageContext);
MemoryContextSwitchTo(TopMemoryContext);
@@ -3730,8 +4669,14 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * If the logical replication subscription has unprocessed changes then do
+ * not inform the publisher that the received latest LSN is already
+ * applied and flushed, otherwise, the publisher will make a wrong
+ * assumption about the logical replication progress. Instead, just send a
+ * feedback message to avoid a replication timeout during the delay.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && (list_length(DelayedTxnList) == 0))
flushpos = writepos = recvpos;
if (writepos < last_writepos)
@@ -4534,6 +5479,9 @@ ApplyWorkerMain(Datum main_arg)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("subscription has no replication slot set")));
+ /* Check delayed files or initialize directory */
+ InitializeDelayedTxn();
+
/* Setup replication origin tracking. */
StartTransactionCommand();
ReplicationOriginNameForLogicalRep(MySubscription->oid, InvalidOid,
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index d62780a088..cec9fa145e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4608,6 +4608,7 @@ getSubscriptions(Archive *fout)
int i_subsynccommit;
int i_subpublications;
int i_subbinary;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4660,9 +4661,13 @@ getSubscriptions(Archive *fout)
LOGICALREP_TWOPHASE_STATE_DISABLED);
if (fout->remoteVersion >= 160000)
- appendPQExpBufferStr(query, " s.suborigin\n");
+ appendPQExpBufferStr(query,
+ " s.suborigin,\n"
+ " s.subminapplydelay\n");
else
- appendPQExpBuffer(query, " '%s' AS suborigin\n", LOGICALREP_ORIGIN_ANY);
+ appendPQExpBuffer(query, " '%s' AS suborigin,\n"
+ " 0 AS subminapplydelay\n",
+ LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
"FROM pg_subscription s\n"
@@ -4690,6 +4695,7 @@ getSubscriptions(Archive *fout)
i_subtwophasestate = PQfnumber(res, "subtwophasestate");
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4720,6 +4726,8 @@ getSubscriptions(Archive *fout)
subinfo[i].subdisableonerr =
pg_strdup(PQgetvalue(res, i, i_subdisableonerr));
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
+ subinfo[i].subminapplydelay =
+ atoi(PQgetvalue(res, i, i_subminapplydelay));
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4801,6 +4809,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subsynccommit, "off") != 0)
appendPQExpBuffer(query, ", synchronous_commit = %s", fmtId(subinfo->subsynccommit));
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '%d ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 283cd1a602..be21a50fd9 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -662,6 +662,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int subminapplydelay;
char *subpublications;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 99e28f607e..80f9f27aef 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6493,7 +6493,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6548,10 +6548,13 @@ describeSubscriptions(const char *pattern, bool verbose)
gettext_noop("Two-phase commit"),
gettext_noop("Disable on error"));
+ /* Origin and min_apply_delay are only supported in v16 and higher */
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
- ", suborigin AS \"%s\"\n",
- gettext_noop("Origin"));
+ ", suborigin AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
+ gettext_noop("Origin"),
+ gettext_noop("Min apply delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index e38a49e8bd..881f8288a7 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index b0f2a1705d..d1cfefc6d6 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -74,6 +74,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
+ int32 subminapplydelay; /* Replication apply delay (ms) */
+
bool subenabled; /* True if the subscription is enabled (the
* worker should be running) */
@@ -122,6 +124,7 @@ typedef struct Subscription
* skipped */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
+ int32 minapplydelay; /* Replication apply delay (ms) */
bool enabled; /* Indicates if the subscription is enabled */
bool binary; /* Indicates if the subscription wants data in
* binary format */
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 3f99b14394..cf8e727ee9 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -114,18 +114,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -143,10 +143,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -163,10 +163,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -175,10 +175,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -210,10 +210,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -247,19 +247,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -271,27 +271,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -306,10 +306,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -324,10 +324,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -363,10 +363,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -375,10 +375,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -388,10 +388,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -404,20 +404,57 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid value for parameter "min_apply_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_apply_delay" (0 .. 2147483647)
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+ERROR: min_apply_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | 86400000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot set parallel streaming mode for subscription with min_apply_delay
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ERROR: cannot set min_apply_delay for subscription in parallel streaming mode
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 7281f5fee2..7317b140f5 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -286,6 +286,30 @@ ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+\dRs+
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index 91aa068c95..01f2c4284d 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -515,6 +515,37 @@ $node_publisher->poll_query_until('postgres',
or die
"Timed out while waiting for apply to restart after renaming SUBSCRIPTION";
+# Test time-delayed logical replication
+#
+# If the subscription sets min_apply_delay parameter, the logical replication
+# worker will delay the transaction apply for min_apply_delay milliseconds. We
+# verify this by looking at the time difference between a) when tuples are
+# inserted on the publisher, and b) when those changes are replicated on the
+# subscriber. Even on slow machines, this strategy will give predictable behavior.
+
+# Set min_apply_delay parameter to 3 seconds
+my $delay = 3;
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_renamed SET (min_apply_delay = '${delay}s')");
+
+# Before doing the insertion, get the current timestamp that will be
+# used as a comparison base.
+my $publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_ins VALUES (generate_series(1101, 1120))");
+
+$node_subscriber->poll_query_until('postgres',
+ "SELECT count(*) = 1 FROM tab_ins WHERE a = 1120;"
+ )
+ or die
+ "failed to replicate changes";
+
+# This test is successful if and only if the LSN has been applied with at least
+# the configured apply delay.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for non-streaming transaction"
+);
+
# check all the cleanup
$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_renamed");
diff --git a/src/test/subscription/t/032_tmp.pl b/src/test/subscription/t/032_tmp.pl
new file mode 100644
index 0000000000..b298b5ed6e
--- /dev/null
+++ b/src/test/subscription/t/032_tmp.pl
@@ -0,0 +1,141 @@
+
+# Copyright (c) 2021-2023, PostgreSQL Global Development Group
+
+# Basic logical replication test
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Initialize publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf', "max_prepared_transactions = 10");
+$node_publisher->start;
+
+# Create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->append_conf('postgresql.conf', "log_min_messages = debug1");
+$node_subscriber->append_conf('postgresql.conf', "max_prepared_transactions = 10");
+$node_subscriber->start;
+
+# Create some preexisting content on publisher
+$node_publisher->safe_psql('postgres', "CREATE TABLE tab_ins (a int)");
+$node_subscriber->safe_psql('postgres', "CREATE TABLE tab_ins (a int)");
+
+# Setup logical replication
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres', "CREATE PUBLICATION tap_pub FOR ALL TABLES");
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr' PUBLICATION tap_pub WITH (two_phase = on)"
+);
+
+# Wait for initial table sync to finish
+$node_subscriber->wait_for_subscription_sync($node_publisher, 'tap_sub');
+
+# Test time-delayed logical replication
+#
+# If the subscription sets min_apply_delay parameter, the logical replication
+# worker will delay the transaction apply for min_apply_delay milliseconds. We
+# verify this by looking at the time difference between a) when tuples are
+# inserted on the publisher, and b) when those changes are replicated on the
+# subscriber. Even on slow machines, this strategy will give predictable behavior.
+
+# Set min_apply_delay parameter to 3 seconds
+my $delay = 3;
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = '${delay}s')");
+
+#
+# non-streaming
+#
+
+# Before doing the insertion, get the current timestamp that will be
+# used as a comparison base.
+my $publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_ins VALUES (1)");
+
+# The publisher waits for the replication to complete
+$node_publisher->wait_for_catchup('tap_sub');
+
+$node_subscriber->poll_query_until('postgres',
+ "SELECT count(*) = 1 FROM tab_ins;"
+ )
+ or die
+ "failed to replicate changes";
+
+# This test is successful if and only if the LSN has been applied with at least
+# the configured apply delay.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for non-streaming transaction"
+);
+
+#
+# non-streaming, 2PC
+#
+
+# my $publisher_insert_time = time();
+# $node_publisher->safe_psql('postgres',"
+# BEGIN;
+# INSERT INTO tab_ins VALUES (2)
+# PREPARE TRANSACTION 'test_non_stream'");
+
+# # The publisher waits for the replication to complete
+# $node_publisher->wait_for_catchup('tap_sub');
+
+# $result = $node_subscriber->safe_psql('postgres',
+# "SELECT count(*) FROM pg_prepared_xacts;");
+# is($result, qq(1), 'transaction is prepared on subscriber');
+
+# # check that 2PC gets committed on subscriber
+# $node_publisher->safe_psql('postgres',
+# "COMMIT PREPARED 'test_non_stream';");
+
+# $node_subscriber->poll_query_until('postgres',
+# "SELECT count(*) = 1 FROM tab_ins;"
+# )
+# or die
+# "failed to replicate changes";
+
+# # This test is successful if and only if the LSN has been applied with at least
+# # the configured apply delay.
+# ok( time() - $publisher_insert_time >= $delay,
+# "subscriber applies WAL only after replication delay for non-streaming transaction"
+# );
+
+
+#
+# streaming
+#
+
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_replication_mode = immediate');
+$node_publisher->reload;
+$node_publisher->safe_psql('postgres', q{SELECT 1});
+
+$publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_ins VALUES (generate_series(2, 30))");
+
+# The publisher waits for the replication to complete
+$node_publisher->wait_for_catchup('tap_sub');
+
+$node_subscriber->poll_query_until('postgres',
+ "SELECT count(*) = 30 FROM tab_ins;"
+ )
+ or die
+ "failed to replicate changes";
+
+# This test is successful if and only if the LSN has been applied with at least
+# the configured apply delay.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for streaming transaction"
+);
+
+$node_subscriber->stop('fast');
+$node_publisher->stop('fast');
+
+done_testing();
--
2.27.0
Dear hackers,
The previous patch could not be applied due to commits 482675, 1e10d4, and c3afe8.
Please see the attached rebased version. I have also done some code cleanup.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Attachments:
v3-0001-WIP-Time-delayed-logical-replication-by-serializi.patch (application/octet-stream)
From 6493edd4371eb5085eaed0ffdcb8e93bf0ffada2 Mon Sep 17 00:00:00 2001
From: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Date: Thu, 30 Mar 2023 05:25:53 +0000
Subject: [PATCH v3] (WIP) Time-delayed logical replication by serializing
changes
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_apply_delay'.
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction apply for min_apply_delay
milliseconds.
The delay is implemented by serializing changes into a file. The file is
created when the worker receives a BEGIN message. The worker writes the
received changes to the file and flushes it at COMMIT. On every iteration of
the main loop, the worker checks the commit time of each delayed transaction
and applies it from the file once min_apply_delay has elapsed. The commit time
is stored in memory when the transaction is committed, or when the worker
restarts.
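As a rough illustration of the per-transaction file layout, here is a minimal standalone sketch. The path format mirrors delay_file_name() and extract_info_from_delay_file() in the patch, but the helper names build_delay_path() and parse_delay_path() are hypothetical, and plain unsigned int stands in for Oid and TransactionId.

```c
#include <stdio.h>

/* Directory and suffix as defined in the patch (DELAYEDDIR, DELAYEDSUFFIX). */
#define DELAYED_DIR    "pg_logical/delayed_txns"
#define DELAYED_SUFFIX ".delayed_changes"

/*
 * Hypothetical helper: build "<dir>/<subid>-<xid><suffix>" into path.
 * Returns the value of snprintf(), i.e. the length that was needed.
 */
int
build_delay_path(char *path, size_t len, unsigned int subid, unsigned int xid)
{
	return snprintf(path, len, DELAYED_DIR "/%u-%u" DELAYED_SUFFIX, subid, xid);
}

/*
 * Hypothetical helper: recover subid and xid from such a path.
 * Returns 1 if both numbers were parsed, 0 otherwise.
 */
int
parse_delay_path(const char *path, unsigned int *subid, unsigned int *xid)
{
	return sscanf(path, DELAYED_DIR "/%u-%u", subid, xid) == 2;
}
```

Unlike the sscanf() call in the patch, this sketch checks the return value, so a file name that does not match the expected layout is reported instead of leaving the outputs unset.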
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
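The calculation can be sketched as follows. This is a simplified standalone illustration, not code from the patch: remaining_delay_ms() is a hypothetical name, and all times are plain millisecond counters rather than TimestampTz values.

```c
#include <stdint.h>

/*
 * Hypothetical helper: how long the apply worker still has to wait before
 * applying a transaction, in milliseconds.
 *
 * commit_time_ms     - commit timestamp taken from the publisher's WAL
 * now_ms             - current time on the subscriber
 * min_apply_delay_ms - configured minimum apply delay
 *
 * Returns 0 when the transaction is already due.
 */
int64_t
remaining_delay_ms(int64_t commit_time_ms, int64_t now_ms,
				   int32_t min_apply_delay_ms)
{
	int64_t		due_ms = commit_time_ms + (int64_t) min_apply_delay_ms;

	return (due_ms > now_ms) ? (due_ms - now_ms) : 0;
}
```

Because the two timestamps come from different machines, a subscriber clock that runs ahead of the publisher shortens the effective wait, and time already spent in decoding and transfer counts toward the delay.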
The combination of parallel streaming mode and min_apply_delay is not
allowed. This is because in parallel streaming mode, we start applying
the transaction stream as soon as the first change arrives without
knowing the transaction's prepare/commit time. This means we cannot
calculate the underlying network/decoding lag between publisher and
subscriber, and so always waiting for the full 'min_apply_delay' period
might include unnecessary delay.
The other possibility was to apply the delay at the end of the parallel
apply transaction but that would cause issues related to resource
bloat and locks being held for a long time.
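The rule itself is small. A minimal sketch, assuming a hypothetical StreamMode enum in place of the LOGICALREP_STREAM_* constants and a hypothetical helper name:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the LOGICALREP_STREAM_* values. */
typedef enum
{
	STREAM_OFF,
	STREAM_ON,
	STREAM_PARALLEL
} StreamMode;

/*
 * Hypothetical helper: a subscription definition is acceptable unless it
 * combines parallel streaming with a non-zero min_apply_delay.
 */
bool
subscription_options_compatible(StreamMode streaming,
								int32_t min_apply_delay_ms)
{
	return !(streaming == STREAM_PARALLEL && min_apply_delay_ms > 0);
}
```

This matches the regression tests in the patch: setting min_apply_delay = 0 together with streaming = parallel is accepted, while any positive delay with parallel streaming is rejected.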
Currently, the combination of the skip-transaction feature and min_apply_delay
does not work well.
Earlier versions were written by Euler Taveira, Takamichi Osumi, and Kuroda Hayato
Author: Kuroda Hayato, Takamichi Osumi
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/glossary.sgml | 15 +
doc/src/sgml/logical-replication.sgml | 7 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 47 +-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 6 +-
src/backend/commands/subscriptioncmds.c | 123 +-
src/backend/replication/logical/worker.c | 1232 +++++++++++++++++---
src/bin/pg_dump/pg_dump.c | 13 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 8 +-
src/bin/psql/tab-complete.c | 4 +-
src/include/catalog/pg_subscription.h | 3 +
src/test/regress/expected/subscription.out | 181 +--
src/test/regress/sql/subscription.sql | 25 +
src/test/subscription/t/001_rep_changes.pl | 31 +
17 files changed, 1487 insertions(+), 224 deletions(-)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 7c09ab3000..6f2e348351 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7891,6 +7891,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum delay, in milliseconds, for applying changes
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 29bf1873bd..204fe7f3ae 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1757,6 +1757,21 @@
</glossdef>
</glossentry>
+ <glossentry id="glossary-time-delayed-replication">
+ <glossterm>Time-delayed replication</glossterm>
+ <glossdef>
+ <para>
+ Replication setup that delays the application of changes by a specified
+ minimum time-delay period.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="guc-recovery-min-apply-delay"/> for physical replication
+ and <xref linkend="sql-createsubscription"/> for logical replication.
+ </para>
+ </glossdef>
+ </glossentry>
+
<glossentry id="glossary-toast">
<glossterm>TOAST</glossterm>
<glossdef>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index c65f4aabfd..0be4d652aa 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -257,6 +257,13 @@
option of <command>CREATE SUBSCRIPTION</command> for details.
</para>
+ <para>
+ A subscription can delay the application of changes by specifying the
+ <link linkend="sql-createsubscription-with-min-apply-delay"><literal>min_apply_delay</literal></link>
+ subscription parameter. See <xref linkend="sql-createsubscription"/> for
+ details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index a85e04e4d6..bf6c5fe7f0 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -225,8 +225,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<link linkend="sql-createsubscription-with-streaming"><literal>streaming</literal></link>,
<link linkend="sql-createsubscription-with-disable-on-error"><literal>disable_on_error</literal></link>,
<link linkend="sql-createsubscription-with-password-required"><literal>password_required</literal></link>,
- <link linkend="sql-createsubscription-with-run-as-owner"><literal>run_as_owner</literal></link>, and
- <link linkend="sql-createsubscription-with-origin"><literal>origin</literal></link>.
+ <link linkend="sql-createsubscription-with-run-as-owner"><literal>run_as_owner</literal></link>,
+ <link linkend="sql-createsubscription-with-origin"><literal>origin</literal></link>, and
+ <link linkend="sql-createsubscription-with-min-apply-delay"><literal>min_apply_delay</literal></link>.
Only a superuser can set <literal>password_required = false</literal>.
</para>
</listitem>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 68aa2b47f2..8f31a14b37 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -399,7 +399,47 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry id="sql-createsubscription-with-min-apply-delay">
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. This
+ parameter allows the user to delay the application of changes by a
+ given time period. This is done by first writing all the changes to a
+ file and then applying its contents once the delay has elapsed. If the value is
+ specified without units, it is taken as milliseconds. The default
+ is zero (no delay). See <xref linkend="config-setting-names-values"/>
+ for details on the available valid time units.
+ </para>
+ <para>
+ Any delay becomes effective only after all initial table
+ synchronization has finished and occurs before each transaction starts
+ to get applied on the subscriber. The delay is calculated as the
+ difference between the WAL timestamp as written on the publisher and
+ the current time on the subscriber. Any overhead of time spent in
+ logical decoding and in transferring the transaction may reduce the
+ actual wait time. If the overhead already exceeds the requested
+ <literal>min_apply_delay</literal> value, the changes are still written
+ to a file but are then applied immediately. If the system clocks on the
+ publisher and subscriber are not synchronized, changes may be applied
+ earlier than expected, but this is not a major issue because this
+ parameter is typically much larger than the time deviations between
+ servers.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication means there is a much longer time between
+ making a change on the publisher, and that change being committed
+ on the subscriber. This can impact the performance of synchronous
+ replication. See <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -472,6 +512,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_apply_delay</literal> parameter is not allowed when
+ streaming in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index d07f88ce28..56f8fdda10 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -66,6 +66,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->skiplsn = subform->subskiplsn;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
+ sub->minapplydelay = subform->subminapplydelay;
sub->enabled = subform->subenabled;
sub->binary = subform->subbinary;
sub->stream = subform->substream;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 6b098234f8..1b1f40de62 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1317,9 +1317,9 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subpasswordrequired, subrunasowner,
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subpasswordrequired, subrunasowner,
subslotname, subsynccommit, subpublications, suborigin)
ON pg_subscription TO public;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 3251d89ba8..aaa2065311 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -71,6 +71,7 @@
#define SUBOPT_RUN_AS_OWNER 0x00001000
#define SUBOPT_LSN 0x00002000
#define SUBOPT_ORIGIN 0x00004000
+#define SUBOPT_MIN_APPLY_DELAY 0x00008000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -97,6 +98,7 @@ typedef struct SubOpts
bool runasowner;
char *origin;
XLogRecPtr lsn;
+ int32 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -107,7 +109,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int32 defGetMinApplyDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -157,6 +159,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->runasowner = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -353,6 +357,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ opts->min_apply_delay = defGetMinApplyDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -433,6 +446,32 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed. This is because in parallel streaming mode, we start applying
+ * the transaction stream as soon as the first change arrives without
+ * knowing the transaction's prepare/commit time. This means we cannot
+ * calculate the underlying network/decoding lag between publisher and
+ * subscriber, and so always waiting for the full 'min_apply_delay' period
+ * might include unnecessary delay.
+ *
+ * The other possibility was to apply the delay at the end of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0 &&
+ opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+
+ /*
+ * translator: the first %s is a string of the form "parameter > 0"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
}
/*
@@ -591,7 +630,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
SUBOPT_DISABLE_ON_ERR | SUBOPT_PASSWORD_REQUIRED |
- SUBOPT_RUN_AS_OWNER | SUBOPT_ORIGIN);
+ SUBOPT_RUN_AS_OWNER | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -682,6 +722,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int32GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1130,7 +1171,8 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
SUBOPT_PASSWORD_REQUIRED |
- SUBOPT_RUN_AS_OWNER | SUBOPT_ORIGIN);
+ SUBOPT_RUN_AS_OWNER | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1174,6 +1216,19 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
+ !IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY)
+ && sub->minapplydelay > 0)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set parallel streaming mode for subscription with %s",
+ "min_apply_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1202,6 +1257,26 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.min_apply_delay > 0 &&
+ !IsSet(opts.specified_opts, SUBOPT_STREAMING)
+ && sub->stream == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set %s for subscription in parallel streaming mode",
+ "min_apply_delay"));
+
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int32GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2343,3 +2418,45 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+/*
+ * Extract the min_apply_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_apply_delay
+ * accepts the same parameter format as recovery_min_apply_delay.
+ */
+static int32
+defGetMinApplyDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse given string as parameter which has millisecond unit
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_apply_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check both the lower boundary for the valid min_apply_delay range and
+ * the upper boundary as the safeguard for some platforms where INT_MAX is
+ * wider than int32 respectively. Although parse_int() has confirmed that
+ * the result is less than or equal to INT_MAX, the value will be stored
+ * in a catalog column of int32.
+ */
+ if (result < 0 || result > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_apply_delay",
+ 0, PG_INT32_MAX)));
+
+ return result;
+}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 3d58910c14..7311ca75ab 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -153,6 +153,7 @@
#include "catalog/pg_subscription.h"
#include "catalog/pg_subscription_rel.h"
#include "catalog/pg_tablespace.h"
+#include "common/file_utils.h"
#include "commands/tablecmds.h"
#include "commands/tablespace.h"
#include "commands/trigger.h"
@@ -370,67 +371,742 @@ typedef struct ApplySubXactData
static ApplySubXactData subxact_data = {0, 0, InvalidTransactionId, NULL};
+/* XXX definitions for time-delayed logical replication */
+
+/* DELAYEDDIR stores files that contain changes of delayed transactions. */
+#define DELAYEDDIR "pg_logical/delayed_txns"
+#define DELAYEDSUFFIX ".delayed_changes"
+
+/* List entry to map xid and commit time */
+typedef struct DelayedTxnListEntry
+{
+ TransactionId xid;
+ LogicalRepCommitData commit_data;
+} DelayedTxnListEntry;
+
+/*
+ * An entry is appended when we receive a commit message and time-delayed
+ * logical replication is requested. The entry is deleted after its contents
+ * are applied.
+ */
+static List *DelayedTxnList = NIL;
+
+/* fields valid only when time-delayed logical replication is requested */
+static bool in_delayed_transaction = false;
+
+static TransactionId delayed_xid = InvalidTransactionId;
+
+/*
+ * Store flushed lsn for time-delayed logical replication. This is used when
+ * we send a feedback message to the publisher.
+ */
+static XLogRecPtr last_flushed = InvalidXLogRecPtr;
+
+/*
+ * FIXME: a global file descriptor may not be sufficient. Non-streaming
+ * transactions may arrive concurrently, in which case
+ * create_delay_file() for the second transaction will fail...
+ */
+static int delayed_fd = -1;
+
static inline void subxact_filename(char *path, Oid subid, TransactionId xid);
static inline void changes_filename(char *path, Oid subid, TransactionId xid);
/*
- * Information about subtransactions of a given toplevel transaction.
+ * Information about subtransactions of a given toplevel transaction.
+ */
+static void subxact_info_write(Oid subid, TransactionId xid);
+static void subxact_info_read(Oid subid, TransactionId xid);
+static void subxact_info_add(TransactionId xid);
+static inline void cleanup_subxact_info(void);
+
+/*
+ * Serialize and deserialize changes for a toplevel transaction.
+ */
+static void stream_open_file(Oid subid, TransactionId xid,
+ bool first_segment);
+static void stream_write_change(char action, StringInfo s);
+static void stream_open_and_write_change(TransactionId xid, char action, StringInfo s);
+static void stream_close_file(void);
+
+static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply);
+
+static void DisableSubscriptionAndExit(void);
+
+static void apply_handle_commit_internal(LogicalRepCommitData *commit_data);
+static void apply_handle_insert_internal(ApplyExecutionData *edata,
+ ResultRelInfo *relinfo,
+ TupleTableSlot *remoteslot);
+static void apply_handle_update_internal(ApplyExecutionData *edata,
+ ResultRelInfo *relinfo,
+ TupleTableSlot *remoteslot,
+ LogicalRepTupleData *newtup,
+ Oid localindexoid);
+static void apply_handle_delete_internal(ApplyExecutionData *edata,
+ ResultRelInfo *relinfo,
+ TupleTableSlot *remoteslot,
+ Oid localindexoid);
+static bool FindReplTupleInLocalRel(EState *estate, Relation localrel,
+ LogicalRepRelation *remoterel,
+ Oid localidxoid,
+ TupleTableSlot *remoteslot,
+ TupleTableSlot **localslot);
+static void apply_handle_tuple_routing(ApplyExecutionData *edata,
+ TupleTableSlot *remoteslot,
+ LogicalRepTupleData *newtup,
+ CmdType operation);
+
+/* Compute GID for two_phase transactions */
+static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int szgid);
+
+/* Functions for skipping changes */
+static void maybe_start_skipping_changes(XLogRecPtr finish_lsn);
+static void stop_skipping_changes(void);
+static void clear_subscription_skip_lsn(XLogRecPtr finish_lsn);
+
+/* Functions for apply error callback */
+static inline void set_apply_error_context_xact(TransactionId xid, XLogRecPtr lsn);
+static inline void reset_apply_error_context_info(void);
+
+static TransApplyAction get_transaction_apply_action(TransactionId xid,
+ ParallelApplyWorkerInfo **winfo);
+
+static void begin_replication_step(void);
+static void end_replication_step(void);
+
+/* Functions for time-delayed logical replication */
+static void cache_commit_data(LogicalRepCommitData *commit_data, TransactionId xid);
+static void flush_delayed_changes(LogicalRepCommitData *commit_data);
+static void delay_file_name(char *path, Oid subid, TransactionId xid);
+static bool is_given_transaction_delayed(Oid subid, TransactionId xid);
+static void create_delay_file(TransactionId xid);
+static bool handle_delayed_transaction(char action, StringInfo s);
+static void handle_delayed_prepared(char action, XLogRecPtr prepare_lsn,
+ XLogRecPtr end_lsn, TimestampTz prepare_time,
+ TransactionId xid);
+
+/*
+ * Cache commit_data into the list
+ */
+static void
+cache_commit_data(LogicalRepCommitData *commit_data, TransactionId xid)
+{
+ MemoryContext old;
+ DelayedTxnListEntry *entry;
+
+ old = MemoryContextSwitchTo(ApplyContext);
+
+ entry = palloc0(sizeof(DelayedTxnListEntry));
+
+ /* Construct an entry and append it */
+ entry->xid = xid;
+ memcpy(&entry->commit_data, commit_data, sizeof(LogicalRepCommitData));
+ DelayedTxnList = lappend(DelayedTxnList, entry);
+
+ MemoryContextSwitchTo(old);
+}
+
+/*
+ * Flush given changes and close the file. This will be called at the end of
+ * the transaction.
+ */
+static void
+flush_delayed_changes(LogicalRepCommitData *commit_data)
+{
+ Assert(delayed_fd > 0);
+ Assert(TransactionIdIsValid(delayed_xid));
+
+ /* Cache given commit_data into the list */
+ cache_commit_data(commit_data, delayed_xid);
+
+ /* Flush previously written changes */
+ if (pg_fdatasync(delayed_fd) != 0)
+ {
+ int save_errno = errno;
+ close(delayed_fd);
+ errno = save_errno;
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not fsync file: %m"));
+ }
+
+ /* Store flushed lsn */
+ last_flushed = commit_data->end_lsn;
+
+ /* Cleanup */
+ close(delayed_fd);
+ delayed_fd = -1;
+ delayed_xid = InvalidTransactionId;
+ in_delayed_transaction = false;
+}
+
+/*
+ * Build the delay file path from subid and xid
+ */
+static void
+delay_file_name(char *path, Oid subid, TransactionId xid)
+{
+ snprintf(path, MAXPGPATH, DELAYEDDIR "/%u-%u" DELAYEDSUFFIX, subid, xid);
+}
+
+/*
+ * Extract subid and xid from the given pathname
+ */
+static void
+extract_info_from_delay_file(char *path, Oid *subid, TransactionId *xid)
+{
+ sscanf(path, DELAYEDDIR "/%u-%u", subid, xid);
+}
+
+/*
+ * Check whether the given transaction is delayed. This is done by checking the
+ * delay file.
+ */
+static bool
+is_given_transaction_delayed(Oid subid, TransactionId xid)
+{
+ struct stat st;
+ char path[MAXPGPATH];
+
+ delay_file_name(path, subid, xid);
+
+ return stat(path, &st) == 0;
+}
+
+/*
+ * Apply the delayed transaction. This function opens the delay file, reads it,
+ * and has the apply worker apply the changes written there.
+ */
+static void
+apply_delayed_transaction(TransactionId xid, XLogRecPtr lsn)
+{
+ StringInfoData s2;
+ int nchanges;
+ char path[MAXPGPATH];
+ char *buffer = NULL;
+ MemoryContext oldcxt;
+ ResourceOwner oldowner;
+
+ /* Make sure we have an open transaction */
+ begin_replication_step();
+
+ /*
+ * Allocate file handle and memory required to process all the messages in
+ * TopTransactionContext to avoid them getting reset after each message is
+ * processed.
+ */
+ oldcxt = MemoryContextSwitchTo(TopTransactionContext);
+
+ /* Open the spool file for the committed transaction */
+ delay_file_name(path, MyLogicalRepWorker->subid, xid);
+ elog(DEBUG1, "replaying changes from file \"%s\"", path);
+
+ /*
+ * Make sure the file is owned by the toplevel transaction so that the
+ * file will not be accidentally closed when aborting a subtransaction.
+ */
+ oldowner = CurrentResourceOwner;
+ CurrentResourceOwner = TopTransactionResourceOwner;
+
+ /* Open the specified file */
+ delayed_fd = BasicOpenFile(path, O_RDONLY | PG_BINARY);
+
+ Assert(delayed_fd > 0);
+
+ CurrentResourceOwner = oldowner;
+
+ buffer = palloc(BLCKSZ);
+ initStringInfo(&s2);
+
+ MemoryContextSwitchTo(oldcxt);
+
+ remote_final_lsn = lsn;
+
+ /*
+ * Make sure the handle apply_dispatch methods are aware we're in a remote
+ * transaction.
+ */
+ in_remote_transaction = true;
+ pgstat_report_activity(STATE_RUNNING, NULL);
+
+ end_replication_step();
+
+ /*
+ * Read the entries one by one and pass them through the same logic as in
+ * apply_dispatch.
+ */
+ nchanges = 0;
+ while (true)
+ {
+ size_t nbytes;
+ int len;
+
+ CHECK_FOR_INTERRUPTS();
+
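+ /* read length of the on-disk record */
+ nbytes = read(delayed_fd, &len, sizeof(len));
+
+ /* have we reached end of the file? */
+ if (nbytes == 0)
+ break;
+
+ /* a failed or short read leaves len unusable */
+ if (nbytes != sizeof(len))
+ elog(ERROR, "could not read from delayed transaction's changes file \"%s\": %m",
+ path);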
+
+ /* do we have a correct length? */
+ if (len <= 0)
+ elog(ERROR, "incorrect length %d in delayed transaction's changes file \"%s\"",
+ len, path);
+
+ /* make sure we have sufficiently large buffer */
+ buffer = repalloc(buffer, len);
+
+ /* and finally read the data into the buffer */
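+ if (read(delayed_fd, buffer, len) != len)
+ elog(ERROR, "could not read from delayed transaction's changes file \"%s\": %m",
+ path);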
+
+ /* copy the buffer to the stringinfo and call apply_dispatch */
+ resetStringInfo(&s2);
+ appendBinaryStringInfo(&s2, buffer, len);
+
+ /* Ensure we are reading the data into our memory context. */
+ oldcxt = MemoryContextSwitchTo(ApplyMessageContext);
+
+ apply_dispatch(&s2);
+
+ MemoryContextReset(ApplyMessageContext);
+
+ MemoryContextSwitchTo(oldcxt);
+
+ nchanges++;
+
+ if (nchanges % 1000 == 0)
+ elog(DEBUG1, "replayed %d changes from file \"%s\"",
+ nchanges, path);
+ }
+
+ if (delayed_fd > 0)
+ {
+ close(delayed_fd);
+ delayed_fd = -1;
+ durable_unlink(path, LOG);
+ }
+
+ elog(DEBUG1, "replayed %d (all) changes from file \"%s\"",
+ nchanges, path);
+
+ return;
+}
+
+/*
+ * Create a file to which changes will be written.
+ */
+static void
+create_delay_file(TransactionId xid)
+{
+ char path[MAXPGPATH];
+ int fd;
+
+ Assert(TransactionIdIsValid(xid));
+ Assert(delayed_fd < 0);
+
+ delay_file_name(path, MyLogicalRepWorker->subid, xid);
+
+ elog(DEBUG1, "creating a file \"%s\" for time-delayed logical replication",
+ path);
+
+ fd = BasicOpenFile(path, O_WRONLY | O_CREAT | O_EXCL | O_APPEND | PG_BINARY);
+
+ if (fd < 0)
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not create file \"%s\": %m",
+ path));
+
+ delayed_fd = fd;
+}
+
+/*
+ * Create a directory that holds delayed files
+ */
+static void
+initialize_delay_directory(void)
+{
+ char path[MAXPGPATH];
+
+ snprintf(path, MAXPGPATH, DELAYEDDIR);
+ if (MakePGDirectory(path) < 0 && errno != EEXIST)
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not create directory \"%s\": %m",
+ path));
+
+ START_CRIT_SECTION();
+ fsync_fname(path, true);
+ END_CRIT_SECTION();
+}
+
+/*
+ * Read the delayed file and cache transaction information, e.g. committime.
+ */
+static bool
+ReadCommitRecord(int fd, TransactionId xid)
+{
+ int len = 0;
+ char action = 0;
+ char *buffer;
+ StringInfoData commit_message;
+ LogicalRepCommitData commit_data = {0};
+
+ /*
+ * If the transaction is not 2PC, we can assume that decoded commit record
+ * is at the end of the file. Therefore, read from the end.
+ */
+
+ /* FIXME: size of the messages is estimated from the document */
+#define COMMIT_MESSAGE_SIZE (sizeof(int) + sizeof(char) + sizeof(int8) + sizeof(LogicalRepCommitData))
+
+ /* seek to the start of the last message in the file */
+ lseek(fd, -(off_t) COMMIT_MESSAGE_SIZE, SEEK_END);
+
+ read(fd, &len, sizeof(int));
+ read(fd, &action, sizeof(char));
+
+ /*
+ * If the action is not 'C' or the length read is not valid, the
+ * transaction may be 2PC. So stop reading.
+ */
+ if (len != (COMMIT_MESSAGE_SIZE - sizeof(int)) ||
+ action != LOGICAL_REP_MSG_COMMIT)
+ return false;
+
+ /*
+ * If we reach here, the file appears to hold a valid, non-2PC transaction.
+ * Read the rest and cache it in memory so that delaying can start.
+ */
+
+ /* Prepare buffer and read from file */
+ buffer = palloc0(len - sizeof(char));
+ read(fd, buffer, len - sizeof(char));
+
+ /* Append to StringInfo in order to use same read function */
+ initStringInfo(&commit_message);
+ appendBinaryStringInfo(&commit_message, buffer, len - sizeof(char));
+
+ /* Finally start to read decoded commit record */
+ logicalrep_read_commit(&commit_message, &commit_data);
+
+ /* ..and cache into the list */
+ cache_commit_data(&commit_data, xid);
+
+ pfree(buffer);
+ pfree(commit_message.data);
+
+#undef COMMIT_MESSAGE_SIZE
+
+ return true;
+}
+
+/*
+ * Read the delayed file and cache information of transaction e.g. committime.
+ *
+ * Note that, unlike the other messages, the native PREPARE/COMMIT PREPARED
+ * message is not written into the file as-is. This is because the gid can have
+ * arbitrary length, so we could not compute the offset of these records from
+ * the end of the file. Instead, the important information (prepare/commit_lsn,
+ * end_lsn, prepare/commit_time, and the transaction id) is serialized.
+ * PREPARE and COMMIT PREPARED share one function because they have the same
+ * attributes.
+ */
+static bool
+ReadPreparedCommonRecord(int fd)
+{
+ int len = 0;
+ char action = 0;
+
+ /*
+ * If the transaction is 2PC, we can assume that the final record is either
+ * a decoded PREPARE or a decoded COMMIT PREPARED.
+ */
+
+ /*
+ * XXX: Modified message contains
+ * - length
+ * - message type
+ * - prepare/commit_lsn
+ * - end_lsn
+ * - prepare/commit_time
+ * - xid
+ */
+#define PREPARE_MESSAGE_SIZE (sizeof(int) + sizeof(char) + sizeof(XLogRecPtr) + sizeof(XLogRecPtr) + sizeof(TimestampTz) + sizeof(TransactionId))
+ lseek(fd, -(off_t) PREPARE_MESSAGE_SIZE, SEEK_END);
+ read(fd, &len, sizeof(int));
+ read(fd, &action, sizeof(char));
+
+ /*
+ * Do something if the record seems to be PREPARE or COMMIT PREPARED
+ */
+ if (len == (PREPARE_MESSAGE_SIZE - sizeof(int)) &&
+ action == LOGICAL_REP_MSG_PREPARE)
+ {
+ /* For PREPARE, do nothing */
+ return true;
+ }
+ else if (len == (PREPARE_MESSAGE_SIZE - sizeof(int)) &&
+ action == LOGICAL_REP_MSG_COMMIT_PREPARED)
+ {
+ /* For COMMIT PREPARED, cache into memory and start to delay */
+
+ LogicalRepCommitData commit_data = {0};
+ TransactionId xid = InvalidTransactionId;
+
+ /* Read the serialized fields in the order they were written */
+ read(fd, &commit_data.commit_lsn, sizeof(XLogRecPtr));
+ read(fd, &commit_data.end_lsn, sizeof(XLogRecPtr));
+ read(fd, &commit_data.committime, sizeof(TimestampTz));
+ read(fd, &xid, sizeof(TransactionId));
+
+ cache_commit_data(&commit_data, xid);
+
+ return true;
+ }
+ else
+ return false;
+}
+
+/*
+ * Transform information from commit_prepared style to commit style.
+ */
+static void
+ConstructCommitFromCommitPrepared(LogicalRepCommitData *commit,
+ LogicalRepCommitPreparedTxnData *prepare_data)
+{
+ commit->commit_lsn = prepare_data->commit_lsn;
+ commit->committime = prepare_data->commit_time;
+ commit->end_lsn = prepare_data->end_lsn;
+}
+
+/*
+ * Restore the delayed transaction from the given file.
+ */
+static void
+RestoreDelayedTxn(char *path)
+{
+ Oid subid = InvalidOid;
+ TransactionId xid = InvalidTransactionId;
+ int fd;
+
+ /* Check filename to extract subid and xid */
+ extract_info_from_delay_file(path, &subid, &xid);
+
+ /*
+ * If the file belongs to a subscription other than this apply worker's,
+ * the transaction is out of scope for us.
+ */
+ if (MyLogicalRepWorker->subid != subid)
+ return;
+
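+ /* OK, the transaction must be maintained by this worker. Open the file */
+ fd = BasicOpenFile(path, O_RDONLY | PG_BINARY);
+ if (fd < 0)
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not open file \"%s\": %m",
+ path));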
+
+ /* And restore from the end of the file */
+ if (ReadCommitRecord(fd, xid))
+ goto cleanup;
+
+ if (ReadPreparedCommonRecord(fd))
+ goto cleanup;
+
+ /*
+ * If we reach here, the file seems to be corrupted. Remove it so that the
+ * changes are received again.
+ */
+ close(fd);
+ durable_unlink(path, LOG);
+ return;
+
+cleanup:
+ close(fd);
+}
+
+/*
+ * Restore all the delayed transactions to memory.
+ */
+static void
+RestoreDelayedTxns(void)
+{
+ DIR *delayed_dir;
+ struct dirent *delayed_de;
+
+ /* Scan the files in the directory one by one */
+ delayed_dir = AllocateDir(DELAYEDDIR);
+ while ((delayed_de = ReadDir(delayed_dir, DELAYEDDIR)) != NULL)
+ {
+ char path[MAXPGPATH];
+ PGFileType de_type;
+
+ if (strcmp(delayed_de->d_name, ".") == 0 ||
+ strcmp(delayed_de->d_name, "..") == 0)
+ continue;
+
+ /* Check the filename and status */
+ snprintf(path, sizeof(path), DELAYEDDIR "/%s", delayed_de->d_name);
+ de_type = get_dirent_type(path, delayed_de, false, DEBUG1);
+
+ if (de_type != PGFILETYPE_REG)
+ continue;
+
+ /* Found a delayed transaction. Restore it. */
+ RestoreDelayedTxn(path);
+ }
+ FreeDir(delayed_dir);
+}
+
+/*
+ * Restore delayed transactions, or initialize the directory
+ */
+static void
+InitializeDelayedTxn(void)
+{
+ struct stat st;
+ char path[MAXPGPATH];
+
+ snprintf(path, MAXPGPATH, DELAYEDDIR);
+
+ /*
+ * If the given directory does not exist, create one. Otherwise start to
+ * restore.
+ */
+ if (stat(path, &st) != 0)
+ {
+ initialize_delay_directory();
+ return;
+ }
+
+ RestoreDelayedTxns();
+}
+
+/*
+ * Write the given message to the delay file. This is called for every message
+ * and returns true only when the change was written to the file.
+ *
+ * The serialized format is the same as for streamed changes: a length (not
+ * counting the length field itself), an action code (identifying the message
+ * type), and the message contents (without the subxact TransactionId value).
+ */
+static bool
+handle_delayed_transaction(char action, StringInfo s)
+{
+ int len;
+
+ /* Return if we are not delaying a transaction */
+ if (!in_delayed_transaction)
+ return false;
+
+ Assert(delayed_fd > 0);
+ Assert(TransactionIdIsValid(delayed_xid));
+
+ len = (s->len - s->cursor) + sizeof(char);
+
+ if (write(delayed_fd, &len, sizeof(len)) != sizeof(len))
+ abort();
+ if (write(delayed_fd, &action, sizeof(action)) != sizeof(action))
+ abort();
+
+ len = (s->len - s->cursor);
+
+ if (write(delayed_fd, &s->data[s->cursor], len) != len)
+ abort();
+
+ return true;
+}
+
+/*
+ * Write the information from a PREPARE/COMMIT PREPARED message to the delay
+ * file. This is called when we receive a PREPARE or COMMIT PREPARED message.
+ *
+ * For why this function is needed, see the comments atop
+ * ReadPreparedCommonRecord().
*/
-static void subxact_info_write(Oid subid, TransactionId xid);
-static void subxact_info_read(Oid subid, TransactionId xid);
-static void subxact_info_add(TransactionId xid);
-static inline void cleanup_subxact_info(void);
+static void
+handle_delayed_prepared(char action, XLogRecPtr prepare_lsn,
+ XLogRecPtr end_lsn, TimestampTz prepare_time,
+ TransactionId xid)
+{
+ int len;
+
+ Assert(delayed_fd > 0);
+
+#define MESSAGE_SIZE (sizeof(char) + sizeof(XLogRecPtr) + sizeof(XLogRecPtr) + sizeof(TimestampTz) + sizeof(TransactionId))
+ len = MESSAGE_SIZE;
+
+ /*
+ * XXX: Modified message contains
+ * - length
+ * - message type
+ * - prepare/commit_lsn
+ * - end_lsn
+ * - prepare/commit_time
+ * - xid
+ */
+ if (write(delayed_fd, &len, sizeof(len)) != sizeof(len))
+ abort();
+ if (write(delayed_fd, &action, sizeof(action)) != sizeof(action))
+ abort();
+ if (write(delayed_fd, &prepare_lsn, sizeof(prepare_lsn)) != sizeof(prepare_lsn))
+ abort();
+ if (write(delayed_fd, &end_lsn, sizeof(end_lsn)) != sizeof(end_lsn))
+ abort();
+ if (write(delayed_fd, &prepare_time, sizeof(prepare_time)) != sizeof(prepare_time))
+ abort();
+ if (write(delayed_fd, &xid, sizeof(xid)) != sizeof(xid))
+ abort();
+#undef MESSAGE_SIZE
+}
/*
- * Serialize and deserialize changes for a toplevel transaction.
+ * Check the delayed transactions and apply those whose delay has elapsed
*/
-static void stream_open_file(Oid subid, TransactionId xid,
- bool first_segment);
-static void stream_write_change(char action, StringInfo s);
-static void stream_open_and_write_change(TransactionId xid, char action, StringInfo s);
-static void stream_close_file(void);
+static void
+check_delayed_transaction(void)
+{
+ TimestampTz now;
+ ListCell *lc;
+ int n = 0;
-static void send_feedback(XLogRecPtr recvpos, bool force, bool requestReply);
+ if (in_streamed_transaction)
+ return;
-static void DisableSubscriptionAndExit(void);
+ now = GetCurrentTimestamp();
-static void apply_handle_commit_internal(LogicalRepCommitData *commit_data);
-static void apply_handle_insert_internal(ApplyExecutionData *edata,
- ResultRelInfo *relinfo,
- TupleTableSlot *remoteslot);
-static void apply_handle_update_internal(ApplyExecutionData *edata,
- ResultRelInfo *relinfo,
- TupleTableSlot *remoteslot,
- LogicalRepTupleData *newtup,
- Oid localindexoid);
-static void apply_handle_delete_internal(ApplyExecutionData *edata,
- ResultRelInfo *relinfo,
- TupleTableSlot *remoteslot,
- Oid localindexoid);
-static bool FindReplTupleInLocalRel(EState *estate, Relation localrel,
- LogicalRepRelation *remoterel,
- Oid localidxoid,
- TupleTableSlot *remoteslot,
- TupleTableSlot **localslot);
-static void apply_handle_tuple_routing(ApplyExecutionData *edata,
- TupleTableSlot *remoteslot,
- LogicalRepTupleData *newtup,
- CmdType operation);
+ /* Read the cache one-by-one */
+ foreach(lc, DelayedTxnList)
+ {
+ DelayedTxnListEntry *entry = (DelayedTxnListEntry *) lfirst(lc);
+ LogicalRepCommitData *commit_data = &entry->commit_data;
+ TimestampTz delayUntil;
+ long diffms;
-/* Compute GID for two_phase transactions */
-static void TwoPhaseTransactionGid(Oid subid, TransactionId xid, char *gid, int szgid);
+ delayUntil = TimestampTzPlusMilliseconds(commit_data->committime,
+ MySubscription->minapplydelay);
-/* Functions for skipping changes */
-static void maybe_start_skipping_changes(XLogRecPtr finish_lsn);
-static void stop_skipping_changes(void);
-static void clear_subscription_skip_lsn(XLogRecPtr finish_lsn);
+ diffms = TimestampDifferenceMilliseconds(now, delayUntil);
-/* Functions for apply error callback */
-static inline void set_apply_error_context_xact(TransactionId xid, XLogRecPtr lsn);
-static inline void reset_apply_error_context_info(void);
+ /*
+ * The cache is kept in commit order, so we do not have to check later
+ * entries once we find a transaction that is not yet due.
+ */
+ if (diffms > 0)
+ break;
-static TransApplyAction get_transaction_apply_action(TransactionId xid,
- ParallelApplyWorkerInfo **winfo);
+ elog(DEBUG1, "started to apply transaction %u", entry->xid);
+
+ apply_delayed_transaction(entry->xid, commit_data->end_lsn);
+ apply_handle_commit_internal(commit_data);
+
+ delayed_fd = -1;
+ delayed_xid = InvalidTransactionId;
+ in_delayed_transaction = false;
+ n++;
+ }
+ /* Discard applied entries */
+ DelayedTxnList = list_delete_first_n(DelayedTxnList, n);
+}
/*
* Return the name of the logical replication worker.
@@ -1019,13 +1695,28 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
- remote_final_lsn = begin_data.final_lsn;
+ /*
+ * Prepare to write changes into file if time-delayed replication is
+ * requested.
+ */
+ if (MySubscription->minapplydelay && AllTablesyncsReady())
+ {
+ in_delayed_transaction = true;
- maybe_start_skipping_changes(begin_data.final_lsn);
+ create_delay_file(begin_data.xid);
- in_remote_transaction = true;
+ delayed_xid = begin_data.xid;
+ }
+ else
+ {
+ remote_final_lsn = begin_data.final_lsn;
- pgstat_report_activity(STATE_RUNNING, NULL);
+ maybe_start_skipping_changes(begin_data.final_lsn);
+
+ in_remote_transaction = true;
+
+ pgstat_report_activity(STATE_RUNNING, NULL);
+ }
}
/*
@@ -1037,20 +1728,40 @@ static void
apply_handle_commit(StringInfo s)
{
LogicalRepCommitData commit_data;
+ /* Save the message before it is consumed. */
+ StringInfoData original_msg = *s;
+
+ /*
+ * If we are replaying a delayed transaction from its file, do nothing here.
+ * The actual COMMIT is done by the caller of apply_delayed_transaction().
+ */
+ if (delayed_fd > 0 && !in_delayed_transaction)
+ return;
logicalrep_read_commit(s, &commit_data);
- if (commit_data.commit_lsn != remote_final_lsn)
- ereport(ERROR,
- (errcode(ERRCODE_PROTOCOL_VIOLATION),
- errmsg_internal("incorrect commit LSN %X/%X in commit message (expected %X/%X)",
- LSN_FORMAT_ARGS(commit_data.commit_lsn),
- LSN_FORMAT_ARGS(remote_final_lsn))));
+ /* Spool the commit if we are delaying; otherwise apply it right away. */
+
+ if (in_delayed_transaction)
+ {
+ /* Write the commit message into the file and flush all messages */
+ handle_delayed_transaction(LOGICAL_REP_MSG_COMMIT, &original_msg);
+ flush_delayed_changes(&commit_data);
+ }
+ else
+ {
+ if (commit_data.commit_lsn != remote_final_lsn)
+ ereport(ERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg_internal("incorrect commit LSN %X/%X in commit message (expected %X/%X)",
+ LSN_FORMAT_ARGS(commit_data.commit_lsn),
+ LSN_FORMAT_ARGS(remote_final_lsn))));
- apply_handle_commit_internal(&commit_data);
+ apply_handle_commit_internal(&commit_data);
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(commit_data.end_lsn);
+ /* Process any tables that are being synchronized in parallel. */
+ process_syncing_tables(commit_data.end_lsn);
+ }
pgstat_report_activity(STATE_IDLE, NULL);
reset_apply_error_context_info();
@@ -1076,13 +1787,28 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
- remote_final_lsn = begin_data.prepare_lsn;
+ /*
+ * Prepare to write changes into file if time-delayed replication is
+ * requested.
+ */
+ if (MySubscription->minapplydelay && AllTablesyncsReady())
+ {
+ in_delayed_transaction = true;
+
+ create_delay_file(begin_data.xid);
+
+ delayed_xid = begin_data.xid;
+ }
+ else
+ {
+ remote_final_lsn = begin_data.prepare_lsn;
- maybe_start_skipping_changes(begin_data.prepare_lsn);
+ maybe_start_skipping_changes(begin_data.prepare_lsn);
- in_remote_transaction = true;
+ in_remote_transaction = true;
- pgstat_report_activity(STATE_RUNNING, NULL);
+ pgstat_report_activity(STATE_RUNNING, NULL);
+ }
}
/*
@@ -1124,57 +1850,115 @@ apply_handle_prepare_internal(LogicalRepPreparedTxnData *prepare_data)
/*
* Handle PREPARE message.
+ *
+ * When time-delayed logical replication is requested, we just write a message
+ * into the file and return, so no transaction is prepared on the subscriber.
+ * This keeps the apply worker from holding locks for a long time when
+ * min_apply_delay is large.
+ *
+ * Even when the transaction is later applied from the delayed file, it is not
+ * prepared. We just skip the PREPARE message.
*/
static void
apply_handle_prepare(StringInfo s)
{
LogicalRepPreparedTxnData prepare_data;
- logicalrep_read_prepare(s, &prepare_data);
+ /*
+ * If we are applying the delayed transaction, just consume the PREPARE
+ * message and return.
+ */
+ if (delayed_fd > 0 && !in_delayed_transaction)
+ {
+ /* Consume non-needed data */
+ (void) pq_getmsgint64(s);
+ (void) pq_getmsgint64(s);
+ (void) pq_getmsgint64(s);
+ (void) pq_getmsgint(s, 4);
- if (prepare_data.prepare_lsn != remote_final_lsn)
- ereport(ERROR,
- (errcode(ERRCODE_PROTOCOL_VIOLATION),
- errmsg_internal("incorrect prepare LSN %X/%X in prepare message (expected %X/%X)",
- LSN_FORMAT_ARGS(prepare_data.prepare_lsn),
- LSN_FORMAT_ARGS(remote_final_lsn))));
+ return;
+ }
/*
- * Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction or all changes are skipped. It
- * is done this way because at commit prepared time, we won't know whether
- * we have skipped preparing a transaction because of those reasons.
- *
- * XXX, We can optimize such that at commit prepared time, we first check
- * whether we have prepared the transaction or not but that doesn't seem
- * worthwhile because such cases shouldn't be common.
+ * If we are writing changes into the delayed file, construct a modified
+ * message and write it. This avoids writing the gid into the file. For
+ * details, see the comments atop ReadPreparedCommonRecord().
*/
- begin_replication_step();
+ if (in_delayed_transaction)
+ {
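+ /* Read the message first; prepare_data is used below */
+ logicalrep_read_prepare(s, &prepare_data);
+
+ /* Write the modified message */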
+ handle_delayed_prepared(LOGICAL_REP_MSG_PREPARE,
+ prepare_data.prepare_lsn,
+ prepare_data.end_lsn,
+ prepare_data.prepare_time,
+ prepare_data.xid);
+
+ /* Flush changes */
+ if (pg_fdatasync(delayed_fd) != 0)
+ {
+ int save_errno = errno;
+ close(delayed_fd);
+ errno = save_errno;
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not fsync file: %m"));
+ }
- apply_handle_prepare_internal(&prepare_data);
+ /* Store flushed lsn */
+ last_flushed = prepare_data.end_lsn;
- end_replication_step();
- CommitTransactionCommand();
- pgstat_report_stat(false);
+ /* Cleanup */
+ close(delayed_fd);
- store_flush_position(prepare_data.end_lsn, XactLastCommitEnd);
+ delayed_fd = -1;
+ delayed_xid = InvalidTransactionId;
+ in_delayed_transaction = false;
+ }
+ else
+ {
+ logicalrep_read_prepare(s, &prepare_data);
- in_remote_transaction = false;
+ if (prepare_data.prepare_lsn != remote_final_lsn)
+ ereport(ERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg_internal("incorrect prepare LSN %X/%X in prepare message (expected %X/%X)",
+ LSN_FORMAT_ARGS(prepare_data.prepare_lsn),
+ LSN_FORMAT_ARGS(remote_final_lsn))));
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(prepare_data.end_lsn);
+ /*
+ * Unlike commit, here, we always prepare the transaction even though no
+ * change has happened in this transaction or all changes are skipped. It
+ * is done this way because at commit prepared time, we won't know whether
+ * we have skipped preparing a transaction because of those reasons.
+ *
+ * XXX, We can optimize such that at commit prepared time, we first check
+ * whether we have prepared the transaction or not but that doesn't seem
+ * worthwhile because such cases shouldn't be common.
+ */
+ begin_replication_step();
- /*
- * Since we have already prepared the transaction, in a case where the
- * server crashes before clearing the subskiplsn, it will be left but the
- * transaction won't be resent. But that's okay because it's a rare case
- * and the subskiplsn will be cleared when finishing the next transaction.
- */
- stop_skipping_changes();
- clear_subscription_skip_lsn(prepare_data.prepare_lsn);
+ apply_handle_prepare_internal(&prepare_data);
- pgstat_report_activity(STATE_IDLE, NULL);
- reset_apply_error_context_info();
+ end_replication_step();
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
+
+ store_flush_position(prepare_data.end_lsn, XactLastCommitEnd);
+
+ in_remote_transaction = false;
+
+ /* Process any tables that are being synchronized in parallel. */
+ process_syncing_tables(prepare_data.end_lsn);
+
+ /*
+ * Since we have already prepared the transaction, in a case where the
+ * server crashes before clearing the subskiplsn, it will be left but the
+ * transaction won't be resent. But that's okay because it's a rare case
+ * and the subskiplsn will be cleared when finishing the next transaction.
+ */
+ stop_skipping_changes();
+ clear_subscription_skip_lsn(prepare_data.prepare_lsn);
+ }
}
/*
@@ -1192,38 +1976,95 @@ apply_handle_commit_prepared(StringInfo s)
LogicalRepCommitPreparedTxnData prepare_data;
char gid[GIDSIZE];
- logicalrep_read_commit_prepared(s, &prepare_data);
- set_apply_error_context_xact(prepare_data.xid, prepare_data.commit_lsn);
-
- /* Compute GID for two_phase transactions. */
- TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
- gid, sizeof(gid));
+ if (delayed_fd > 0 && !in_delayed_transaction)
+ return;
- /* There is no transaction when COMMIT PREPARED is called */
- begin_replication_step();
+ logicalrep_read_commit_prepared(s, &prepare_data);
/*
- * Update origin state so we can restart streaming from correct position
- * in case of crash.
+ * Check whether a delayed file exists. If one exists and we have not opened
+ * it yet, time-delayed logical replication has been requested, so we write
+ * the modified message. Otherwise, the transaction is committed normally.
*/
- replorigin_session_origin_lsn = prepare_data.end_lsn;
- replorigin_session_origin_timestamp = prepare_data.commit_time;
+ if (delayed_fd < 0 &&
+ is_given_transaction_delayed(MyLogicalRepWorker->subid, prepare_data.xid))
+ {
+ char path[MAXPGPATH];
+ LogicalRepCommitData commit_data = {0};
- FinishPreparedTransaction(gid, true);
- end_replication_step();
- CommitTransactionCommand();
- pgstat_report_stat(false);
+ /* Open file again */
+ delay_file_name(path, MyLogicalRepWorker->subid, prepare_data.xid);
+ delayed_fd = BasicOpenFile(path, O_WRONLY | O_APPEND | PG_BINARY);
+ if (delayed_fd < 0)
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not open file \"%s\": %m",
+ path));
+
+ /* Write modified message to file */
+ handle_delayed_prepared(LOGICAL_REP_MSG_COMMIT_PREPARED,
+ prepare_data.commit_lsn,
+ prepare_data.end_lsn,
+ prepare_data.commit_time,
+ prepare_data.xid);
+ /* Flush it */
+ if (pg_fdatasync(delayed_fd) != 0)
+ {
+ int save_errno = errno;
+ close(delayed_fd);
+ errno = save_errno;
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not fsync file: %m"));
+ }
- store_flush_position(prepare_data.end_lsn, XactLastCommitEnd);
- in_remote_transaction = false;
+ /* Store flushed lsn */
+ last_flushed = prepare_data.end_lsn;
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(prepare_data.end_lsn);
+ /* clean up */
+ close(delayed_fd);
- clear_subscription_skip_lsn(prepare_data.end_lsn);
+ delayed_fd = -1;
+ delayed_xid = InvalidTransactionId;
+ in_delayed_transaction = false;
- pgstat_report_activity(STATE_IDLE, NULL);
- reset_apply_error_context_info();
+ ConstructCommitFromCommitPrepared(&commit_data, &prepare_data);
+
+ /* Cache the committed transaction */
+ cache_commit_data(&commit_data, prepare_data.xid);
+ }
+ else
+ {
+ set_apply_error_context_xact(prepare_data.xid, prepare_data.commit_lsn);
+
+ /* Compute GID for two_phase transactions. */
+ TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
+ gid, sizeof(gid));
+
+ /* There is no transaction when COMMIT PREPARED is called */
+ begin_replication_step();
+
+ /*
+ * Update origin state so we can restart streaming from correct position
+ * in case of crash.
+ */
+ replorigin_session_origin_lsn = prepare_data.end_lsn;
+ replorigin_session_origin_timestamp = prepare_data.commit_time;
+
+ FinishPreparedTransaction(gid, true);
+ end_replication_step();
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
+
+ store_flush_position(prepare_data.end_lsn, XactLastCommitEnd);
+ in_remote_transaction = false;
+
+ /* Process any tables that are being synchronized in parallel. */
+ process_syncing_tables(prepare_data.end_lsn);
+
+ clear_subscription_skip_lsn(prepare_data.end_lsn);
+ }
}
/*
@@ -1242,6 +2083,20 @@ apply_handle_rollback_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_rollback_prepared(s, &rollback_data);
+
+ /*
+ * If the delayed file exists, just remove it. A delayed transaction has
+ * never been prepared, so it's OK not to call FinishPreparedTransaction().
+ */
+ if (is_given_transaction_delayed(MyLogicalRepWorker->subid, rollback_data.xid))
+ {
+ char path[MAXPGPATH];
+ delay_file_name(path, MyLogicalRepWorker->subid, rollback_data.xid);
+ durable_unlink(path, LOG);
+
+ return;
+ }
+
set_apply_error_context_xact(rollback_data.xid, rollback_data.rollback_end_lsn);
/* Compute GID for two_phase transactions. */
@@ -1317,16 +2172,68 @@ apply_handle_stream_prepare(StringInfo s)
switch (apply_action)
{
case TRANS_LEADER_APPLY:
+ /*
+ * If time-delayed replication is requested, start to write changes
+ * to a permanent file instead of a temporary one.
+ */
+ if (MySubscription->minapplydelay)
+ {
+ in_delayed_transaction = true;
+
+ create_delay_file(prepare_data.xid);
+
+ delayed_xid = prepare_data.xid;
+ }
/*
* The transaction has been serialized to file, so replay all the
* spooled operations.
+ * Note that if time-delayed replication is requested, changes are
+ * written into permanent file here.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset,
prepare_data.xid, prepare_data.prepare_lsn);
- /* Mark the transaction as prepared. */
- apply_handle_prepare_internal(&prepare_data);
+
+ /*
+ * If time-delayed replication is requested, construct a modified
+ * message and write it. This avoids writing the gid into the file.
+ * For details, see the comments atop ReadPreparedCommonRecord().
+ */
+ if (MySubscription->minapplydelay)
+ {
+ /* Write the modified message */
+ handle_delayed_prepared(LOGICAL_REP_MSG_PREPARE,
+ prepare_data.prepare_lsn,
+ prepare_data.end_lsn,
+ prepare_data.prepare_time,
+ prepare_data.xid);
+
+ /* Flush changes */
+ if (pg_fdatasync(delayed_fd) != 0)
+ {
+ int save_errno = errno;
+ close(delayed_fd);
+ errno = save_errno;
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not fsync file: %m"));
+ }
+
+ /* Store flushed lsn */
+ last_flushed = prepare_data.end_lsn;
+
+ close(delayed_fd);
+
+ delayed_fd = -1;
+ delayed_xid = InvalidTransactionId;
+ in_delayed_transaction = false;
+ }
+ else
+ {
+ /* Mark the transaction as prepared. */
+ apply_handle_prepare_internal(&prepare_data);
+ }
CommitTransactionCommand();
@@ -1405,8 +2312,11 @@ apply_handle_stream_prepare(StringInfo s)
pgstat_report_stat(false);
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(prepare_data.end_lsn);
+ if (list_length(DelayedTxnList) == 0)
+ {
+ /* Process any tables that are being synchronized in parallel. */
+ process_syncing_tables(prepare_data.end_lsn);
+ }
/*
* Similar to prepare case, the subskiplsn could be left in a case of
@@ -2175,19 +3085,43 @@ apply_handle_stream_commit(StringInfo s)
{
case TRANS_LEADER_APPLY:
+ /*
+ * If time-delayed replication is requested, start writing changes
+ * to a permanent file instead of a temporary one.
+ */
+ if (MySubscription->minapplydelay)
+ {
+ in_delayed_transaction = true;
+
+ create_delay_file(xid);
+
+ delayed_xid = xid;
+ }
+
/*
* The transaction has been serialized to file, so replay all the
* spooled operations.
+ * Note that if time-delayed replication is requested, changes are
+ * written into a permanent file here.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset, xid,
- commit_data.commit_lsn);
+ commit_data.commit_lsn);
- apply_handle_commit_internal(&commit_data);
+
+ /* Flush changes if time-delayed replication is requested */
+ if (MySubscription->minapplydelay)
+ {
+ handle_delayed_transaction(LOGICAL_REP_MSG_COMMIT, &original_msg);
+ flush_delayed_changes(&commit_data);
+ }
+ else
+ apply_handle_commit_internal(&commit_data);
/* Unlink the files with serialized changes and subxact info. */
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
elog(DEBUG1, "finished processing the STREAM COMMIT command");
+
break;
case TRANS_LEADER_SEND_TO_PARALLEL:
@@ -2249,8 +3183,11 @@ apply_handle_stream_commit(StringInfo s)
break;
}
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(commit_data.end_lsn);
+ if (list_length(DelayedTxnList) == 0)
+ {
+ /* Process any tables that are being synchronized in parallel. */
+ process_syncing_tables(commit_data.end_lsn);
+ }
pgstat_report_activity(STATE_IDLE, NULL);
@@ -2325,7 +3262,8 @@ apply_handle_relation(StringInfo s)
{
LogicalRepRelation *rel;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_RELATION, s))
+ if (handle_streamed_transaction(LOGICAL_REP_MSG_RELATION, s) ||
+ handle_delayed_transaction(LOGICAL_REP_MSG_RELATION, s))
return;
rel = logicalrep_read_rel(s);
@@ -2348,7 +3286,8 @@ apply_handle_type(StringInfo s)
{
LogicalRepTyp typ;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_TYPE, s))
+ if (handle_streamed_transaction(LOGICAL_REP_MSG_TYPE, s) ||
+ handle_delayed_transaction(LOGICAL_REP_MSG_TYPE, s))
return;
logicalrep_read_typ(s, &typ);
@@ -2408,7 +3347,8 @@ apply_handle_insert(StringInfo s)
* streamed transactions.
*/
if (is_skipping_changes() ||
- handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
+ handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s) ||
+ handle_delayed_transaction(LOGICAL_REP_MSG_INSERT, s))
return;
begin_replication_step();
@@ -2560,7 +3500,8 @@ apply_handle_update(StringInfo s)
* streamed transactions.
*/
if (is_skipping_changes() ||
- handle_streamed_transaction(LOGICAL_REP_MSG_UPDATE, s))
+ handle_streamed_transaction(LOGICAL_REP_MSG_UPDATE, s) ||
+ handle_delayed_transaction(LOGICAL_REP_MSG_UPDATE, s))
return;
begin_replication_step();
@@ -2741,7 +3682,8 @@ apply_handle_delete(StringInfo s)
* streamed transactions.
*/
if (is_skipping_changes() ||
- handle_streamed_transaction(LOGICAL_REP_MSG_DELETE, s))
+ handle_streamed_transaction(LOGICAL_REP_MSG_DELETE, s) ||
+ handle_delayed_transaction(LOGICAL_REP_MSG_DELETE, s))
return;
begin_replication_step();
@@ -3169,7 +4111,8 @@ apply_handle_truncate(StringInfo s)
* streamed transactions.
*/
if (is_skipping_changes() ||
- handle_streamed_transaction(LOGICAL_REP_MSG_TRUNCATE, s))
+ handle_streamed_transaction(LOGICAL_REP_MSG_TRUNCATE, s) ||
+ handle_delayed_transaction(LOGICAL_REP_MSG_TRUNCATE, s))
return;
begin_replication_step();
@@ -3436,6 +4379,10 @@ get_flush_position(XLogRecPtr *write, XLogRecPtr *flush,
}
}
+ /* If changes were written to a file, report that LSN instead */
+ if (last_flushed > *flush)
+ *flush = last_flushed;
+
*have_pending_txes = !dlist_is_empty(&lsn_mapping);
}
@@ -3632,9 +4579,13 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
maybe_reread_subscription();
/* Process any table synchronization changes. */
- process_syncing_tables(last_received);
+ if (list_length(DelayedTxnList) == 0)
+ process_syncing_tables(last_received);
}
+ /* Check delayed transactions and apply them */
+ check_delayed_transaction();
+
/* Cleanup the memory. */
MemoryContextResetAndDeleteChildren(ApplyMessageContext);
MemoryContextSwitchTo(TopMemoryContext);
@@ -3776,8 +4727,14 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * If the logical replication subscription has unprocessed changes, do
+ * not inform the publisher that the latest received LSN is already
+ * applied and flushed; otherwise, the publisher will make a wrong
+ * assumption about the logical replication progress. Instead, just send a
+ * feedback message to avoid a replication timeout during the delay.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && (list_length(DelayedTxnList) == 0))
flushpos = writepos = recvpos;
if (writepos < last_writepos)
@@ -4581,6 +5538,9 @@ ApplyWorkerMain(Datum main_arg)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("subscription has no replication slot set")));
+ /* Check delayed files or initialize directory */
+ InitializeDelayedTxn();
+
/* Setup replication origin tracking. */
StartTransactionCommand();
ReplicationOriginNameForLogicalRep(MySubscription->oid, InvalidOid,
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 6abbcff683..2916965c56 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4609,6 +4609,7 @@ getSubscriptions(Archive *fout)
int i_subpublications;
int i_subbinary;
int i_subpasswordrequired;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4663,11 +4664,13 @@ getSubscriptions(Archive *fout)
if (fout->remoteVersion >= 160000)
appendPQExpBufferStr(query,
" s.suborigin,\n"
- " s.subpasswordrequired\n");
+ " s.subpasswordrequired,\n"
+ " s.subminapplydelay\n");
else
appendPQExpBuffer(query,
" '%s' AS suborigin,\n"
- " 't' AS subpasswordrequired\n",
+ " 't' AS subpasswordrequired,\n"
+ " 0 AS subminapplydelay\n",
LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
@@ -4697,6 +4700,7 @@ getSubscriptions(Archive *fout)
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
i_subpasswordrequired = PQfnumber(res, "subpasswordrequired");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4729,6 +4733,8 @@ getSubscriptions(Archive *fout)
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
subinfo[i].subpasswordrequired =
pg_strdup(PQgetvalue(res, i, i_subpasswordrequired));
+ subinfo[i].subminapplydelay =
+ atoi(PQgetvalue(res, i, i_subminapplydelay));
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4813,6 +4819,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subpasswordrequired, "t") != 0)
appendPQExpBuffer(query, ", password_required = false");
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '%d ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index ed6ce41ad7..6bf889a00a 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -662,6 +662,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int subminapplydelay;
char *subpublications;
char *subpasswordrequired;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 83a37ee601..78f2426d99 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6493,7 +6493,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6551,9 +6551,11 @@ describeSubscriptions(const char *pattern, bool verbose)
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
", suborigin AS \"%s\"\n"
- ", subrunasowner AS \"%s\"\n",
+ ", subrunasowner AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
gettext_noop("Origin"),
- gettext_noop("Run as Owner?"));
+ gettext_noop("Run as Owner?"),
+ gettext_noop("Min apply delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index e38a49e8bd..881f8288a7 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,7 +1925,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin", "slot_name",
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
@@ -3268,7 +3268,7 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin", "slot_name",
+ "disable_on_error", "enabled", "min_apply_delay", "origin", "slot_name",
"streaming", "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 91d729d62d..649e789240 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -74,6 +74,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
+ int32 subminapplydelay; /* Replication apply delay (ms) */
+
bool subenabled; /* True if the subscription is enabled (the
* worker should be running) */
@@ -127,6 +129,7 @@ typedef struct Subscription
* skipped */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
+ int32 minapplydelay; /* Replication apply delay (ms) */
bool enabled; /* Indicates if the subscription is enabled */
bool binary; /* Indicates if the subscription wants data in
* binary format */
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 9c52890f1d..31261bcd9c 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -115,18 +115,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -144,10 +144,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -166,10 +166,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | f | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | f | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -178,10 +178,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | f | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | f | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -213,10 +213,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | f | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | f | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -245,19 +245,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -269,27 +269,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -304,10 +304,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -322,10 +322,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -361,10 +361,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -373,10 +373,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -386,10 +386,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -402,18 +402,18 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -464,6 +464,43 @@ ERROR: permission denied for database regression
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid value for parameter "min_apply_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_apply_delay" (0 .. 2147483647)
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+ERROR: min_apply_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+--------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | hayato | f | {testpub} | f | off | d | f | any | f | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+--------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | hayato | f | {testpub} | f | off | d | f | any | f | 86400000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot set parallel streaming mode for subscription with min_apply_delay
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ERROR: cannot set min_apply_delay for subscription in parallel streaming mode
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
DROP ROLE regress_subscription_user3;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index cc53458d91..d5ad91a96f 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -333,6 +333,31 @@ ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
+
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+\dRs+
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
DROP ROLE regress_subscription_user3;
diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index 91aa068c95..01f2c4284d 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -515,6 +515,37 @@ $node_publisher->poll_query_until('postgres',
or die
"Timed out while waiting for apply to restart after renaming SUBSCRIPTION";
+# Test time-delayed logical replication
+#
+# If the subscription sets min_apply_delay parameter, the logical replication
+# worker will delay the transaction apply for min_apply_delay milliseconds. We
+# verify this by looking at the time difference between a) when tuples are
+# inserted on the publisher, and b) when those changes are replicated on the
+# subscriber. Even on slow machines, this strategy will give predictable behavior.
+
+# Set min_apply_delay parameter to 3 seconds
+my $delay = 3;
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_renamed SET (min_apply_delay = '${delay}s')");
+
+# Before doing the insertion, get the current timestamp that will be
+# used as a comparison base.
+my $publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_ins VALUES (generate_series(1101, 1120))");
+
+$node_subscriber->poll_query_until('postgres',
+ "SELECT count(*) = 1 FROM tab_ins WHERE a = 1120;"
+ )
+ or die
+ "failed to replicate changes";
+
+# This test is successful if and only if the LSN has been applied with at least
+# the configured apply delay.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for non-streaming transaction"
+);
+
# check all the cleanup
$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_renamed");
--
2.27.0
Dear hackers,
I have rebased and updated the PoC. Please see attached.
In [1], I wrote:
### Restore from files
To check the elapsed time since the commit, the commit_time of every delayed
transaction must be kept in memory. Normally it can be stored when the worker
handles the COMMIT message, but restarts need special treatment.
When an apply worker receives a COMMIT/PREPARE/COMMIT PREPARED message, it writes
the message, flushes it, and caches the commit_time. When the worker restarts, it
opens the files, checks the final message (by seeking a few bytes back from the
end of the file), and then caches the commit_time written there.
But I have come to think that this design is terrible. Therefore, I have
implemented a new approach that uses the filename to restore the state of a
committed transaction. The following is a summary.
When a worker receives a BEGIN message, it creates a new file and writes its
changes to it. The filename contains the following dash-separated components:
1. Subscription OID
2. XID of the delayed transaction on the publisher
3. Status of the delaying transaction
4. Upper 32 bits of the commit_lsn
5. Lower 32 bits of the commit_lsn
6. Upper 32 bits of the end_lsn
7. Lower 32 bits of the end_lsn
8. Commit time
Initially, components 4-8 of the new filename are 0 because the worker does not
yet know their values. When it receives a COMMIT message, the changes are
flushed to disk, and the file is renamed so that its name carries the actual values.
While restarting, the worker reads the directory containing the files and caches
their commit time into memory from the filenames. Files do not need to be opened
at this point. Therefore, PREPARE/COMMIT PREPARED messages are no longer written
into the file. The status of transactions can be distinguished from the filename.
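As a rough illustration of the naming scheme above, here is a standalone sketch that builds and parses such a filename. It is not code from the patch: the struct DelayedFileName and the functions delayed_name_format() and delayed_name_parse() are hypothetical names, and plain C99 types stand in for Oid, TransactionId, XLogRecPtr and TimestampTz. Only the dash-separated layout follows the description above.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical container for the eight filename components. */
typedef struct DelayedFileName
{
	uint32_t	subid;			/* 1) subscription OID */
	uint32_t	xid;			/* 2) XID on the publisher */
	char		status;			/* 3) 'c' committed, 'p' prepared, 'u' unknown */
	uint64_t	commit_lsn;		/* 4-5) split into upper/lower 32 bits */
	uint64_t	end_lsn;		/* 6-7) split into upper/lower 32 bits */
	int64_t		commit_time;	/* 8) commit timestamp */
} DelayedFileName;

/* Build the filename; returns what snprintf() returns. */
int
delayed_name_format(char *buf, size_t len, const DelayedFileName *f)
{
	assert(buf != NULL && f != NULL);
	return snprintf(buf, len, "delayed-%x-%x-%c-%X-%X-%X-%X-%" PRId64,
					(unsigned int) f->subid,
					(unsigned int) f->xid,
					f->status,
					(unsigned int) (f->commit_lsn >> 32),
					(unsigned int) (f->commit_lsn & 0xFFFFFFFFu),
					(unsigned int) (f->end_lsn >> 32),
					(unsigned int) (f->end_lsn & 0xFFFFFFFFu),
					f->commit_time);
}

/*
 * Recover all components from the filename alone, without opening the
 * file. Returns 1 on success, 0 if the name does not match the layout.
 */
int
delayed_name_parse(const char *name, DelayedFileName *f)
{
	unsigned int subid, xid, chi, clo, ehi, elo;
	char		status;
	int64_t		commit_time;

	assert(name != NULL && f != NULL);
	if (sscanf(name, "delayed-%x-%x-%c-%X-%X-%X-%X-%" SCNd64,
			   &subid, &xid, &status, &chi, &clo, &ehi, &elo,
			   &commit_time) != 8)
		return 0;

	f->subid = subid;
	f->xid = xid;
	f->status = status;
	f->commit_lsn = ((uint64_t) chi << 32) | clo;
	f->end_lsn = ((uint64_t) ehi << 32) | elo;
	f->commit_time = commit_time;
	return 1;
}
```

Under these assumptions, a restarting worker would only need to list the directory and call something like delayed_name_parse() on each entry to rebuild its in-memory commit times; a freshly created file would carry status 'u' and zeros in components 4-8 until it is renamed at COMMIT.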
Another notable change is the addition of a replication option. If the
min_apply_delay is greater than 0, a new parameter called "require_schema" is
passed via the START_REPLICATION command. When "require_schema" is enabled, the publisher
sends its schema (RELATION and TYPE messages) every time it sends decoded DMLs.
This is necessary because delayed transactions may be applied after the subscriber
is restarted, and the LogicalRepRelMap hash is destroyed at that time. If the
RELATION message is not written into the delayed file, and the worker restarts
just before applying the transaction, it will fail to open the local relation
and display an error message: "ERROR: no relation map entry".
Some small bugs were also fixed.
[1]: /messages/by-id/TYAPR01MB5866D871F60DDFD8FAA2CDE4F5BD9@TYAPR01MB5866.jpnprd01.prod.outlook.com
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Attachments:
v4-0001-WIP-Time-delayed-logical-replication-by-serializi.patch (application/octet-stream)
From b5691915daf748daa5b8e57eea2cb344e684cbca Mon Sep 17 00:00:00 2001
From: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Date: Wed, 19 Apr 2023 09:26:12 +0000
Subject: [PATCH v4] (WIP) Time-delayed logical replication by serializing
changes
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_apply_delay'.
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction apply for min_apply_delay
milliseconds.
The delay is implemented by serializing changes into a file. The file is
created when the worker receives a BEGIN message. The worker writes the
received changes and flushes them at COMMIT. On every iteration of the main
loop, the commit time of each delayed transaction is checked, and the
transaction is applied from the file once the elapsed time exceeds
min_apply_delay. The commit time is cached in memory when the transaction is
committed, or when the worker restarts.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The combination of parallel streaming mode and min_apply_delay is not
allowed. This is because in parallel streaming mode, we start applying
the transaction stream as soon as the first change arrives without
knowing the transaction's prepare/commit time. This means we cannot
calculate the underlying network/decoding lag between publisher and
subscriber, and so always waiting for the full 'min_apply_delay' period
might include unnecessary delay.
The other possibility was to apply the delay at the end of the parallel
apply transaction but that would cause issues related to resource
bloat and locks being held for a long time.
Currently the combination of skip transaction feature and min_apply_delay
does not work well.
Earlier versions were written by Euler Taveira, Takamichi Osumi, and Kuroda Hayato
Author: Kuroda Hayato, Takamichi Osumi
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/glossary.sgml | 15 +
doc/src/sgml/logical-replication.sgml | 7 +
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 47 +-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 6 +-
src/backend/commands/subscriptioncmds.c | 123 +-
.../libpqwalreceiver/libpqwalreceiver.c | 4 +
src/backend/replication/logical/worker.c | 1058 +++++++++++++++--
src/backend/replication/pgoutput/pgoutput.c | 21 +-
src/bin/pg_dump/pg_dump.c | 13 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 8 +-
src/bin/psql/tab-complete.c | 13 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/replication/pgoutput.h | 1 +
src/include/replication/walreceiver.h | 1 +
src/test/regress/expected/subscription.out | 182 +--
src/test/regress/sql/subscription.sql | 27 +
src/test/subscription/t/001_rep_changes.pl | 31 +
21 files changed, 1385 insertions(+), 191 deletions(-)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 5240840552..35a7b6a9e8 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7891,6 +7891,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum delay, in milliseconds, for applying changes
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 29bf1873bd..204fe7f3ae 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1757,6 +1757,21 @@
</glossdef>
</glossentry>
+ <glossentry id="glossary-time-delayed-replication">
+ <glossterm>Time-delayed replication</glossterm>
+ <glossdef>
+ <para>
+ Replication setup that delays the application of changes by a specified
+ minimum time-delay period.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="guc-recovery-min-apply-delay"/> for physical replication
+ and <xref linkend="sql-createsubscription"/> for logical replication.
+ </para>
+ </glossdef>
+ </glossentry>
+
<glossentry id="glossary-toast">
<glossterm>TOAST</glossterm>
<glossdef>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index c65f4aabfd..0be4d652aa 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -257,6 +257,13 @@
option of <command>CREATE SUBSCRIPTION</command> for details.
</para>
+ <para>
+ A subscription can delay the application of changes by specifying the
+ <link linkend="sql-createsubscription-with-min-apply-delay"><literal>min_apply_delay</literal></link>
+ subscription parameter. See <xref linkend="sql-createsubscription"/> for
+ details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index a85e04e4d6..bf6c5fe7f0 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -225,8 +225,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<link linkend="sql-createsubscription-with-streaming"><literal>streaming</literal></link>,
<link linkend="sql-createsubscription-with-disable-on-error"><literal>disable_on_error</literal></link>,
<link linkend="sql-createsubscription-with-password-required"><literal>password_required</literal></link>,
- <link linkend="sql-createsubscription-with-run-as-owner"><literal>run_as_owner</literal></link>, and
- <link linkend="sql-createsubscription-with-origin"><literal>origin</literal></link>.
+ <link linkend="sql-createsubscription-with-run-as-owner"><literal>run_as_owner</literal></link>,
+ <link linkend="sql-createsubscription-with-origin"><literal>origin</literal></link>, and
+ <link linkend="sql-createsubscription-with-min-apply-delay"><literal>min_apply_delay</literal></link>.
Only a superuser can set <literal>password_required = false</literal>.
</para>
</listitem>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 71652fd918..a3ac91f27b 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -399,7 +399,47 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry id="sql-createsubscription-with-min-apply-delay">
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. This
+ parameter allows the user to delay the application of changes by a
+ given time period. This is done by first writing all the changes to a
+ file and then applying its contents once the delay has elapsed. If the value is
+ specified without units, it is taken as milliseconds. The default
+ is zero (no delay). See <xref linkend="config-setting-names-values"/>
+ for details on the available valid time units.
+ </para>
+ <para>
+ Any delay becomes effective only after all initial table
+ synchronization has finished and occurs before each transaction starts
+ to get applied on the subscriber. The delay is calculated as the
+ difference between the WAL timestamp as written on the publisher and
+ the current time on the subscriber. Any overhead of time spent in
+ logical decoding and in transferring the transaction may reduce the
+ actual wait time. If the overhead already exceeds the requested
+ <literal>min_apply_delay</literal> value, the changes are still written
+ to the file but are then applied immediately. If the system clocks on the
+ publisher and subscriber are not synchronized, changes may be applied
+ earlier than expected, but this is not a major issue because this
+ parameter is typically much larger than the time deviations between
+ servers.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication means there is a much longer time between
+ making a change on the publisher, and that change being committed
+ on the subscriber. This can impact the performance of synchronous
+ replication. See <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -472,6 +512,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_apply_delay</literal> parameter is not allowed when
+ streaming in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index d07f88ce28..56f8fdda10 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -66,6 +66,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->skiplsn = subform->subskiplsn;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
+ sub->minapplydelay = subform->subminapplydelay;
sub->enabled = subform->subenabled;
sub->binary = subform->subbinary;
sub->stream = subform->substream;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 2129c916aa..aeac1d8064 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1324,9 +1324,9 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subpasswordrequired, subrunasowner,
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subpasswordrequired, subrunasowner,
subslotname, subsynccommit, subpublications, suborigin)
ON pg_subscription TO public;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 3251d89ba8..aaa2065311 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -71,6 +71,7 @@
#define SUBOPT_RUN_AS_OWNER 0x00001000
#define SUBOPT_LSN 0x00002000
#define SUBOPT_ORIGIN 0x00004000
+#define SUBOPT_MIN_APPLY_DELAY 0x00008000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -97,6 +98,7 @@ typedef struct SubOpts
bool runasowner;
char *origin;
XLogRecPtr lsn;
+ int32 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -107,7 +109,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int32 defGetMinApplyDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -157,6 +159,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->runasowner = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -353,6 +357,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ opts->min_apply_delay = defGetMinApplyDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -433,6 +446,32 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed. This is because in parallel streaming mode, we start applying
+ * the transaction stream as soon as the first change arrives without
+ * knowing the transaction's prepare/commit time. This means we cannot
+ * calculate the underlying network/decoding lag between publisher and
+ * subscriber, and so always waiting for the full 'min_apply_delay' period
+ * might include unnecessary delay.
+ *
+ * The other possibility was to apply the delay at the end of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0 &&
+ opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+
+ /*
+ * translator: the first %s is a string of the form "parameter > 0"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
}
/*
@@ -591,7 +630,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
SUBOPT_DISABLE_ON_ERR | SUBOPT_PASSWORD_REQUIRED |
- SUBOPT_RUN_AS_OWNER | SUBOPT_ORIGIN);
+ SUBOPT_RUN_AS_OWNER | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -682,6 +722,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int32GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1130,7 +1171,8 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
SUBOPT_PASSWORD_REQUIRED |
- SUBOPT_RUN_AS_OWNER | SUBOPT_ORIGIN);
+ SUBOPT_RUN_AS_OWNER | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1174,6 +1216,19 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
+ !IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY)
+ && sub->minapplydelay > 0)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set parallel streaming mode for subscription with %s",
+ "min_apply_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1202,6 +1257,26 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.min_apply_delay > 0 &&
+ !IsSet(opts.specified_opts, SUBOPT_STREAMING)
+ && sub->stream == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set %s for subscription in parallel streaming mode",
+ "min_apply_delay"));
+
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int32GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2343,3 +2418,45 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+/*
+ * Extract the min_apply_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_apply_delay
+ * accepts the same parameter format as recovery_min_apply_delay.
+ */
+static int32
+defGetMinApplyDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse given string as parameter which has millisecond unit
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_apply_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check both the lower boundary for the valid min_apply_delay range and
+ * the upper boundary as the safeguard for some platforms where INT_MAX is
+ * wider than int32 respectively. Although parse_int() has confirmed that
+ * the result is less than or equal to INT_MAX, the value will be stored
+ * in a catalog column of int32.
+ */
+ if (result < 0 || result > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_apply_delay",
+ 0, PG_INT32_MAX)));
+
+ return result;
+}
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 052505e46f..0fb073c2c1 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -470,6 +470,10 @@ libpqrcv_startstreaming(WalReceiverConn *conn,
appendStringInfo(&cmd, ", origin '%s'",
options->proto.logical.origin);
+ if (options->proto.logical.require_schema &&
+ PQserverVersion(conn->streamConn) >= 160000)
+ appendStringInfo(&cmd, ", require_schema 'on'");
+
pubnames = options->proto.logical.publication_names;
pubnames_str = stringlist_to_identifierstr(conn->streamConn, pubnames);
if (!pubnames_str)
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 3d58910c14..f7ee87d188 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -153,6 +153,7 @@
#include "catalog/pg_subscription.h"
#include "catalog/pg_subscription_rel.h"
#include "catalog/pg_tablespace.h"
+#include "common/file_utils.h"
#include "commands/tablecmds.h"
#include "commands/tablespace.h"
#include "commands/trigger.h"
@@ -370,6 +371,59 @@ typedef struct ApplySubXactData
static ApplySubXactData subxact_data = {0, 0, InvalidTransactionId, NULL};
+/* XXX macros for time-delayed logical replication */
+
+/* DELAYED_DIR stores files that contain changes of delayed transactions. */
+#define DELAYED_DIR "pg_logical/delayed_txns"
+
+/*
+ * The filename consists of the following, dash separated, components:
+ * 1) subscription oid
+ * 2) xid of delayed transaction on publisher
+ * 3) status of the delaying transaction
+ * 4) upper 32bit of the commit_lsn
+ * 5) lower 32bit of the commit_lsn
+ * 6) upper 32bit of the end_lsn
+ * 7) lower 32bit of the end_lsn
+ * 8) committime
+ */
+#define DELAYED_FORMAT "delayed-%x-%x-%c-%X-%X-%X-%X-" INT64_FORMAT
+#define DELAYED_TXN_COMMITTED 'c'
+#define DELAYED_TXN_PREPARED 'p'
+#define DELAYED_TXN_UNKNOWN 'u'
+
+/* List entry to map xid and commit time */
+typedef struct DelayedTxnListEntry
+{
+ TransactionId xid;
+ LogicalRepCommitData commit_data;
+} DelayedTxnListEntry;
+
+/*
+ * An entry is appended when we receive a commit message and time-delayed
+ * logical replication is requested. The entry is deleted after its contents
+ * are applied.
+ */
+static List *DelayedTxnList = NIL;
+
+/* fields valid only when time-delayed logical replication is requested */
+static bool in_delayed_transaction = false;
+
+static TransactionId delayed_xid = InvalidTransactionId;
+
+/*
+ * Store flushed lsn for time-delayed logical replication. This is used when
+ * we send a feedback message to the publisher.
+ */
+static XLogRecPtr last_flushed = InvalidXLogRecPtr;
+
+/*
+ * FIXME: a single global file descriptor may not be sufficient. Non-streaming
+ * transactions may arrive concurrently, in which case create_delay_file()
+ * will fail for the second transaction...
+ */
+static int delayed_fd = -1;
+
static inline void subxact_filename(char *path, Oid subid, TransactionId xid);
static inline void changes_filename(char *path, Oid subid, TransactionId xid);
@@ -432,6 +486,534 @@ static inline void reset_apply_error_context_info(void);
static TransApplyAction get_transaction_apply_action(TransactionId xid,
ParallelApplyWorkerInfo **winfo);
+static void begin_replication_step(void);
+static void end_replication_step(void);
+
+/* Functions for time-delayed logical replication */
+static void cache_commit_data(LogicalRepCommitData *commit_data, TransactionId xid);
+static void flush_delayed_changes(LogicalRepCommitData *commit_data);
+static void delay_file_name(char *path, Oid subid, TransactionId xid,
+ char status, XLogRecPtr commit_lsn,
+ XLogRecPtr end_lsn, TimestampTz committime);
+static bool is_given_transaction_delayed(Oid subid, TransactionId xid);
+static void create_delay_file(TransactionId xid);
+static bool handle_delayed_transaction(char action, StringInfo s);
+
+/*
+ * Cache commit_data into the list
+ */
+static void
+cache_commit_data(LogicalRepCommitData *commit_data, TransactionId xid)
+{
+ MemoryContext old;
+ DelayedTxnListEntry *entry;
+
+ old = MemoryContextSwitchTo(ApplyContext);
+
+ entry = palloc0(sizeof(DelayedTxnListEntry));
+
+ /* Construct an entry and append it */
+ entry->xid = xid;
+ memcpy(&entry->commit_data, commit_data, sizeof(LogicalRepCommitData));
+ DelayedTxnList = lappend(DelayedTxnList, entry);
+
+ MemoryContextSwitchTo(old);
+
+ elog(DEBUG1, "transaction %u is cached", xid);
+
+}
+
+/*
+ * Flush given changes, rename and close the file. This will be called at the
+ * end of the transaction.
+ */
+static void
+flush_delayed_changes(LogicalRepCommitData *commit_data)
+{
+ char old_path[MAXPGPATH];
+ char new_path[MAXPGPATH];
+
+ Assert(delayed_fd > 0);
+ Assert(TransactionIdIsValid(delayed_xid));
+
+ /* Cache given commit_data into the list */
+ cache_commit_data(commit_data, delayed_xid);
+
+ /*
+ * Close file. No need to flush here because it will be done in
+ * durable_rename().
+ */
+ close(delayed_fd);
+
+ /* Construct old/new filename */
+ delay_file_name(old_path, MyLogicalRepWorker->subid, delayed_xid,
+ DELAYED_TXN_UNKNOWN, InvalidXLogRecPtr, InvalidXLogRecPtr,
+ 0);
+ delay_file_name(new_path, MyLogicalRepWorker->subid, delayed_xid,
+ DELAYED_TXN_COMMITTED, commit_data->commit_lsn,
+ commit_data->end_lsn, commit_data->committime);
+
+ /* And do actual rename */
+ if (durable_rename(old_path, new_path, PANIC))
+ abort();
+
+ /* Store flushed lsn */
+ last_flushed = commit_data->end_lsn;
+
+ /* Cleanup */
+ delayed_fd = -1;
+ delayed_xid = InvalidTransactionId;
+ in_delayed_transaction = false;
+}
+
+/*
+ * Build the delay file path from its components
+ */
+static void
+delay_file_name(char *path, Oid subid, TransactionId xid, char status,
+ XLogRecPtr commit_lsn, XLogRecPtr end_lsn,
+ TimestampTz committime)
+{
+ snprintf(path, MAXPGPATH, DELAYED_DIR "/" DELAYED_FORMAT, subid, xid,
+ status, LSN_FORMAT_ARGS(commit_lsn), LSN_FORMAT_ARGS(end_lsn),
+ committime);
+}
+
+/*
+ * Check whether the given transaction is delayed. This is done by checking the
+ * delay file.
+ */
+static bool
+is_given_transaction_delayed(Oid subid, TransactionId xid)
+{
+ struct stat st;
+ char path[MAXPGPATH];
+
+ delay_file_name(path, subid, xid, DELAYED_TXN_PREPARED, InvalidXLogRecPtr,
+ InvalidXLogRecPtr, 0);
+
+ return stat(path, &st) == 0;
+}
+
+/*
+ * Apply the delayed transaction. The delay file is opened and read, and the
+ * apply worker applies the changes written in it.
+ */
+static void
+apply_delayed_transaction(TransactionId xid, LogicalRepCommitData *commit_data)
+{
+ StringInfoData s2;
+ int nchanges;
+ char path[MAXPGPATH];
+ char *buffer = NULL;
+ MemoryContext oldcxt;
+ ResourceOwner oldowner;
+
+ /* Make sure we have an open transaction */
+ begin_replication_step();
+
+ /*
+ * Allocate file handle and memory required to process all the messages in
+ * TopTransactionContext to avoid them getting reset after each message is
+ * processed.
+ */
+ oldcxt = MemoryContextSwitchTo(TopTransactionContext);
+
+ /* Open the spool file for the committed transaction */
+ delay_file_name(path, MyLogicalRepWorker->subid, xid,
+ DELAYED_TXN_COMMITTED, commit_data->commit_lsn,
+ commit_data->end_lsn, commit_data->committime);
+ elog(DEBUG1, "replaying changes from file \"%s\"", path);
+
+ /*
+ * Make sure the file is owned by the toplevel transaction so that the
+ * file will not be accidentally closed when aborting a subtransaction.
+ */
+ oldowner = CurrentResourceOwner;
+ CurrentResourceOwner = TopTransactionResourceOwner;
+
+ /* Open the specified file */
+ delayed_fd = BasicOpenFile(path, O_RDONLY | PG_BINARY);
+
+ Assert(delayed_fd > 0);
+
+ CurrentResourceOwner = oldowner;
+
+ buffer = palloc(BLCKSZ);
+ initStringInfo(&s2);
+
+ MemoryContextSwitchTo(oldcxt);
+
+ remote_final_lsn = commit_data->end_lsn;
+
+ /*
+ * Make sure the handle apply_dispatch methods are aware we're in a remote
+ * transaction.
+ */
+ in_remote_transaction = true;
+ pgstat_report_activity(STATE_RUNNING, NULL);
+
+ end_replication_step();
+
+ /*
+ * Read the entries one by one and pass them through the same logic as in
+ * apply_dispatch.
+ */
+ nchanges = 0;
+ while (true)
+ {
+ size_t nbytes;
+ int len;
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* read length of the on-disk record */
+ nbytes = read(delayed_fd, &len, sizeof(len));
+
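+ /* have we reached end of the file? */
+ if (nbytes == 0)
+ break;
+
+ /* a failed or short read leaves len unusable */
+ if (nbytes != sizeof(len))
+ elog(ERROR, "could not read from delayed transaction's changes file \"%s\"",
+ path);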
+
+ /* do we have a correct length? */
+ if (len <= 0)
+ elog(ERROR, "incorrect length %d in delayed transaction's changes file \"%s\"",
+ len, path);
+
+ /* make sure we have sufficiently large buffer */
+ buffer = repalloc(buffer, len);
+
+ /* and finally read the data into the buffer */
+ if (read(delayed_fd, buffer, len) != len)
+ elog(ERROR, "could not read from delayed transaction's changes file \"%s\"",
+ path);
+
+ /* copy the buffer to the stringinfo and call apply_dispatch */
+ resetStringInfo(&s2);
+ appendBinaryStringInfo(&s2, buffer, len);
+
+ /* Ensure we are reading the data into our memory context. */
+ oldcxt = MemoryContextSwitchTo(ApplyMessageContext);
+
+ apply_dispatch(&s2);
+
+ MemoryContextReset(ApplyMessageContext);
+
+ MemoryContextSwitchTo(oldcxt);
+
+ nchanges++;
+
+ if (nchanges % 1000 == 0)
+ elog(DEBUG1, "replayed %d changes from file \"%s\"",
+ nchanges, path);
+ }
+
+ close(delayed_fd);
+ delayed_fd = -1;
+ durable_unlink(path, LOG);
+
+ elog(DEBUG1, "replayed %d (all) changes from file \"%s\"",
+ nchanges, path);
+
+ return;
+}
+
+/*
+ * Create a file to which changes will be written.
+ */
+static void
+create_delay_file(TransactionId xid)
+{
+ char path[MAXPGPATH];
+ int fd;
+
+ Assert(TransactionIdIsValid(xid));
+ Assert(delayed_fd < 0);
+
+ /*
+ * Construct filename. Other information like commit_lsn will be filled
+ * in when the transaction is committed.
+ */
+ delay_file_name(path, MyLogicalRepWorker->subid, xid, DELAYED_TXN_UNKNOWN,
+ InvalidXLogRecPtr, InvalidXLogRecPtr, 0);
+
+ elog(DEBUG1, "creating a file \"%s\" for time-delayed logical replication",
+ path);
+
+ fd = BasicOpenFile(path, O_WRONLY | O_CREAT | O_EXCL | O_APPEND | PG_BINARY);
+
+ if (fd < 0)
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not create file \"%s\": %m",
+ path));
+
+ delayed_fd = fd;
+}
+
+/*
+ * Create a directory that holds delayed files
+ */
+static void
+initialize_delay_directory(void)
+{
+ char path[MAXPGPATH];
+
+ snprintf(path, MAXPGPATH, DELAYED_DIR);
+ if (MakePGDirectory(path) < 0 && errno != EEXIST)
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not create directory \"%s\": %m",
+ path));
+
+ START_CRIT_SECTION();
+ fsync_fname(path, true);
+ END_CRIT_SECTION();
+}
+
+/*
+ * Transform information from commit_prepared style to commit style.
+ */
+static void
+ConstructCommitFromCommitPrepared(LogicalRepCommitData *commit,
+ LogicalRepCommitPreparedTxnData *prepare_data)
+{
+ commit->commit_lsn = prepare_data->commit_lsn;
+ commit->committime = prepare_data->commit_time;
+ commit->end_lsn = prepare_data->end_lsn;
+}
+
+/*
+ * Restore the delayed transaction from given information.
+ *
+ * This returns false only when the status is unknown, which means that the
+ * worker was shut down before receiving the COMMIT/PREPARE/COMMIT PREPARED
+ * message. In this case we must receive all the messages again and write
+ * them into a file again.
+ */
+static bool
+RestoreDelayedTxn(char status, XLogRecPtr commit_lsn, XLogRecPtr end_lsn,
+ TimestampTz committime, TransactionId xid)
+
+{
+ switch (status)
+ {
+ case DELAYED_TXN_UNKNOWN:
+ return false;
+
+ case DELAYED_TXN_COMMITTED:
+ {
+ LogicalRepCommitData commit_data = {
+ .commit_lsn = commit_lsn,
+ .committime = committime,
+ .end_lsn = end_lsn,
+ };
+ cache_commit_data(&commit_data, xid);
+ break;
+ }
+
+ case DELAYED_TXN_PREPARED:
+ /* Do nothing */
+ break;
+
+ default:
+ Assert(false);
+ return false; /* Keep compiler quiet */
+ }
+
+ /* Update last_flushed to avoid receiving the same transaction again */
+ last_flushed = end_lsn;
+
+ return true;
+}
+
+/*
+ * list_sort() comparator for sorting DelayedTxnList in committime order.
+ */
+static int
+file_sort_by_committime(const ListCell *a_p, const ListCell *b_p)
+{
+ DelayedTxnListEntry *a = (DelayedTxnListEntry *) lfirst(a_p);
+ DelayedTxnListEntry *b = (DelayedTxnListEntry *) lfirst(b_p);
+
+ if (a->commit_data.committime < b->commit_data.committime)
+ return -1;
+ else if (a->commit_data.committime > b->commit_data.committime)
+ return 1;
+ return 0;
+}
+
+
+/*
+ * Restore all the delayed transactions to memory.
+ */
+static void
+RestoreDelayedTxns(void)
+{
+ DIR *delayed_dir;
+ struct dirent *delayed_de;
+
+ /* Read all the files one by one */
+ delayed_dir = AllocateDir(DELAYED_DIR);
+ while ((delayed_de = ReadDir(delayed_dir, DELAYED_DIR)) != NULL)
+ {
+ Oid subid = InvalidOid;
+ TransactionId xid = InvalidTransactionId;
+ char status = 0;
+ XLogRecPtr commit_lsn = InvalidXLogRecPtr,
+ end_lsn = InvalidXLogRecPtr;
+ TimestampTz committime = 0;
+ uint32 commit_hi = 0,
+ commit_lo = 0,
+ end_hi = 0,
+ end_lo = 0;
+
+ if (strcmp(delayed_de->d_name, ".") == 0 ||
+ strcmp(delayed_de->d_name, "..") == 0)
+ continue;
+
+ /* Ignore files that aren't ours */
+ if (strncmp(delayed_de->d_name, "delayed-", 8) != 0)
+ continue;
+
+ /* Parse filename */
+ if (sscanf(delayed_de->d_name, DELAYED_FORMAT, &subid, &xid, &status, &commit_hi,
+ &commit_lo, &end_hi, &end_lo, &committime) != 8)
+ elog(ERROR, "could not parse filename \"%s\"", delayed_de->d_name);
+
+ /* Skip if the file has been generated by other subscriptions */
+ if (MyLogicalRepWorker->subid != subid)
+ continue;
+
+ elog(DEBUG1, "start to restore from %s", delayed_de->d_name);
+
+ commit_lsn = ((uint64) commit_hi) << 32 | commit_lo;
+ end_lsn = ((uint64) end_hi) << 32 | end_lo;
+
+ /*
+ * Do actual restore here. If the server was shut down while
+ * receiving transactions, the status is UNKNOWN and
+ * RestoreDelayedTxn() returns false. In that case we must remove the
+ * file and receive the changes again.
+ */
+ if (!RestoreDelayedTxn(status, commit_lsn, end_lsn, committime, xid))
+ {
+ char path[MAXPGPATH];
+
+ snprintf(path, MAXPGPATH, DELAYED_DIR "/%s", delayed_de->d_name);
+ durable_unlink(path, LOG);
+ }
+ }
+ FreeDir(delayed_dir);
+
+ list_sort(DelayedTxnList, file_sort_by_committime);
+}
+
+/*
+ * Restore delayed transactions, or initialize the directory
+ */
+static void
+InitializeDelayedTxn(void)
+{
+ struct stat st;
+ char path[MAXPGPATH];
+
+ snprintf(path, MAXPGPATH, DELAYED_DIR);
+
+ /*
+ * If the given directory does not exist, create one. Otherwise start to
+ * restore.
+ */
+ if (stat(path, &st) != 0)
+ {
+ initialize_delay_directory();
+ return;
+ }
+
+ RestoreDelayedTxns();
+}
+
+/*
+ * Write a given message to a file. This is called for every message.
+ * This returns true only when changes are written into file.
+ *
+ * The format of the serialized changes is the same as the streamed one. This
+ * has a length (not including the length), action code (identifying the
+ * message type) and message contents (without the subxact TransactionId
+ * value).
+ */
+static bool
+handle_delayed_transaction(char action, StringInfo s)
+{
+ int len;
+
+ /* Return if we are not in delay */
+ if (!in_delayed_transaction)
+ return false;
+
+ Assert(delayed_fd > 0);
+ Assert(TransactionIdIsValid(delayed_xid));
+
+ len = (s->len - s->cursor) + sizeof(char);
+
+ if (write(delayed_fd, &len, sizeof(len)) != sizeof(len))
+ abort();
+ if (write(delayed_fd, &action, sizeof(action)) != sizeof(action))
+ abort();
+
+ len = (s->len - s->cursor);
+
+ if (write(delayed_fd, &s->data[s->cursor], len) != len)
+ abort();
+
+ return true;
+}
+
+/*
+ * Check the delayed transactions and apply those whose delay has elapsed
+ */
+static void
+check_delayed_transaction(void)
+{
+ TimestampTz now;
+ ListCell *lc;
+ int n = 0;
+
+ if (in_streamed_transaction)
+ return;
+
+ now = GetCurrentTimestamp();
+
+ /* Read the cache one-by-one */
+ foreach(lc, DelayedTxnList)
+ {
+ DelayedTxnListEntry *entry = (DelayedTxnListEntry *) lfirst(lc);
+ LogicalRepCommitData *commit_data = &entry->commit_data;
+ TimestampTz delayUntil;
+ long diffms;
+
+ delayUntil = TimestampTzPlusMilliseconds(commit_data->committime,
+ MySubscription->minapplydelay);
+
+ diffms = TimestampDifferenceMilliseconds(now, delayUntil);
+
+ /*
+ * The cache is sorted in commit order, so we do not have to check
+ * later entries once we find a transaction that must not be applied yet.
+ */
+ if (diffms > 0)
+ break;
+
+ elog(DEBUG1, "started to apply transaction %u", entry->xid);
+
+ apply_delayed_transaction(entry->xid, commit_data);
+ apply_handle_commit_internal(commit_data);
+
+ delayed_xid = InvalidTransactionId;
+ in_delayed_transaction = false;
+ n++;
+ }
+ /* Discard applied entries */
+ DelayedTxnList = list_delete_first_n(DelayedTxnList, n);
+}
+
/*
* Return the name of the logical replication worker.
*/
@@ -1019,13 +1601,28 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
- remote_final_lsn = begin_data.final_lsn;
+ /*
+ * Prepare to write changes into file if time-delayed replication is
+ * requested.
+ */
+ if (MySubscription->minapplydelay && AllTablesyncsReady())
+ {
+ in_delayed_transaction = true;
- maybe_start_skipping_changes(begin_data.final_lsn);
+ create_delay_file(begin_data.xid);
- in_remote_transaction = true;
+ delayed_xid = begin_data.xid;
+ }
+ else
+ {
+ remote_final_lsn = begin_data.final_lsn;
- pgstat_report_activity(STATE_RUNNING, NULL);
+ maybe_start_skipping_changes(begin_data.final_lsn);
+
+ in_remote_transaction = true;
+
+ pgstat_report_activity(STATE_RUNNING, NULL);
+ }
}
/*
@@ -1037,20 +1634,40 @@ static void
apply_handle_commit(StringInfo s)
{
LogicalRepCommitData commit_data;
+ /* Save the message before it is consumed. */
+ StringInfoData original_msg = *s;
+
+ /*
+ * If we are applying the delayed transaction, skip here.
+ * The actual COMMIT will be done outside apply_delayed_transaction().
+ */
+ if (delayed_fd > 0 && !in_delayed_transaction)
+ return;
logicalrep_read_commit(s, &commit_data);
- if (commit_data.commit_lsn != remote_final_lsn)
- ereport(ERROR,
- (errcode(ERRCODE_PROTOCOL_VIOLATION),
- errmsg_internal("incorrect commit LSN %X/%X in commit message (expected %X/%X)",
- LSN_FORMAT_ARGS(commit_data.commit_lsn),
- LSN_FORMAT_ARGS(remote_final_lsn))));
+ /* Spool the commit if this transaction is being delayed. */
- apply_handle_commit_internal(&commit_data);
+ if (in_delayed_transaction)
+ {
+ /* Write the commit message into the file and flush all the messages */
+ handle_delayed_transaction(LOGICAL_REP_MSG_COMMIT, &original_msg);
+ flush_delayed_changes(&commit_data);
+ }
+ else
+ {
+ if (commit_data.commit_lsn != remote_final_lsn)
+ ereport(ERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg_internal("incorrect commit LSN %X/%X in commit message (expected %X/%X)",
+ LSN_FORMAT_ARGS(commit_data.commit_lsn),
+ LSN_FORMAT_ARGS(remote_final_lsn))));
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(commit_data.end_lsn);
+ apply_handle_commit_internal(&commit_data);
+
+ /* Process any tables that are being synchronized in parallel. */
+ process_syncing_tables(commit_data.end_lsn);
+ }
pgstat_report_activity(STATE_IDLE, NULL);
reset_apply_error_context_info();
@@ -1076,13 +1693,28 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
- remote_final_lsn = begin_data.prepare_lsn;
+ /*
+ * Prepare to write changes into file if time-delayed replication is
+ * requested.
+ */
+ if (MySubscription->minapplydelay && AllTablesyncsReady())
+ {
+ in_delayed_transaction = true;
- maybe_start_skipping_changes(begin_data.prepare_lsn);
+ create_delay_file(begin_data.xid);
- in_remote_transaction = true;
+ delayed_xid = begin_data.xid;
+ }
+ else
+ {
+ remote_final_lsn = begin_data.prepare_lsn;
- pgstat_report_activity(STATE_RUNNING, NULL);
+ maybe_start_skipping_changes(begin_data.prepare_lsn);
+
+ in_remote_transaction = true;
+
+ pgstat_report_activity(STATE_RUNNING, NULL);
+ }
}
/*
@@ -1124,57 +1756,102 @@ apply_handle_prepare_internal(LogicalRepPreparedTxnData *prepare_data)
/*
* Handle PREPARE message.
+ *
+ * When time-delayed logical replication is requested, we just write a message
+ * into the file and return. This means that no transaction is prepared on the
+ * subscriber, which keeps the apply worker from holding locks for a long
+ * time because of a long min_apply_delay.
+ *
+ * Even when the transaction is applied from the delay file, the transaction is
+ * not prepared. We just skip the PREPARE message.
*/
static void
apply_handle_prepare(StringInfo s)
{
LogicalRepPreparedTxnData prepare_data;
- logicalrep_read_prepare(s, &prepare_data);
-
- if (prepare_data.prepare_lsn != remote_final_lsn)
- ereport(ERROR,
- (errcode(ERRCODE_PROTOCOL_VIOLATION),
- errmsg_internal("incorrect prepare LSN %X/%X in prepare message (expected %X/%X)",
- LSN_FORMAT_ARGS(prepare_data.prepare_lsn),
- LSN_FORMAT_ARGS(remote_final_lsn))));
-
/*
- * Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction or all changes are skipped. It
- * is done this way because at commit prepared time, we won't know whether
- * we have skipped preparing a transaction because of those reasons.
- *
- * XXX, We can optimize such that at commit prepared time, we first check
- * whether we have prepared the transaction or not but that doesn't seem
- * worthwhile because such cases shouldn't be common.
+ * If we are writing changes into the delay file, construct a modified
+ * message and write it. This is needed to avoid writing the gid into the
+ * file. For more detail, see the comment atop ReadPreparedCommonRecord().
*/
- begin_replication_step();
+ if (in_delayed_transaction)
+ {
+ char old_path[MAXPGPATH];
+ char new_path[MAXPGPATH];
- apply_handle_prepare_internal(&prepare_data);
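+ /* Read the message; prepare_data.end_lsn is used below */
+ logicalrep_read_prepare(s, &prepare_data);
+
+ /* Cleanup */
+ close(delayed_fd);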
- end_replication_step();
- CommitTransactionCommand();
- pgstat_report_stat(false);
+ /*
+ * Construct old/new filename.
+ *
+ * Note that commit_lsn, end_lsn, and committime are not filled here.
+ * This is because when COMMIT PREPARED arrives, we would have no good
+ * way to locate the related transaction file if they were filled.
+ */
+ delay_file_name(old_path, MyLogicalRepWorker->subid, delayed_xid,
+ DELAYED_TXN_UNKNOWN, InvalidXLogRecPtr, InvalidXLogRecPtr, 0);
- store_flush_position(prepare_data.end_lsn, XactLastCommitEnd);
+ delay_file_name(new_path, MyLogicalRepWorker->subid, delayed_xid,
+ DELAYED_TXN_PREPARED, InvalidXLogRecPtr, InvalidXLogRecPtr, 0);
- in_remote_transaction = false;
+ /* And do actual rename */
+ if (durable_rename(old_path, new_path, PANIC))
+ abort();
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(prepare_data.end_lsn);
+ /* Store flushed lsn */
+ last_flushed = prepare_data.end_lsn;
- /*
- * Since we have already prepared the transaction, in a case where the
- * server crashes before clearing the subskiplsn, it will be left but the
- * transaction won't be resent. But that's okay because it's a rare case
- * and the subskiplsn will be cleared when finishing the next transaction.
- */
- stop_skipping_changes();
- clear_subscription_skip_lsn(prepare_data.prepare_lsn);
+ delayed_fd = -1;
+ delayed_xid = InvalidTransactionId;
+ in_delayed_transaction = false;
+ }
+ else
+ {
+ logicalrep_read_prepare(s, &prepare_data);
- pgstat_report_activity(STATE_IDLE, NULL);
- reset_apply_error_context_info();
+ if (prepare_data.prepare_lsn != remote_final_lsn)
+ ereport(ERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg_internal("incorrect prepare LSN %X/%X in prepare message (expected %X/%X)",
+ LSN_FORMAT_ARGS(prepare_data.prepare_lsn),
+ LSN_FORMAT_ARGS(remote_final_lsn))));
+
+ /*
+ * Unlike commit, here, we always prepare the transaction even though no
+ * change has happened in this transaction or all changes are skipped. It
+ * is done this way because at commit prepared time, we won't know whether
+ * we have skipped preparing a transaction because of those reasons.
+ *
+ * XXX, We can optimize such that at commit prepared time, we first check
+ * whether we have prepared the transaction or not but that doesn't seem
+ * worthwhile because such cases shouldn't be common.
+ */
+ begin_replication_step();
+
+ apply_handle_prepare_internal(&prepare_data);
+
+ end_replication_step();
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
+
+ store_flush_position(prepare_data.end_lsn, XactLastCommitEnd);
+
+ in_remote_transaction = false;
+
+ /* Process any tables that are being synchronized in parallel. */
+ process_syncing_tables(prepare_data.end_lsn);
+
+ /*
+ * Since we have already prepared the transaction, in a case where the
+ * server crashes before clearing the subskiplsn, it will be left but the
+ * transaction won't be resent. But that's okay because it's a rare case
+ * and the subskiplsn will be cleared when finishing the next transaction.
+ */
+ stop_skipping_changes();
+ clear_subscription_skip_lsn(prepare_data.prepare_lsn);
+ }
}
/*
@@ -1192,38 +1869,95 @@ apply_handle_commit_prepared(StringInfo s)
LogicalRepCommitPreparedTxnData prepare_data;
char gid[GIDSIZE];
- logicalrep_read_commit_prepared(s, &prepare_data);
- set_apply_error_context_xact(prepare_data.xid, prepare_data.commit_lsn);
-
- /* Compute GID for two_phase transactions. */
- TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
- gid, sizeof(gid));
+ if (delayed_fd > 0 && !in_delayed_transaction)
+ return;
- /* There is no transaction when COMMIT PREPARED is called */
- begin_replication_step();
+ logicalrep_read_commit_prepared(s, &prepare_data);
/*
- * Update origin state so we can restart streaming from correct position
- * in case of crash.
+ * Check whether a delay file exists. If we have a file and we have not
+ * opened it yet, it means that time-delayed logical replication has been
+ * requested. In that case we write the modified message.
+ * Otherwise, the transaction will be committed normally.
*/
- replorigin_session_origin_lsn = prepare_data.end_lsn;
- replorigin_session_origin_timestamp = prepare_data.commit_time;
+ if (delayed_fd < 0 &&
+ is_given_transaction_delayed(MyLogicalRepWorker->subid, prepare_data.xid))
+ {
+ char old_path[MAXPGPATH];
+ char new_path[MAXPGPATH];
+ LogicalRepCommitData commit_data = {0};
- FinishPreparedTransaction(gid, true);
- end_replication_step();
- CommitTransactionCommand();
- pgstat_report_stat(false);
+ /*
+ * Open the delayed transaction file.
+ *
+ * Unlike RestoreDelayedTxns(), we don't want to read the whole
+ * directory to find the related file. That's why the file name uses
+ * an invalid LSN and a zero committime to identify it.
+ */
+ delay_file_name(old_path, MyLogicalRepWorker->subid, prepare_data.xid,
+ DELAYED_TXN_PREPARED, InvalidXLogRecPtr,
+ InvalidXLogRecPtr, 0);
- store_flush_position(prepare_data.end_lsn, XactLastCommitEnd);
- in_remote_transaction = false;
+ delayed_fd = BasicOpenFile(old_path, O_WRONLY | O_APPEND | PG_BINARY);
+ if (delayed_fd < 0)
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not open file \"%s\": %m",
+ old_path));
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(prepare_data.end_lsn);
+ delay_file_name(new_path, MyLogicalRepWorker->subid,
+ prepare_data.xid, DELAYED_TXN_COMMITTED,
+ prepare_data.commit_lsn, prepare_data.end_lsn,
+ prepare_data.commit_time);
- clear_subscription_skip_lsn(prepare_data.end_lsn);
+ close(delayed_fd);
- pgstat_report_activity(STATE_IDLE, NULL);
- reset_apply_error_context_info();
+ if (durable_rename(old_path, new_path, PANIC))
+ abort();
+
+ /* Store flushed lsn */
+ last_flushed = prepare_data.end_lsn;
+
+ delayed_fd = -1;
+ delayed_xid = InvalidTransactionId;
+ in_delayed_transaction = false;
+
+ ConstructCommitFromCommitPrepared(&commit_data, &prepare_data);
+
+ /* Cache the committed transaction */
+ cache_commit_data(&commit_data, prepare_data.xid);
+ }
+ else
+ {
+ set_apply_error_context_xact(prepare_data.xid, prepare_data.commit_lsn);
+
+ /* Compute GID for two_phase transactions. */
+ TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
+ gid, sizeof(gid));
+
+ /* There is no transaction when COMMIT PREPARED is called */
+ begin_replication_step();
+
+ /*
+ * Update origin state so we can restart streaming from correct position
+ * in case of crash.
+ */
+ replorigin_session_origin_lsn = prepare_data.end_lsn;
+ replorigin_session_origin_timestamp = prepare_data.commit_time;
+
+ FinishPreparedTransaction(gid, true);
+ end_replication_step();
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
+
+ store_flush_position(prepare_data.end_lsn, XactLastCommitEnd);
+ in_remote_transaction = false;
+
+ /* Process any tables that are being synchronized in parallel. */
+ process_syncing_tables(prepare_data.end_lsn);
+
+ clear_subscription_skip_lsn(prepare_data.end_lsn);
+ }
}
/*
@@ -1242,6 +1976,23 @@ apply_handle_rollback_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_rollback_prepared(s, &rollback_data);
+
+ /*
+ * If the delayed file exists, just remove it. The delayed transaction was
+ * never prepared, so it's OK not to call FinishPreparedTransaction().
+ */
+ if (is_given_transaction_delayed(MyLogicalRepWorker->subid, rollback_data.xid))
+ {
+ char path[MAXPGPATH];
+ delay_file_name(path, MyLogicalRepWorker->subid, rollback_data.xid,
+ DELAYED_TXN_PREPARED, InvalidXLogRecPtr,
+ InvalidXLogRecPtr, 0);
+
+ durable_unlink(path, LOG);
+
+ return;
+ }
+
set_apply_error_context_xact(rollback_data.xid, rollback_data.rollback_end_lsn);
/* Compute GID for two_phase transactions. */
@@ -1317,16 +2068,66 @@ apply_handle_stream_prepare(StringInfo s)
switch (apply_action)
{
case TRANS_LEADER_APPLY:
+ /*
+ * If time-delayed replication is requested, start to write changes
+ * to a permanent file instead of a temporary one.
+ */
+ if (MySubscription->minapplydelay)
+ {
+ in_delayed_transaction = true;
+
+ create_delay_file(prepare_data.xid);
+
+ delayed_xid = prepare_data.xid;
+ }
/*
* The transaction has been serialized to file, so replay all the
* spooled operations.
+ * Note that if time-delayed replication is requested, changes are
+ * written into permanent file here.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset,
prepare_data.xid, prepare_data.prepare_lsn);
- /* Mark the transaction as prepared. */
- apply_handle_prepare_internal(&prepare_data);
+
+ /*
+ * If time-delayed replication is requested, construct a modified
+ * message and write it. This is needed to avoid writing the gid into the
+ * file. For more detail, see the comment atop ReadPreparedCommonRecord().
+ */
+ if (MySubscription->minapplydelay)
+ {
+ char old_path[MAXPGPATH];
+ char new_path[MAXPGPATH];
+
+ close(delayed_fd);
+
+ delay_file_name(old_path, MyLogicalRepWorker->subid,
+ prepare_data.xid, DELAYED_TXN_UNKNOWN,
+ InvalidXLogRecPtr, InvalidXLogRecPtr, 0);
+
+ delay_file_name(new_path, MyLogicalRepWorker->subid,
+ prepare_data.xid, DELAYED_TXN_PREPARED,
+ InvalidXLogRecPtr, InvalidXLogRecPtr, 0);
+
+ if (durable_rename(old_path, new_path, PANIC))
+ abort();
+
+ /* Store flushed lsn */
+ last_flushed = prepare_data.end_lsn;
+
+ /* delayed_fd was already closed before the rename */
+
+ delayed_fd = -1;
+ delayed_xid = InvalidTransactionId;
+ in_delayed_transaction = false;
+ }
+ else
+ {
+ /* Mark the transaction as prepared. */
+ apply_handle_prepare_internal(&prepare_data);
+ }
CommitTransactionCommand();
@@ -1405,8 +2206,11 @@ apply_handle_stream_prepare(StringInfo s)
pgstat_report_stat(false);
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(prepare_data.end_lsn);
+ if (list_length(DelayedTxnList) == 0)
+ {
+ /* Process any tables that are being synchronized in parallel. */
+ process_syncing_tables(prepare_data.end_lsn);
+ }
/*
* Similar to prepare case, the subskiplsn could be left in a case of
@@ -2175,19 +2979,43 @@ apply_handle_stream_commit(StringInfo s)
{
case TRANS_LEADER_APPLY:
+ /*
+ * If time-delayed replication is requested, start writing changes
+ * to a permanent file instead of a temporary one.
+ */
+ if (MySubscription->minapplydelay)
+ {
+ in_delayed_transaction = true;
+
+ create_delay_file(xid);
+
+ delayed_xid = xid;
+ }
+
/*
* The transaction has been serialized to file, so replay all the
* spooled operations.
+ * Note that if time-delayed replication is requested, changes are
+ * written into a permanent file here.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset, xid,
- commit_data.commit_lsn);
+ commit_data.commit_lsn);
- apply_handle_commit_internal(&commit_data);
+
+ /* Flush changes if time-delayed replication is requested */
+ if (MySubscription->minapplydelay)
+ {
+ handle_delayed_transaction(LOGICAL_REP_MSG_COMMIT, &original_msg);
+ flush_delayed_changes(&commit_data);
+ }
+ else
+ apply_handle_commit_internal(&commit_data);
/* Unlink the files with serialized changes and subxact info. */
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
elog(DEBUG1, "finished processing the STREAM COMMIT command");
+
break;
case TRANS_LEADER_SEND_TO_PARALLEL:
@@ -2249,8 +3077,11 @@ apply_handle_stream_commit(StringInfo s)
break;
}
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(commit_data.end_lsn);
+ if (list_length(DelayedTxnList) == 0)
+ {
+ /* Process any tables that are being synchronized in parallel. */
+ process_syncing_tables(commit_data.end_lsn);
+ }
pgstat_report_activity(STATE_IDLE, NULL);
@@ -2325,7 +3156,8 @@ apply_handle_relation(StringInfo s)
{
LogicalRepRelation *rel;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_RELATION, s))
+ if (handle_streamed_transaction(LOGICAL_REP_MSG_RELATION, s) ||
+ handle_delayed_transaction(LOGICAL_REP_MSG_RELATION, s))
return;
rel = logicalrep_read_rel(s);
@@ -2348,7 +3180,8 @@ apply_handle_type(StringInfo s)
{
LogicalRepTyp typ;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_TYPE, s))
+ if (handle_streamed_transaction(LOGICAL_REP_MSG_TYPE, s) ||
+ handle_delayed_transaction(LOGICAL_REP_MSG_TYPE, s))
return;
logicalrep_read_typ(s, &typ);
@@ -2408,7 +3241,8 @@ apply_handle_insert(StringInfo s)
* streamed transactions.
*/
if (is_skipping_changes() ||
- handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
+ handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s) ||
+ handle_delayed_transaction(LOGICAL_REP_MSG_INSERT, s))
return;
begin_replication_step();
@@ -2560,7 +3394,8 @@ apply_handle_update(StringInfo s)
* streamed transactions.
*/
if (is_skipping_changes() ||
- handle_streamed_transaction(LOGICAL_REP_MSG_UPDATE, s))
+ handle_streamed_transaction(LOGICAL_REP_MSG_UPDATE, s) ||
+ handle_delayed_transaction(LOGICAL_REP_MSG_UPDATE, s))
return;
begin_replication_step();
@@ -2741,7 +3576,8 @@ apply_handle_delete(StringInfo s)
* streamed transactions.
*/
if (is_skipping_changes() ||
- handle_streamed_transaction(LOGICAL_REP_MSG_DELETE, s))
+ handle_streamed_transaction(LOGICAL_REP_MSG_DELETE, s) ||
+ handle_delayed_transaction(LOGICAL_REP_MSG_DELETE, s))
return;
begin_replication_step();
@@ -3169,7 +4005,8 @@ apply_handle_truncate(StringInfo s)
* streamed transactions.
*/
if (is_skipping_changes() ||
- handle_streamed_transaction(LOGICAL_REP_MSG_TRUNCATE, s))
+ handle_streamed_transaction(LOGICAL_REP_MSG_TRUNCATE, s) ||
+ handle_delayed_transaction(LOGICAL_REP_MSG_TRUNCATE, s))
return;
begin_replication_step();
@@ -3431,11 +4268,14 @@ get_flush_position(XLogRecPtr *write, XLogRecPtr *flush,
pos = dlist_tail_element(FlushPosition, node,
&lsn_mapping);
*write = pos->remote_end;
- *have_pending_txes = true;
- return;
+ break;
}
}
+ /* If changes have been written to a file, report that LSN instead */
+ if (last_flushed > *flush)
+ *flush = last_flushed;
+
*have_pending_txes = !dlist_is_empty(&lsn_mapping);
}
@@ -3632,9 +4472,13 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
maybe_reread_subscription();
/* Process any table synchronization changes. */
- process_syncing_tables(last_received);
+ if (list_length(DelayedTxnList) == 0)
+ process_syncing_tables(last_received);
}
+ /* Check delayed transactions and apply them */
+ check_delayed_transaction();
+
/* Cleanup the memory. */
MemoryContextResetAndDeleteChildren(ApplyMessageContext);
MemoryContextSwitchTo(TopMemoryContext);
@@ -3776,8 +4620,14 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * If the logical replication subscription has unprocessed changes, do
+ * not inform the publisher that the latest received LSN is already
+ * applied and flushed; otherwise, the publisher will make a wrong
+ * assumption about the logical replication progress. Instead, just send a
+ * feedback message to avoid a replication timeout during the delay.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && (list_length(DelayedTxnList) == 0))
flushpos = writepos = recvpos;
if (writepos < last_writepos)
@@ -3936,7 +4786,8 @@ maybe_reread_subscription(void)
newsub->stream != MySubscription->stream ||
strcmp(newsub->origin, MySubscription->origin) != 0 ||
newsub->owner != MySubscription->owner ||
- !equal(newsub->publications, MySubscription->publications))
+ !equal(newsub->publications, MySubscription->publications) ||
+ (newsub->minapplydelay == 0) != (MySubscription->minapplydelay == 0))
{
if (am_parallel_apply_worker())
ereport(LOG,
@@ -4581,6 +5432,9 @@ ApplyWorkerMain(Datum main_arg)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("subscription has no replication slot set")));
+ /* Check delayed files or initialize directory */
+ InitializeDelayedTxn();
+
/* Setup replication origin tracking. */
StartTransactionCommand();
ReplicationOriginNameForLogicalRep(MySubscription->oid, InvalidOid,
@@ -4592,6 +5446,14 @@ ApplyWorkerMain(Datum main_arg)
replorigin_session_origin = originid;
origin_startpos = replorigin_session_get_progress(false);
+ /*
+ * If last_flushed exceeds origin_startpos, it means that some
+ * transactions are being delayed. They have already been written to a
+ * permanent file, so there is no need to receive them again.
+ */
+ if (origin_startpos < last_flushed)
+ origin_startpos = last_flushed;
+
/* Is the use of a password mandatory? */
must_use_password = MySubscription->passwordrequired &&
!superuser_arg(MySubscription->owner);
@@ -4663,9 +5525,15 @@ ApplyWorkerMain(Datum main_arg)
options.proto.logical.twophase = false;
options.proto.logical.origin = pstrdup(MySubscription->origin);
+ options.proto.logical.require_schema = false;
+
if (!am_tablesync_worker())
{
+ if (server_version >= 160000)
+ options.proto.logical.require_schema =
+ MySubscription->minapplydelay > 0;
+
/*
* Even when the two_phase mode is requested by the user, it remains
* as the tri-state PENDING until all tablesyncs have reached READY
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index f88389de84..6718fe062b 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -286,11 +286,13 @@ parse_output_parameters(List *options, PGOutputData *data)
bool streaming_given = false;
bool two_phase_option_given = false;
bool origin_option_given = false;
+ bool require_schema_option_given = false;
data->binary = false;
data->streaming = LOGICALREP_STREAM_OFF;
data->messages = false;
data->two_phase = false;
+ data->require_schema = false;
foreach(lc, options)
{
@@ -397,6 +399,16 @@ parse_output_parameters(List *options, PGOutputData *data)
errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("unrecognized origin value: \"%s\"", data->origin));
}
+ else if (strcmp(defel->defname, "require_schema") == 0)
+ {
+ if (require_schema_option_given)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("conflicting or redundant options")));
+ require_schema_option_given = true;
+
+ data->require_schema = defGetBoolean(defel);
+ }
else
elog(ERROR, "unrecognized pgoutput option: %s", defel->defname);
}
@@ -677,7 +689,8 @@ pgoutput_rollback_prepared_txn(LogicalDecodingContext *ctx,
static void
maybe_send_schema(LogicalDecodingContext *ctx,
ReorderBufferChange *change,
- Relation relation, RelationSyncEntry *relentry)
+ Relation relation, RelationSyncEntry *relentry,
+ PGOutputData *data)
{
bool schema_sent;
TransactionId xid = InvalidTransactionId;
@@ -717,7 +730,7 @@ maybe_send_schema(LogicalDecodingContext *ctx,
schema_sent = relentry->schema_sent;
/* Nothing to do if we already sent the schema. */
- if (schema_sent)
+ if (!data->require_schema && schema_sent)
return;
/*
@@ -1520,7 +1533,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
* Schema should be sent using the original relation because it also sends
* the ancestor's relation.
*/
- maybe_send_schema(ctx, change, relation, relentry);
+ maybe_send_schema(ctx, change, relation, relentry, data);
OutputPluginPrepareWrite(ctx, true);
@@ -1605,7 +1618,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (txndata && !txndata->sent_begin_txn)
pgoutput_send_begin(ctx, txn);
- maybe_send_schema(ctx, change, relation, relentry);
+ maybe_send_schema(ctx, change, relation, relentry, data);
}
if (nrelids > 0)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 058244cd17..3ca0121d20 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4610,6 +4610,7 @@ getSubscriptions(Archive *fout)
int i_subpublications;
int i_subbinary;
int i_subpasswordrequired;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4664,11 +4665,13 @@ getSubscriptions(Archive *fout)
if (fout->remoteVersion >= 160000)
appendPQExpBufferStr(query,
" s.suborigin,\n"
- " s.subpasswordrequired\n");
+ " s.subpasswordrequired,\n"
+ " s.subminapplydelay\n");
else
appendPQExpBuffer(query,
" '%s' AS suborigin,\n"
- " 't' AS subpasswordrequired\n",
+ " 't' AS subpasswordrequired,\n"
+ " 0 AS subminapplydelay\n",
LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
@@ -4698,6 +4701,7 @@ getSubscriptions(Archive *fout)
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
i_subpasswordrequired = PQfnumber(res, "subpasswordrequired");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4730,6 +4734,8 @@ getSubscriptions(Archive *fout)
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
subinfo[i].subpasswordrequired =
pg_strdup(PQgetvalue(res, i, i_subpasswordrequired));
+ subinfo[i].subminapplydelay =
+ atoi(PQgetvalue(res, i, i_subminapplydelay));
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4814,6 +4820,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subpasswordrequired, "t") != 0)
appendPQExpBuffer(query, ", password_required = false");
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '%d ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index ed6ce41ad7..6bf889a00a 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -662,6 +662,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int subminapplydelay;
char *subpublications;
char *subpasswordrequired;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 83a37ee601..78f2426d99 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6493,7 +6493,7 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false, false};
if (pset.sversion < 100000)
{
@@ -6551,9 +6551,11 @@ describeSubscriptions(const char *pattern, bool verbose)
if (pset.sversion >= 160000)
appendPQExpBuffer(&buf,
", suborigin AS \"%s\"\n"
- ", subrunasowner AS \"%s\"\n",
+ ", subrunasowner AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
gettext_noop("Origin"),
- gettext_noop("Run as Owner?"));
+ gettext_noop("Run as Owner?"),
+ gettext_noop("Min apply delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5825b2a195..d7a9aeff86 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,9 +1925,9 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin",
- "password_required", "run_as_owner", "slot_name",
- "streaming", "synchronous_commit");
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay",
+ "origin", "password_required", "run_as_owner",
+ "slot_name", "streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
COMPLETE_WITH("lsn");
@@ -3269,9 +3269,10 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin",
- "password_required", "run_as_owner", "slot_name",
- "streaming", "synchronous_commit", "two_phase");
+ "disable_on_error", "enabled", "min_apply_delay",
+ "origin", "password_required", "run_as_owner",
+ "slot_name", "streaming", "synchronous_commit",
+ "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 91d729d62d..649e789240 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -74,6 +74,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
+ int32 subminapplydelay; /* Replication apply delay (ms) */
+
bool subenabled; /* True if the subscription is enabled (the
* worker should be running) */
@@ -127,6 +129,7 @@ typedef struct Subscription
* skipped */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
+ int32 minapplydelay; /* Replication apply delay (ms) */
bool enabled; /* Indicates if the subscription is enabled */
bool binary; /* Indicates if the subscription wants data in
* binary format */
diff --git a/src/include/replication/pgoutput.h b/src/include/replication/pgoutput.h
index b4a8015403..59d924084f 100644
--- a/src/include/replication/pgoutput.h
+++ b/src/include/replication/pgoutput.h
@@ -30,6 +30,7 @@ typedef struct PGOutputData
bool messages;
bool two_phase;
char *origin;
+ bool require_schema;
} PGOutputData;
#endif /* PGOUTPUT_H */
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index 281626fa6f..954d297401 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -187,6 +187,7 @@ typedef struct
* prepare time */
char *origin; /* Only publish data originating from the
* specified origin */
+ bool require_schema;
} logical;
} proto;
} WalRcvStreamOptions;
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 9c52890f1d..b0e1ea5a56 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -115,18 +115,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -144,10 +144,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -166,10 +166,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | f | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | f | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -178,10 +178,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | f | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | f | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -213,10 +213,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | f | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | f | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -245,19 +245,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -269,27 +269,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -304,10 +304,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -322,10 +322,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -361,10 +361,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -373,10 +373,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -386,10 +386,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -402,18 +402,18 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -463,6 +463,44 @@ ERROR: permission denied for database regression
-- ok, owning it is enough for this stuff
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+SET SESSION AUTHORIZATION regress_subscription_user;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid value for parameter "min_apply_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_apply_delay" (0 .. 2147483647)
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+ERROR: min_apply_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | f | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | f | 86400000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot set parallel streaming mode for subscription with min_apply_delay
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ERROR: cannot set min_apply_delay for subscription in parallel streaming mode
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index cc53458d91..a3f1f2cf1d 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -332,7 +332,34 @@ ALTER SUBSCRIPTION regress_testsub RENAME TO regress_testsub2;
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+SET SESSION AUTHORIZATION regress_subscription_user;
+
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+\dRs+
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
+
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
DROP ROLE regress_subscription_user3;
diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index 91aa068c95..01f2c4284d 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -515,6 +515,37 @@ $node_publisher->poll_query_until('postgres',
or die
"Timed out while waiting for apply to restart after renaming SUBSCRIPTION";
+# Test time-delayed logical replication
+#
+# If the subscription sets min_apply_delay parameter, the logical replication
+# worker will delay the transaction apply for min_apply_delay milliseconds. We
+# verify this by looking at the time difference between a) when tuples are
+# inserted on the publisher, and b) when those changes are replicated on the
+# subscriber. Even on slow machines, this strategy will give predictable behavior.
+
+# Set min_apply_delay parameter to 3 seconds
+my $delay = 3;
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_renamed SET (min_apply_delay = '${delay}s')");
+
+# Before doing the insertion, get the current timestamp that will be
+# used as a comparison base.
+my $publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_ins VALUES (generate_series(1101, 1120))");
+
+$node_subscriber->poll_query_until('postgres',
+ "SELECT count(*) = 1 FROM tab_ins WHERE a = 1120;"
+ )
+ or die
+ "failed to replicate changes";
+
+# This test is successful if and only if the LSN has been applied with at least
+# the configured apply delay.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for non-streaming transaction"
+);
+
# check all the cleanup
$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_renamed");
--
2.27.0
Dear hackers,
I rebased and refined my PoC. The changes are as follows:
* Added support for ALTER SUBSCRIPTION .. SKIP LSN. The skip operation is done when
the application starts. The user must specify the commit_lsn of the transaction to
skip it. If the apply worker hits an ERROR, it outputs the commit_lsn.
Unlike non-delayed transactions, a prepared but not yet committed transaction
cannot be skipped. This is because currently the prepare_lsn is not recorded in
the file.
* Added integrity checks. When the debug build is enabled, each message written to
the files carries a CRC checksum. When the apply worker reads a message, it
verifies the checksum and raises PANIC if the comparison fails. I'm not sure
the performance degradation is acceptable, so I added it only when
USE_ASSERT_CHECKING is on.
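
To make the integrity check described above concrete, here is a minimal sketch of CRC-framed records in plain C. It is not the patch's code: the record layout (length, payload, checksum), the bitwise CRC-32 routine, and all `sketch_*` names are assumptions made for illustration, and where the patch raises PANIC this sketch only returns 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320). This is a stand-in for
 * whatever checksum the patch actually uses. */
uint32_t
sketch_crc32(const unsigned char *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++)
    {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Hypothetical record layout: [uint32 len][payload][uint32 crc].
 * Returns the number of bytes written, or 0 if the buffer is too small. */
size_t
sketch_frame_write(unsigned char *out, size_t outcap,
                   const void *payload, uint32_t len)
{
    size_t   need = sizeof(uint32_t) + (size_t) len + sizeof(uint32_t);
    uint32_t crc;

    assert(out != NULL && (payload != NULL || len == 0));
    if (outcap < need)
        return 0;

    crc = sketch_crc32((const unsigned char *) payload, len);
    memcpy(out, &len, sizeof(len));
    if (len > 0)
        memcpy(out + sizeof(len), payload, len);
    memcpy(out + sizeof(len) + len, &crc, sizeof(crc));
    return need;
}

/* Returns 1 and sets *payload and *len if the record is intact; returns 0
 * if it is truncated or the checksum does not match. */
int
sketch_frame_verify(const unsigned char *in, size_t inlen,
                    const unsigned char **payload, uint32_t *len)
{
    uint32_t stored_len;
    uint32_t stored_crc;

    if (inlen < 2 * sizeof(uint32_t))
        return 0;
    memcpy(&stored_len, in, sizeof(stored_len));
    if ((size_t) stored_len > inlen - 2 * sizeof(uint32_t))
        return 0;
    memcpy(&stored_crc, in + sizeof(stored_len) + stored_len,
           sizeof(stored_crc));
    if (sketch_crc32(in + sizeof(stored_len), stored_len) != stored_crc)
        return 0;

    *payload = in + sizeof(stored_len);
    *len = stored_len;
    return 1;
}
```

In a real build the verify step would presumably sit behind `#ifdef USE_ASSERT_CHECKING`, as described above, so that production builds do not pay the checksum cost.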
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Attachments:
v5-0001-WIP-Time-delayed-logical-replication-by-serializi.patch (application/octet-stream)
From f90c4537719dd95ef2f18aa3d0e8b88fdf977f64 Mon Sep 17 00:00:00 2001
From: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Date: Wed, 19 Apr 2023 09:26:12 +0000
Subject: [PATCH v5] (WIP) Time-delayed logical replication by serializing
changes
Similar to physical replication, a time-delayed copy of the data for
logical replication is useful for some scenarios (particularly to fix
errors that might cause data loss).
This patch implements a new subscription parameter called 'min_apply_delay'.
If the subscription sets min_apply_delay parameter, the logical
replication worker will delay the transaction apply for min_apply_delay
milliseconds.
The delay is implemented by serializing changes into a file. The file is
created when the worker receives a BEGIN message. The worker writes the received
changes and flushes them at COMMIT. On every iteration of the main loop, the
commit time of each delayed transaction is checked, and the transaction is
applied from the file once more than min_apply_delay has elapsed. The commit
time is stored in memory when the transaction is committed or when the worker
restarts.
The delay is calculated between the WAL time stamp and the current time
on the subscriber.
The combination of parallel streaming mode and min_apply_delay is not
allowed. This is because in parallel streaming mode, we start applying
the transaction stream as soon as the first change arrives without
knowing the transaction's prepare/commit time. This means we cannot
calculate the underlying network/decoding lag between publisher and
subscriber, and so always waiting for the full 'min_apply_delay' period
might include unnecessary delay.
The other possibility was to apply the delay at the end of the parallel
apply transaction but that would cause issues related to resource
bloat and locks being held for a long time.
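
As an illustration only, the delay rule described above (publisher-side commit timestamp versus the subscriber's current time) can be sketched as a small C helper. The function name and the use of plain millisecond integers are assumptions for this sketch; the patch itself works with PostgreSQL timestamp types inside worker.c.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: how many milliseconds the apply worker still has to
 * wait before a transaction may be applied.
 *
 * commit_time_ms     - commit timestamp taken from the publisher's WAL
 * now_ms             - current time on the subscriber
 * min_apply_delay_ms - the subscription's min_apply_delay (0 = no delay)
 *
 * Time already spent in decoding and transfer counts toward the delay, so
 * the result shrinks as now_ms advances and never goes below zero.
 */
int64_t
sketch_remaining_delay_ms(int64_t commit_time_ms, int64_t now_ms,
                          int32_t min_apply_delay_ms)
{
    int64_t ready_at;

    assert(min_apply_delay_ms >= 0);
    if (min_apply_delay_ms == 0)
        return 0;               /* delay disabled: apply right away */

    ready_at = commit_time_ms + min_apply_delay_ms;
    return (now_ms >= ready_at) ? 0 : ready_at - now_ms;
}
```

Because the two timestamps come from different machines, clock skew between publisher and subscriber shifts the result directly, which is why the documentation in this patch notes that changes may be applied earlier than expected when the clocks are not synchronized.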
Earlier versions were written by Euler Taveira, Takamichi Osumi, and Kuroda Hayato
Author: Kuroda Hayato, Takamichi Osumi
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/glossary.sgml | 15 +
doc/src/sgml/logical-replication.sgml | 7 +
doc/src/sgml/ref/alter_subscription.sgml | 12 +-
doc/src/sgml/ref/create_subscription.sgml | 56 +-
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 6 +-
src/backend/commands/subscriptioncmds.c | 123 +-
.../libpqwalreceiver/libpqwalreceiver.c | 4 +
src/backend/replication/logical/worker.c | 1154 +++++++++++++++--
src/backend/replication/pgoutput/pgoutput.c | 21 +-
src/bin/pg_dump/pg_dump.c | 13 +-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/psql/describe.c | 9 +-
src/bin/psql/tab-complete.c | 13 +-
src/include/catalog/pg_subscription.h | 3 +
src/include/replication/pgoutput.h | 1 +
src/include/replication/walreceiver.h | 1 +
src/test/regress/expected/subscription.out | 190 +--
src/test/regress/sql/subscription.sql | 27 +
src/test/subscription/t/001_rep_changes.pl | 31 +
src/tools/pgindent/typedefs.list | 1 +
22 files changed, 1500 insertions(+), 198 deletions(-)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 5240840552..35a7b6a9e8 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7891,6 +7891,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subminapplydelay</structfield> <type>int4</type>
+ </para>
+ <para>
+ The minimum delay, in milliseconds, for applying changes
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subname</structfield> <type>name</type>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 29bf1873bd..204fe7f3ae 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1757,6 +1757,21 @@
</glossdef>
</glossentry>
+ <glossentry id="glossary-time-delayed-replication">
+ <glossterm>Time-delayed replication</glossterm>
+ <glossdef>
+ <para>
+ Replication setup that delays the application of changes by a specified
+ minimum time-delay period.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="guc-recovery-min-apply-delay"/> for physical replication
+ and <xref linkend="sql-createsubscription"/> for logical replication.
+ </para>
+ </glossdef>
+ </glossentry>
+
<glossentry id="glossary-toast">
<glossterm>TOAST</glossterm>
<glossdef>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index c65f4aabfd..0be4d652aa 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -257,6 +257,13 @@
option of <command>CREATE SUBSCRIPTION</command> for details.
</para>
+ <para>
+ A subscription can delay the application of changes by specifying the
+ <link linkend="sql-createsubscription-with-min-apply-delay"><literal>min_apply_delay</literal></link>
+ subscription parameter. See <xref linkend="sql-createsubscription"/> for
+ details.
+ </para>
+
<sect2 id="logical-replication-subscription-slot">
<title>Replication Slot Management</title>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index a85e04e4d6..bdca13004a 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -225,8 +225,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<link linkend="sql-createsubscription-with-streaming"><literal>streaming</literal></link>,
<link linkend="sql-createsubscription-with-disable-on-error"><literal>disable_on_error</literal></link>,
<link linkend="sql-createsubscription-with-password-required"><literal>password_required</literal></link>,
- <link linkend="sql-createsubscription-with-run-as-owner"><literal>run_as_owner</literal></link>, and
- <link linkend="sql-createsubscription-with-origin"><literal>origin</literal></link>.
+ <link linkend="sql-createsubscription-with-run-as-owner"><literal>run_as_owner</literal></link>,
+ <link linkend="sql-createsubscription-with-origin"><literal>origin</literal></link>, and
+ <link linkend="sql-createsubscription-with-min-apply-delay"><literal>min_apply_delay</literal></link>.
Only a superuser can set <literal>password_required = false</literal>.
</para>
</listitem>
@@ -263,8 +264,11 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
Specifies the finish LSN of the remote transaction whose changes
are to be skipped by the logical replication worker. The finish LSN
is the LSN at which the transaction is either committed or prepared.
- Skipping individual subtransactions is not supported. Setting
- <literal>NONE</literal> resets the LSN.
+ Note that if time-delayed logical replication is enabled, the LSN
+ of a prepare cannot be specified. In that case, the LSN of the
+ corresponding commit or commit prepared must be
+ specified. Skipping individual subtransactions is not supported.
+ Setting <literal>NONE</literal> resets the LSN.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 71652fd918..c5c7e9990a 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -399,7 +399,56 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
- </variablelist></para>
+
+ <varlistentry id="sql-createsubscription-with-min-apply-delay">
+ <term><literal>min_apply_delay</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ By default, the subscriber applies changes as soon as possible. This
+ parameter allows the user to delay the application of changes by a
+ given time period. This is done by first writing all the changes to a
+ file and then applying its contents once the delay has elapsed. If the value is
+ specified without units, it is taken as milliseconds. The default
+ is zero (no delay). See <xref linkend="config-setting-names-values"/>
+ for details on the available valid time units.
+ </para>
+ <para>
+ Any delay becomes effective only after all initial table
+ synchronization has finished and occurs before each transaction starts
+ to get applied on the subscriber. The delay is calculated as the
+ difference between the WAL timestamp as written on the publisher and
+ the current time on the subscriber. Any overhead of time spent in
+ logical decoding and in transferring the transaction may reduce the
+ actual wait time. Even if the overhead already exceeds the requested
+ <literal>min_apply_delay</literal> value, all the changes are still written
+ to the file and then applied immediately. If the system clocks on publisher
+ and subscriber are not synchronized, this may lead to changes being applied
+ earlier than expected, but this is not a major issue because this
+ parameter is typically much larger than the time deviations between
+ servers.
+ </para>
+ <para>
+ This parameter can be used with the <literal>two_phase</literal>
+ parameter. However, when the delay is enabled and a prepared
+ transaction is sent from the publisher, the transaction is not
+ prepared on the subscriber node and is instead written into an
+ intermediate file. Once the <command>COMMIT PREPARED</command> is
+ received and more time than the <literal>min_apply_delay</literal>
+ has elapsed, the file will be loaded and applied.
+ </para>
+ <warning>
+ <para>
+ Delaying the replication means there is a much longer time between
+ making a change on the publisher, and that change being committed
+ on the subscriber. This can impact the performance of synchronous
+ replication. See <xref linkend="guc-synchronous-commit"/>
+ parameter.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
</listitem>
</varlistentry>
@@ -472,6 +521,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
published with different column lists are not supported.
</para>
+ <para>
+ A non-zero <literal>min_apply_delay</literal> parameter is not allowed when
+ streaming in parallel mode.
+ </para>
+
<para>
We allow non-existent publications to be specified so that users can add
those later. This means
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index d07f88ce28..56f8fdda10 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -66,6 +66,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->skiplsn = subform->subskiplsn;
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
+ sub->minapplydelay = subform->subminapplydelay;
sub->enabled = subform->subenabled;
sub->binary = subform->subbinary;
sub->stream = subform->substream;
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 48aacf66ee..4f6bb247e9 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1324,9 +1324,9 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
- subbinary, substream, subtwophasestate, subdisableonerr,
- subpasswordrequired, subrunasowner,
+GRANT SELECT (oid, subdbid, subskiplsn, subminapplydelay, subname, subowner,
+ subenabled, subbinary, substream, subtwophasestate,
+ subdisableonerr, subpasswordrequired, subrunasowner,
subslotname, subsynccommit, subpublications, suborigin)
ON pg_subscription TO public;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 56eafbff10..9c9a0d5a67 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -71,6 +71,7 @@
#define SUBOPT_RUN_AS_OWNER 0x00001000
#define SUBOPT_LSN 0x00002000
#define SUBOPT_ORIGIN 0x00004000
+#define SUBOPT_MIN_APPLY_DELAY 0x00008000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -97,6 +98,7 @@ typedef struct SubOpts
bool runasowner;
char *origin;
XLogRecPtr lsn;
+ int32 min_apply_delay;
} SubOpts;
static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
@@ -107,7 +109,7 @@ static void check_publications_origin(WalReceiverConn *wrconn,
static void check_duplicates_in_publist(List *publist, Datum *datums);
static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
-
+static int32 defGetMinApplyDelay(DefElem *def);
/*
* Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -157,6 +159,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->runasowner = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY))
+ opts->min_apply_delay = 0;
/* Parse options */
foreach(lc, stmt_options)
@@ -353,6 +357,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_LSN;
opts->lsn = lsn;
}
+ else if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ strcmp(defel->defname, "min_apply_delay") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_MIN_APPLY_DELAY;
+ opts->min_apply_delay = defGetMinApplyDelay(defel);
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -433,6 +446,32 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed. This is because in parallel streaming mode, we start applying
+ * the transaction stream as soon as the first change arrives without
+ * knowing the transaction's prepare/commit time. This means we cannot
+ * calculate the underlying network/decoding lag between publisher and
+ * subscriber, and so always waiting for the full 'min_apply_delay' period
+ * might include unnecessary delay.
+ *
+ * The other possibility was to apply the delay at the end of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0 &&
+ opts->streaming == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+
+ /*
+ * translator: the first %s is a string of the form "parameter > 0"
+ * and the second one is "option = value".
+ */
+ errmsg("%s and %s are mutually exclusive options",
+ "min_apply_delay > 0", "streaming = parallel"));
}
/*
@@ -591,7 +630,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
SUBOPT_DISABLE_ON_ERR | SUBOPT_PASSWORD_REQUIRED |
- SUBOPT_RUN_AS_OWNER | SUBOPT_ORIGIN);
+ SUBOPT_RUN_AS_OWNER | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -682,6 +722,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_oid - 1] = ObjectIdGetDatum(subid);
values[Anum_pg_subscription_subdbid - 1] = ObjectIdGetDatum(MyDatabaseId);
values[Anum_pg_subscription_subskiplsn - 1] = LSNGetDatum(InvalidXLogRecPtr);
+ values[Anum_pg_subscription_subminapplydelay - 1] = Int32GetDatum(opts.min_apply_delay);
values[Anum_pg_subscription_subname - 1] =
DirectFunctionCall1(namein, CStringGetDatum(stmt->subname));
values[Anum_pg_subscription_subowner - 1] = ObjectIdGetDatum(owner);
@@ -1130,7 +1171,8 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
SUBOPT_PASSWORD_REQUIRED |
- SUBOPT_RUN_AS_OWNER | SUBOPT_ORIGIN);
+ SUBOPT_RUN_AS_OWNER | SUBOPT_ORIGIN |
+ SUBOPT_MIN_APPLY_DELAY);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1174,6 +1216,19 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
if (IsSet(opts.specified_opts, SUBOPT_STREAMING))
{
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
+ !IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY)
+ && sub->minapplydelay > 0)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set parallel streaming mode for subscription with %s",
+ "min_apply_delay"));
+
values[Anum_pg_subscription_substream - 1] =
CharGetDatum(opts.streaming);
replaces[Anum_pg_subscription_substream - 1] = true;
@@ -1202,6 +1257,26 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
= true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ {
+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed. See
+ * parse_subscription_options.
+ */
+ if (opts.min_apply_delay > 0 &&
+ !IsSet(opts.specified_opts, SUBOPT_STREAMING)
+ && sub->stream == LOGICALREP_STREAM_PARALLEL)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set %s for subscription in parallel streaming mode",
+ "min_apply_delay"));
+
+ values[Anum_pg_subscription_subminapplydelay - 1] =
+ Int32GetDatum(opts.min_apply_delay);
+ replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -2343,3 +2418,45 @@ defGetStreamingMode(DefElem *def)
def->defname)));
return LOGICALREP_STREAM_OFF; /* keep compiler quiet */
}
+
+/*
+ * Extract the min_apply_delay value from a DefElem. This is very similar to
+ * parse_and_validate_value() for integer values, because min_apply_delay
+ * accepts the same parameter format as recovery_min_apply_delay.
+ */
+static int32
+defGetMinApplyDelay(DefElem *def)
+{
+ char *input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse the given string as a parameter with millisecond units.
+ */
+ if (!parse_int(input_string, &result, GUC_UNIT_MS, &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_apply_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check the lower boundary of the valid min_apply_delay range, and the
+ * upper boundary as a safeguard for platforms where int is wider than
+ * int32. Although parse_int() has confirmed that the result is less than
+ * or equal to INT_MAX, the value will be stored in an int32 catalog
+ * column.
+ */
+ if (result < 0 || result > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_apply_delay",
+ 0, PG_INT32_MAX)));
+
+ return result;
+}
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 052505e46f..0fb073c2c1 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -470,6 +470,10 @@ libpqrcv_startstreaming(WalReceiverConn *conn,
appendStringInfo(&cmd, ", origin '%s'",
options->proto.logical.origin);
+ if (options->proto.logical.require_schema &&
+ PQserverVersion(conn->streamConn) >= 160000)
+ appendStringInfo(&cmd, ", require_schema 'on'");
+
pubnames = options->proto.logical.publication_names;
pubnames_str = stringlist_to_identifierstr(conn->streamConn, pubnames);
if (!pubnames_str)
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index dbf88c9553..1a8f483b45 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -153,6 +153,7 @@
#include "catalog/pg_subscription.h"
#include "catalog/pg_subscription_rel.h"
#include "catalog/pg_tablespace.h"
+#include "common/file_utils.h"
#include "commands/tablecmds.h"
#include "commands/tablespace.h"
#include "commands/trigger.h"
@@ -168,6 +169,7 @@
#include "optimizer/optimizer.h"
#include "parser/parse_relation.h"
#include "pgstat.h"
+#include "port/pg_crc32c.h"
#include "postmaster/bgworker.h"
#include "postmaster/interrupt.h"
#include "postmaster/postmaster.h"
@@ -370,6 +372,86 @@ typedef struct ApplySubXactData
static ApplySubXactData subxact_data = {0, 0, InvalidTransactionId, NULL};
+/* XXX macros for time-delayed logical replication */
+
+/* DELAYED_DIR stores files that contain changes of delayed transactions. */
+#define DELAYED_DIR "pg_logical/delayed_txns"
+
+/*
+ * The filename consists of the following, dash separated, components:
+ * 1) subscription oid
+ * 2) xid of delayed transaction on publisher
+ * 3) status of the delaying transaction
+ * 4) upper 32bit of the commit_lsn
+ * 5) lower 32bit of the commit_lsn
+ * 6) upper 32bit of the end_lsn
+ * 7) lower 32bit of the end_lsn
+ * 8) committime
+ */
+#define DELAYED_FORMAT "delayed-%x-%x-%c-%X-%X-%X-%X-" INT64_FORMAT
+#define DELAYED_TXN_COMMITTED 'c'
+#define DELAYED_TXN_PREPARED 'p'
+#define DELAYED_TXN_UNKNOWN 'u'
+
+#define DELAY_MAGIC ((uint32) 0xEE816C) /* format identifier */
+
+
+/* List entry to map xid and commit time */
+typedef struct DelayedTxnListEntry
+{
+ TransactionId xid;
+ LogicalRepCommitData commit_data;
+} DelayedTxnListEntry;
+
+/*
+ * Replication message on-disk data structure
+ */
+typedef struct ReplicationMessageOnDisk
+{
+ /* Data not covered by checksum */
+ uint32 magic;
+ pg_crc32c checksum;
+
+ /* Data covered by checksum */
+ int length; /* length of actual message, action is not
+ * included */
+ char action;
+
+ /* Actual message follows, it is also covered by checksum */
+} ReplicationMessageOnDisk;
+
+/* Size of the part not covered by the checksum */
+#define ReplicationMessageOnDiskNotChecksummedSize \
+ offsetof(ReplicationMessageOnDisk, length)
+/* Size of the part covered by the checksum */
+#define ReplicationMessageOnDiskChecksummedSize \
+ (sizeof(ReplicationMessageOnDisk) - ReplicationMessageOnDiskNotChecksummedSize)
+
+/*
+ * An entry is appended when we receive a commit message and time-delayed
+ * logical replication is requested. The entry is deleted after its contents
+ * are applied.
+ */
+static List *DelayedTxnList = NIL;
+
+/* fields valid only when time-delayed logical replication is requested */
+static bool in_delayed_transaction = false;
+
+static TransactionId delayed_xid = InvalidTransactionId;
+
+/*
+ * Store flushed lsn for time-delayed logical replication. This is used when
+ * we send a feedback message to the publisher.
+ */
+static XLogRecPtr last_flushed = InvalidXLogRecPtr;
+
+/*
+ * FIXME: a single global file descriptor may not be sufficient. Non-streaming
+ * transactions could arrive concurrently, in which case create_delay_file()
+ * for the second transaction would fail...
+ */
+static int delayed_fd = -1;
+
static inline void subxact_filename(char *path, Oid subid, TransactionId xid);
static inline void changes_filename(char *path, Oid subid, TransactionId xid);
@@ -432,6 +514,599 @@ static inline void reset_apply_error_context_info(void);
static TransApplyAction get_transaction_apply_action(TransactionId xid,
ParallelApplyWorkerInfo **winfo);
+static void begin_replication_step(void);
+static void end_replication_step(void);
+
+/* Functions for time-delayed logical replication */
+static void cache_commit_data(LogicalRepCommitData *commit_data, TransactionId xid);
+static void flush_delayed_changes(LogicalRepCommitData *commit_data);
+static void delay_file_name(char *path, Oid subid, TransactionId xid,
+ char status, XLogRecPtr commit_lsn,
+ XLogRecPtr end_lsn, TimestampTz committime);
+static bool is_given_transaction_delayed(Oid subid, TransactionId xid);
+static void create_delay_file(TransactionId xid);
+static bool handle_delayed_transaction(char action, StringInfo s);
+
+/*
+ * Cache commit_data into the list
+ */
+static void
+cache_commit_data(LogicalRepCommitData *commit_data, TransactionId xid)
+{
+ MemoryContext old;
+ DelayedTxnListEntry *entry;
+
+ old = MemoryContextSwitchTo(ApplyContext);
+
+ entry = palloc0(sizeof(DelayedTxnListEntry));
+
+ /* Construct an entry and append it */
+ entry->xid = xid;
+ memcpy(&entry->commit_data, commit_data, sizeof(LogicalRepCommitData));
+ DelayedTxnList = lappend(DelayedTxnList, entry);
+
+ MemoryContextSwitchTo(old);
+
+ elog(DEBUG1, "transaction %u is cached", xid);
+
+}
+
+/*
+ * Flush given changes, rename and close the file. This will be called at the
+ * end of the transaction.
+ */
+static void
+flush_delayed_changes(LogicalRepCommitData *commit_data)
+{
+ char old_path[MAXPGPATH];
+ char new_path[MAXPGPATH];
+
+ Assert(delayed_fd > 0);
+ Assert(TransactionIdIsValid(delayed_xid));
+
+ /* Cache given commit_data into the list */
+ cache_commit_data(commit_data, delayed_xid);
+
+ /*
+ * Close file. No need to flush here because it will be done in
+ * durable_rename().
+ */
+ CloseTransientFile(delayed_fd);
+
+ /* Construct old/new filename */
+ delay_file_name(old_path, MyLogicalRepWorker->subid, delayed_xid,
+ DELAYED_TXN_UNKNOWN, InvalidXLogRecPtr, InvalidXLogRecPtr,
+ 0);
+ delay_file_name(new_path, MyLogicalRepWorker->subid, delayed_xid,
+ DELAYED_TXN_COMMITTED, commit_data->commit_lsn,
+ commit_data->end_lsn, commit_data->committime);
+
+ /* And do actual rename */
+ if (durable_rename(old_path, new_path, PANIC))
+ abort();
+
+ /* Store flushed lsn */
+ last_flushed = commit_data->end_lsn;
+
+ /* Cleanup */
+ delayed_fd = -1;
+ delayed_xid = InvalidTransactionId;
+ in_delayed_transaction = false;
+}
+
+/*
+ * Build the delay file name from its components
+ */
+static void
+delay_file_name(char *path, Oid subid, TransactionId xid, char status,
+ XLogRecPtr commit_lsn, XLogRecPtr end_lsn,
+ TimestampTz committime)
+{
+ snprintf(path, MAXPGPATH, DELAYED_DIR "/" DELAYED_FORMAT, subid, xid,
+ status, LSN_FORMAT_ARGS(commit_lsn), LSN_FORMAT_ARGS(end_lsn),
+ committime);
+}
+
+/*
+ * Check whether the given transaction is delayed. This is done by checking the
+ * delay file.
+ */
+static bool
+is_given_transaction_delayed(Oid subid, TransactionId xid)
+{
+ struct stat st;
+ char path[MAXPGPATH];
+
+ delay_file_name(path, subid, xid, DELAYED_TXN_PREPARED, InvalidXLogRecPtr,
+ InvalidXLogRecPtr, 0);
+
+ return stat(path, &st) == 0;
+}
+
+/*
+ * Apply the delayed transaction. This function opens the delay file, reads
+ * it, and has the apply worker apply the changes written there.
+ */
+static void
+apply_delayed_transaction(TransactionId xid, LogicalRepCommitData *commit_data)
+{
+ StringInfoData s2;
+ int nchanges;
+ char path[MAXPGPATH];
+ char *buffer = NULL;
+ MemoryContext oldcxt;
+ ResourceOwner oldowner;
+
+ /* Make sure we have an open transaction */
+ begin_replication_step();
+
+ /*
+ * Allocate file handle and memory required to process all the messages in
+ * TopTransactionContext to avoid them getting reset after each message is
+ * processed.
+ */
+ oldcxt = MemoryContextSwitchTo(TopTransactionContext);
+
+ /* Open the spool file for the committed transaction */
+ delay_file_name(path, MyLogicalRepWorker->subid, xid,
+ DELAYED_TXN_COMMITTED, commit_data->commit_lsn,
+ commit_data->end_lsn, commit_data->committime);
+ elog(DEBUG1, "replaying changes from file \"%s\"", path);
+
+ /*
+ * Make sure the file is owned by the toplevel transaction so that the
+ * file will not be accidentally closed when aborting a subtransaction.
+ */
+ oldowner = CurrentResourceOwner;
+ CurrentResourceOwner = TopTransactionResourceOwner;
+
+ /* Open the specified file */
+ delayed_fd = OpenTransientFile(path, O_RDONLY | PG_BINARY);
+
+ Assert(delayed_fd > 0);
+
+ CurrentResourceOwner = oldowner;
+
+ buffer = palloc(BLCKSZ);
+ initStringInfo(&s2);
+
+ MemoryContextSwitchTo(oldcxt);
+
+ set_apply_error_context_xact(xid, commit_data->commit_lsn);
+
+ remote_final_lsn = commit_data->end_lsn;
+
+ maybe_start_skipping_changes(commit_data->commit_lsn);
+
+ /*
+ * Make sure the handle apply_dispatch methods are aware we're in a remote
+ * transaction.
+ */
+ in_remote_transaction = true;
+ pgstat_report_activity(STATE_RUNNING, NULL);
+
+ end_replication_step();
+
+ /*
+ * Read the entries one by one and pass them through the same logic as in
+ * apply_dispatch.
+ */
+ nchanges = 0;
+ while (true)
+ {
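+ ReplicationMessageOnDisk ondisk;
+ int nread;
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* read the on-disk record; a zero-length read means end of file */
+ nread = read(delayed_fd, &ondisk, sizeof(ondisk));
+ if (nread == 0)
+ break;
+ if (nread != (int) sizeof(ondisk))
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not read from file \"%s\": %m", path));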
+
+ /* verify magic */
+ if (ondisk.magic != DELAY_MAGIC)
+ ereport(PANIC,
+ errcode(ERRCODE_DATA_CORRUPTED),
+ errmsg("delayed file \"%s\" has wrong magic number: %u instead of %u",
+ path, ondisk.magic, DELAY_MAGIC));
+
+ buffer = repalloc(buffer, ondisk.length);
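+ if (read(delayed_fd, buffer, ondisk.length) != ondisk.length)
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not read from file \"%s\": %m", path));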
+
+ /* Verify the checksum if required */
+#ifdef USE_ASSERT_CHECKING
+ {
+ pg_crc32c checksum;
+
+ INIT_CRC32C(checksum);
+ COMP_CRC32C(checksum,
+ (char *) &ondisk + ReplicationMessageOnDiskNotChecksummedSize,
+ ReplicationMessageOnDiskChecksummedSize);
+ COMP_CRC32C(checksum, buffer, ondisk.length);
+ FIN_CRC32C(checksum);
+
+ if (!EQ_CRC32C(checksum, ondisk.checksum))
+ ereport(PANIC,
+ (errmsg("checksum mismatch for delayed transaction file \"%s\": is %u, should be %u",
+ path, checksum, ondisk.checksum)));
+ }
+#endif
+
+ /* copy the buffer to the stringinfo and call apply_dispatch */
+ resetStringInfo(&s2);
+ appendStringInfoChar(&s2, ondisk.action);
+ appendBinaryStringInfo(&s2, buffer, ondisk.length);
+
+ apply_dispatch(&s2);
+
+ MemoryContextReset(ApplyMessageContext);
+
+ MemoryContextSwitchTo(oldcxt);
+
+ nchanges++;
+
+ if (nchanges % 1000 == 0)
+ elog(DEBUG1, "replayed %d changes from file \"%s\"",
+ nchanges, path);
+ }
+
+ CloseTransientFile(delayed_fd);
+ delayed_fd = -1;
+
+ apply_handle_commit_internal(commit_data);
+
+ durable_unlink(path, LOG);
+
+ elog(DEBUG1, "replayed %d (all) changes from file \"%s\"",
+ nchanges, path);
+
+ return;
+}
+
+/*
+ * Create a file into which changes will be written.
+ */
+static void
+create_delay_file(TransactionId xid)
+{
+ char path[MAXPGPATH];
+ int fd;
+
+ Assert(TransactionIdIsValid(xid));
+ Assert(delayed_fd < 0);
+
+ /*
+ * Construct the filename. Other information, such as commit_lsn, will be
+ * filled in when the transaction is committed.
+ */
+ delay_file_name(path, MyLogicalRepWorker->subid, xid, DELAYED_TXN_UNKNOWN,
+ InvalidXLogRecPtr, InvalidXLogRecPtr, 0);
+
+ elog(DEBUG1, "creating a file \"%s\" for time-delayed logical replication",
+ path);
+
+ fd = OpenTransientFile(path, O_WRONLY | O_CREAT | O_EXCL | O_APPEND | PG_BINARY);
+
+ if (fd < 0)
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not create file \"%s\": %m",
+ path));
+
+ delayed_fd = fd;
+}
+
+/*
+ * Create a directory that holds delayed files
+ */
+static void
+initialize_delay_directory(void)
+{
+ char path[MAXPGPATH];
+
+ snprintf(path, MAXPGPATH, DELAYED_DIR);
+ if (MakePGDirectory(path) < 0 && errno != EEXIST)
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not create directory \"%s\": %m",
+ path));
+
+ START_CRIT_SECTION();
+ fsync_fname(path, true);
+ END_CRIT_SECTION();
+}
+
+/*
+ * Transform information from commit_prepared style to commit style.
+ */
+static void
+ConstructCommitFromCommitPrepared(LogicalRepCommitData *commit,
+ LogicalRepCommitPreparedTxnData *prepare_data)
+{
+ commit->commit_lsn = prepare_data->commit_lsn;
+ commit->committime = prepare_data->commit_time;
+ commit->end_lsn = prepare_data->end_lsn;
+}
+
+/*
+ * Restore the delayed transaction from given information.
+ *
+ * This returns false only when the status is unknown, which means that the
+ * worker was shut down before receiving the COMMIT/PREPARE/COMMIT PREPARED
+ * message. In this case we must receive all the messages again and write
+ * them into a new file.
+ */
+static bool
+RestoreDelayedTxn(char status, XLogRecPtr commit_lsn, XLogRecPtr end_lsn,
+ TimestampTz committime, TransactionId xid)
+
+{
+ switch (status)
+ {
+ case DELAYED_TXN_UNKNOWN:
+ return false;
+
+ case DELAYED_TXN_COMMITTED:
+ {
+ LogicalRepCommitData commit_data = {
+ .commit_lsn = commit_lsn,
+ .committime = committime,
+ .end_lsn = end_lsn,
+ };
+
+ cache_commit_data(&commit_data, xid);
+ break;
+ }
+
+ case DELAYED_TXN_PREPARED:
+ /* Do nothing */
+ break;
+
+ default:
+ Assert(false);
+ return false; /* Keep compiler quiet */
+ }
+
+ /* Update last_flushed to avoid receiving the same transaction again */
+ last_flushed = end_lsn;
+
+ return true;
+}
+
+/*
+ * list_sort() comparator for sorting DelayedTxnList in committime order.
+ */
+static int
+file_sort_by_committime(const ListCell *a_p, const ListCell *b_p)
+{
+ DelayedTxnListEntry *a = (DelayedTxnListEntry *) lfirst(a_p);
+ DelayedTxnListEntry *b = (DelayedTxnListEntry *) lfirst(b_p);
+
+ if (a->commit_data.committime < b->commit_data.committime)
+ return -1;
+ else if (a->commit_data.committime > b->commit_data.committime)
+ return 1;
+ return 0;
+}
+
+
+/*
+ * Restore all the delayed transactions to memory.
+ */
+static void
+RestoreDelayedTxns(void)
+{
+ DIR *delayed_dir;
+ struct dirent *delayed_de;
+
+ /* Read the files one by one */
+ delayed_dir = AllocateDir(DELAYED_DIR);
+ while ((delayed_de = ReadDir(delayed_dir, DELAYED_DIR)) != NULL)
+ {
+ Oid subid = InvalidOid;
+ TransactionId xid = InvalidTransactionId;
+ char status = 0;
+ XLogRecPtr commit_lsn = InvalidXLogRecPtr,
+ end_lsn = InvalidXLogRecPtr;
+ TimestampTz committime = 0;
+ uint32 commit_hi = 0,
+ commit_lo = 0,
+ end_hi = 0,
+ end_lo = 0;
+
+ if (strcmp(delayed_de->d_name, ".") == 0 ||
+ strcmp(delayed_de->d_name, "..") == 0)
+ continue;
+
+ /* Ignore files that aren't ours */
+ if (strncmp(delayed_de->d_name, "delayed-", 8) != 0)
+ continue;
+
+ /* Parse filename */
+ if (sscanf(delayed_de->d_name, DELAYED_FORMAT, &subid, &xid, &status, &commit_hi,
+ &commit_lo, &end_hi, &end_lo, &committime) != 8)
+ elog(ERROR, "could not parse filename \"%s\"", delayed_de->d_name);
+
+ /* Skip if the file has been generated by other subscriptions */
+ if (MyLogicalRepWorker->subid != subid)
+ continue;
+
+ elog(DEBUG1, "start to restore from %s", delayed_de->d_name);
+
+ commit_lsn = ((uint64) commit_hi) << 32 | commit_lo;
+ end_lsn = ((uint64) end_hi) << 32 | end_lo;
+
+ /*
+ * Do the actual restore here. If the server was shut down while
+ * receiving a transaction, the status is UNKNOWN and
+ * RestoreDelayedTxn() returns false. In that case we must remove the
+ * file and receive the changes again.
+ */
+ if (!RestoreDelayedTxn(status, commit_lsn, end_lsn, committime, xid))
+ {
+ char path[MAXPGPATH];
+
+ snprintf(path, MAXPGPATH, DELAYED_DIR "/%s", delayed_de->d_name);
+ durable_unlink(path, LOG);
+ }
+ }
+ FreeDir(delayed_dir);
+
+ list_sort(DelayedTxnList, file_sort_by_committime);
+}
+
+/*
+ * Restore delayed transactions, or initialize the directory
+ */
+static void
+InitializeDelayedTxn(void)
+{
+ struct stat st;
+ char path[MAXPGPATH];
+
+ snprintf(path, MAXPGPATH, DELAYED_DIR);
+
+ /*
+ * If the given directory does not exist, create one. Otherwise start to
+ * restore.
+ */
+ if (stat(path, &st) != 0)
+ {
+ initialize_delay_directory();
+ return;
+ }
+
+ RestoreDelayedTxns();
+}
+
+/*
+ * Write a given message to a file. This is called for every message.
+ * This returns true only when changes are written into the file.
+ *
+ * The format of the serialized changes is the same as the streamed one: a
+ * length (not including the length field itself), an action code
+ * (identifying the message type), and the message contents (without the
+ * subxact TransactionId value).
+ */
+static bool
+handle_delayed_transaction(char action, StringInfo s)
+{
+ ReplicationMessageOnDisk ondisk;
+ pg_crc32c checksum = 0;
+
+ /* Return if we are not delaying the current transaction */
+ if (!in_delayed_transaction)
+ return false;
+
+ Assert(delayed_fd > 0);
+ Assert(TransactionIdIsValid(delayed_xid));
+
+ ondisk.magic = DELAY_MAGIC;
+ ondisk.length = (s->len - s->cursor);
+ ondisk.action = action;
+
+ /* Calculate CRC if required */
+#ifdef USE_ASSERT_CHECKING
+ INIT_CRC32C(checksum);
+ COMP_CRC32C(checksum,
+ (char *) &ondisk + ReplicationMessageOnDiskNotChecksummedSize,
+ ReplicationMessageOnDiskChecksummedSize);
+ COMP_CRC32C(checksum, &s->data[s->cursor], ondisk.length);
+ FIN_CRC32C(checksum);
+#endif
+
+ ondisk.checksum = checksum;
+
+ /* Write header part */
+ if (write(delayed_fd, &ondisk, sizeof(ondisk)) != sizeof(ondisk))
+ {
+ int save_errno = errno;
+ char path[MAXPGPATH];
+
+ CloseTransientFile(delayed_fd);
+ delay_file_name(path, MyLogicalRepWorker->subid, delayed_xid,
+ DELAYED_TXN_UNKNOWN, InvalidXLogRecPtr,
+ InvalidXLogRecPtr, 0);
+
+ /* if write didn't set errno, assume problem is no disk space */
+ errno = save_errno ? save_errno : ENOSPC;
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not write to file \"%s\": %m",
+ path));
+ return false; /* Keep compiler quiet */
+ }
+
+ /* Write actual message */
+ if (write(delayed_fd, &s->data[s->cursor], ondisk.length) != ondisk.length)
+ {
+ int save_errno = errno;
+ char path[MAXPGPATH];
+
+ CloseTransientFile(delayed_fd);
+ delay_file_name(path, MyLogicalRepWorker->subid, delayed_xid,
+ DELAYED_TXN_UNKNOWN, InvalidXLogRecPtr,
+ InvalidXLogRecPtr, 0);
+
+ /* if write didn't set errno, assume problem is no disk space */
+ errno = save_errno ? save_errno : ENOSPC;
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not write to file \"%s\": %m",
+ path));
+ return false; /* Keep compiler quiet */
+ }
+
+ return true;
+}
+
+/*
+ * Check the delayed transactions and apply those whose delay has elapsed
+ */
+static void
+check_delayed_transaction(void)
+{
+ TimestampTz now;
+ ListCell *lc;
+ int n = 0;
+
+ if (in_streamed_transaction)
+ return;
+
+ now = GetCurrentTimestamp();
+
+ /* Read the cache one by one */
+ foreach(lc, DelayedTxnList)
+ {
+ DelayedTxnListEntry *entry = (DelayedTxnListEntry *) lfirst(lc);
+ LogicalRepCommitData *commit_data = &entry->commit_data;
+ TimestampTz delayUntil;
+ long diffms;
+
+ delayUntil = TimestampTzPlusMilliseconds(commit_data->committime,
+ MySubscription->minapplydelay);
+
+ diffms = TimestampDifferenceMilliseconds(now, delayUntil);
+
+ /*
+ * The cache is kept in commit order, so once we find a transaction
+ * that should not be applied yet, we do not have to check the later
+ * entries.
+ */
+ if (diffms > 0)
+ break;
+
+ elog(DEBUG1, "started to apply transaction %u", entry->xid);
+
+ apply_delayed_transaction(entry->xid, commit_data);
+
+ delayed_xid = InvalidTransactionId;
+ in_delayed_transaction = false;
+ n++;
+ }
+ /* Discard applied entries */
+ DelayedTxnList = list_delete_first_n(DelayedTxnList, n);
+}
+
/*
* Return the name of the logical replication worker.
*/
@@ -1019,13 +1694,28 @@ apply_handle_begin(StringInfo s)
logicalrep_read_begin(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.final_lsn);
- remote_final_lsn = begin_data.final_lsn;
+ /*
+ * Prepare to write changes into file if time-delayed replication is
+ * requested.
+ */
+ if (MySubscription->minapplydelay && AllTablesyncsReady())
+ {
+ in_delayed_transaction = true;
- maybe_start_skipping_changes(begin_data.final_lsn);
+ create_delay_file(begin_data.xid);
- in_remote_transaction = true;
+ delayed_xid = begin_data.xid;
+ }
+ else
+ {
+ remote_final_lsn = begin_data.final_lsn;
- pgstat_report_activity(STATE_RUNNING, NULL);
+ maybe_start_skipping_changes(begin_data.final_lsn);
+
+ in_remote_transaction = true;
+
+ pgstat_report_activity(STATE_RUNNING, NULL);
+ }
}
/*
@@ -1038,19 +1728,40 @@ apply_handle_commit(StringInfo s)
{
LogicalRepCommitData commit_data;
+ /* Save the message before it is consumed. */
+ StringInfoData original_msg = *s;
+
+ /*
+ * If we are applying a delayed transaction, skip here. The actual COMMIT
+ * will be done in apply_delayed_transaction().
+ */
+ if (delayed_fd > 0 && !in_delayed_transaction)
+ return;
+
logicalrep_read_commit(s, &commit_data);
- if (commit_data.commit_lsn != remote_final_lsn)
- ereport(ERROR,
- (errcode(ERRCODE_PROTOCOL_VIOLATION),
- errmsg_internal("incorrect commit LSN %X/%X in commit message (expected %X/%X)",
- LSN_FORMAT_ARGS(commit_data.commit_lsn),
- LSN_FORMAT_ARGS(remote_final_lsn))));
+ /* If we are delaying, spool the commit message instead of applying it. */
+
+ if (in_delayed_transaction)
+ {
+ /* Write the commit message into the file and flush all messages */
+ handle_delayed_transaction(LOGICAL_REP_MSG_COMMIT, &original_msg);
+ flush_delayed_changes(&commit_data);
+ }
+ else
+ {
+ if (commit_data.commit_lsn != remote_final_lsn)
+ ereport(ERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg_internal("incorrect commit LSN %X/%X in commit message (expected %X/%X)",
+ LSN_FORMAT_ARGS(commit_data.commit_lsn),
+ LSN_FORMAT_ARGS(remote_final_lsn))));
- apply_handle_commit_internal(&commit_data);
+ apply_handle_commit_internal(&commit_data);
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(commit_data.end_lsn);
+ /* Process any tables that are being synchronized in parallel. */
+ process_syncing_tables(commit_data.end_lsn);
+ }
pgstat_report_activity(STATE_IDLE, NULL);
reset_apply_error_context_info();
@@ -1076,13 +1787,28 @@ apply_handle_begin_prepare(StringInfo s)
logicalrep_read_begin_prepare(s, &begin_data);
set_apply_error_context_xact(begin_data.xid, begin_data.prepare_lsn);
- remote_final_lsn = begin_data.prepare_lsn;
+ /*
+ * Prepare to write changes into file if time-delayed replication is
+ * requested.
+ */
+ if (MySubscription->minapplydelay && AllTablesyncsReady())
+ {
+ in_delayed_transaction = true;
- maybe_start_skipping_changes(begin_data.prepare_lsn);
+ create_delay_file(begin_data.xid);
- in_remote_transaction = true;
+ delayed_xid = begin_data.xid;
+ }
+ else
+ {
+ remote_final_lsn = begin_data.prepare_lsn;
- pgstat_report_activity(STATE_RUNNING, NULL);
+ maybe_start_skipping_changes(begin_data.prepare_lsn);
+
+ in_remote_transaction = true;
+
+ pgstat_report_activity(STATE_RUNNING, NULL);
+ }
}
/*
@@ -1124,57 +1850,102 @@ apply_handle_prepare_internal(LogicalRepPreparedTxnData *prepare_data)
/*
* Handle PREPARE message.
+ *
+ * When time-delayed logical replication is requested, we just write a message
+ * into the file and return. This means that no transaction is prepared on the
+ * subscriber, which prevents the apply worker from holding locks for a long
+ * time because of a long min_apply_delay.
+ *
+ * Even when the transaction is applied from the delayed file, it is not
+ * prepared. We just skip the PREPARE message.
*/
static void
apply_handle_prepare(StringInfo s)
{
LogicalRepPreparedTxnData prepare_data;
- logicalrep_read_prepare(s, &prepare_data);
-
- if (prepare_data.prepare_lsn != remote_final_lsn)
- ereport(ERROR,
- (errcode(ERRCODE_PROTOCOL_VIOLATION),
- errmsg_internal("incorrect prepare LSN %X/%X in prepare message (expected %X/%X)",
- LSN_FORMAT_ARGS(prepare_data.prepare_lsn),
- LSN_FORMAT_ARGS(remote_final_lsn))));
-
/*
- * Unlike commit, here, we always prepare the transaction even though no
- * change has happened in this transaction or all changes are skipped. It
- * is done this way because at commit prepared time, we won't know whether
- * we have skipped preparing a transaction because of those reasons.
- *
- * XXX, We can optimize such that at commit prepared time, we first check
- * whether we have prepared the transaction or not but that doesn't seem
- * worthwhile because such cases shouldn't be common.
+ * If we are writing changes into the delayed file, construct a modified
+ * message and write it. This avoids writing the gid into the file. For
+ * more detail, see the comments atop ReadPreparedCommonRecord().
*/
- begin_replication_step();
+ if (in_delayed_transaction)
+ {
+ char old_path[MAXPGPATH];
+ char new_path[MAXPGPATH];
- apply_handle_prepare_internal(&prepare_data);
+ /* Cleanup */
+ CloseTransientFile(delayed_fd);
- end_replication_step();
- CommitTransactionCommand();
- pgstat_report_stat(false);
+ /*
+ * Construct old/new filename.
+ *
+ * Note that commit_lsn, end_lsn, and committime are not filled here.
+ * If they were, we would have no good way to locate the related
+ * transaction file when COMMIT PREPARED arrives.
+ */
+ delay_file_name(old_path, MyLogicalRepWorker->subid, delayed_xid,
+ DELAYED_TXN_UNKNOWN, InvalidXLogRecPtr, InvalidXLogRecPtr, 0);
- store_flush_position(prepare_data.end_lsn, XactLastCommitEnd);
+ delay_file_name(new_path, MyLogicalRepWorker->subid, delayed_xid,
+ DELAYED_TXN_PREPARED, InvalidXLogRecPtr, InvalidXLogRecPtr, 0);
- in_remote_transaction = false;
+ /* And do actual rename */
+ if (durable_rename(old_path, new_path, PANIC))
+ abort();
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(prepare_data.end_lsn);
+ /* Read the message so that end_lsn is valid, then store flushed lsn */
+ logicalrep_read_prepare(s, &prepare_data);
+ last_flushed = prepare_data.end_lsn;
- /*
- * Since we have already prepared the transaction, in a case where the
- * server crashes before clearing the subskiplsn, it will be left but the
- * transaction won't be resent. But that's okay because it's a rare case
- * and the subskiplsn will be cleared when finishing the next transaction.
- */
- stop_skipping_changes();
- clear_subscription_skip_lsn(prepare_data.prepare_lsn);
+ delayed_fd = -1;
+ delayed_xid = InvalidTransactionId;
+ in_delayed_transaction = false;
+ }
+ else
+ {
+ logicalrep_read_prepare(s, &prepare_data);
- pgstat_report_activity(STATE_IDLE, NULL);
- reset_apply_error_context_info();
+ if (prepare_data.prepare_lsn != remote_final_lsn)
+ ereport(ERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg_internal("incorrect prepare LSN %X/%X in prepare message (expected %X/%X)",
+ LSN_FORMAT_ARGS(prepare_data.prepare_lsn),
+ LSN_FORMAT_ARGS(remote_final_lsn))));
+
+ /*
+ * Unlike commit, here, we always prepare the transaction even though no
+ * change has happened in this transaction or all changes are skipped. It
+ * is done this way because at commit prepared time, we won't know whether
+ * we have skipped preparing a transaction because of those reasons.
+ *
+ * XXX, We can optimize such that at commit prepared time, we first check
+ * whether we have prepared the transaction or not but that doesn't seem
+ * worthwhile because such cases shouldn't be common.
+ */
+ begin_replication_step();
+
+ apply_handle_prepare_internal(&prepare_data);
+
+ end_replication_step();
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
+
+ store_flush_position(prepare_data.end_lsn, XactLastCommitEnd);
+
+ in_remote_transaction = false;
+
+ /* Process any tables that are being synchronized in parallel. */
+ process_syncing_tables(prepare_data.end_lsn);
+
+ /*
+ * Since we have already prepared the transaction, in a case where the
+ * server crashes before clearing the subskiplsn, it will be left but the
+ * transaction won't be resent. But that's okay because it's a rare case
+ * and the subskiplsn will be cleared when finishing the next transaction.
+ */
+ stop_skipping_changes();
+ clear_subscription_skip_lsn(prepare_data.prepare_lsn);
+ }
}
/*
@@ -1192,38 +1963,95 @@ apply_handle_commit_prepared(StringInfo s)
LogicalRepCommitPreparedTxnData prepare_data;
char gid[GIDSIZE];
- logicalrep_read_commit_prepared(s, &prepare_data);
- set_apply_error_context_xact(prepare_data.xid, prepare_data.commit_lsn);
-
- /* Compute GID for two_phase transactions. */
- TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
- gid, sizeof(gid));
+ if (delayed_fd > 0 && !in_delayed_transaction)
+ return;
- /* There is no transaction when COMMIT PREPARED is called */
- begin_replication_step();
+ logicalrep_read_commit_prepared(s, &prepare_data);
/*
- * Update origin state so we can restart streaming from correct position
- * in case of crash.
+ * Check whether a delayed file exists for this transaction. If one exists
+ * and we have not opened it yet, time-delayed logical replication has been
+ * requested, so we write the modified message to the file.
+ * Otherwise, the transaction will be committed normally.
*/
- replorigin_session_origin_lsn = prepare_data.end_lsn;
- replorigin_session_origin_timestamp = prepare_data.commit_time;
+ if (delayed_fd < 0 &&
+ is_given_transaction_delayed(MyLogicalRepWorker->subid, prepare_data.xid))
+ {
+ char old_path[MAXPGPATH];
+ char new_path[MAXPGPATH];
+ LogicalRepCommitData commit_data = {0};
- FinishPreparedTransaction(gid, true);
- end_replication_step();
- CommitTransactionCommand();
- pgstat_report_stat(false);
+ /*
+ * Open the delayed transaction file.
+ *
+ * Apart from RestoreDelayedTxns(), we don't want to read the whole
+ * directory to find the related file. That's why we use an invalid LSN
+ * and commit time in the file name to indicate it.
+ */
+ delay_file_name(old_path, MyLogicalRepWorker->subid, prepare_data.xid,
+ DELAYED_TXN_PREPARED, InvalidXLogRecPtr,
+ InvalidXLogRecPtr, 0);
- store_flush_position(prepare_data.end_lsn, XactLastCommitEnd);
- in_remote_transaction = false;
+ delayed_fd = OpenTransientFile(old_path, O_WRONLY | O_APPEND | PG_BINARY);
+ if (delayed_fd < 0)
+ ereport(ERROR,
+ errcode_for_file_access(),
+ errmsg("could not open file \"%s\": %m",
+ old_path));
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(prepare_data.end_lsn);
+ delay_file_name(new_path, MyLogicalRepWorker->subid,
+ prepare_data.xid, DELAYED_TXN_COMMITTED,
+ prepare_data.commit_lsn, prepare_data.end_lsn,
+ prepare_data.commit_time);
- clear_subscription_skip_lsn(prepare_data.end_lsn);
+ CloseTransientFile(delayed_fd);
- pgstat_report_activity(STATE_IDLE, NULL);
- reset_apply_error_context_info();
+ if (durable_rename(old_path, new_path, PANIC))
+ abort();
+
+ /* Store flushed lsn */
+ last_flushed = prepare_data.end_lsn;
+
+ delayed_fd = -1;
+ delayed_xid = InvalidTransactionId;
+ in_delayed_transaction = false;
+
+ ConstructCommitFromCommitPrepared(&commit_data, &prepare_data);
+
+ /* Cache the committed transaction */
+ cache_commit_data(&commit_data, prepare_data.xid);
+ }
+ else
+ {
+ set_apply_error_context_xact(prepare_data.xid, prepare_data.commit_lsn);
+
+ /* Compute GID for two_phase transactions. */
+ TwoPhaseTransactionGid(MySubscription->oid, prepare_data.xid,
+ gid, sizeof(gid));
+
+ /* There is no transaction when COMMIT PREPARED is called */
+ begin_replication_step();
+
+ /*
+ * Update origin state so we can restart streaming from correct position
+ * in case of crash.
+ */
+ replorigin_session_origin_lsn = prepare_data.end_lsn;
+ replorigin_session_origin_timestamp = prepare_data.commit_time;
+
+ FinishPreparedTransaction(gid, true);
+ end_replication_step();
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
+
+ store_flush_position(prepare_data.end_lsn, XactLastCommitEnd);
+ in_remote_transaction = false;
+
+ /* Process any tables that are being synchronized in parallel. */
+ process_syncing_tables(prepare_data.end_lsn);
+
+ clear_subscription_skip_lsn(prepare_data.end_lsn);
+ }
}
/*
@@ -1242,6 +2070,25 @@ apply_handle_rollback_prepared(StringInfo s)
char gid[GIDSIZE];
logicalrep_read_rollback_prepared(s, &rollback_data);
+
+ /*
+ * If the delayed file exists, just remove it. The delayed transaction
+ * has never been prepared, so it's OK not to call
+ * FinishPreparedTransaction().
+ */
+ if (is_given_transaction_delayed(MyLogicalRepWorker->subid, rollback_data.xid))
+ {
+ char path[MAXPGPATH];
+
+ delay_file_name(path, MyLogicalRepWorker->subid, rollback_data.xid,
+ DELAYED_TXN_PREPARED, InvalidXLogRecPtr,
+ InvalidXLogRecPtr, 0);
+
+ durable_unlink(path, LOG);
+ clear_subscription_skip_lsn(rollback_data.rollback_end_lsn);
+ return;
+ }
+
set_apply_error_context_xact(rollback_data.xid, rollback_data.rollback_end_lsn);
/* Compute GID for two_phase transactions. */
@@ -1317,16 +2164,65 @@ apply_handle_stream_prepare(StringInfo s)
switch (apply_action)
{
case TRANS_LEADER_APPLY:
+ /*
+ * If time-delayed replication is requested, start writing changes
+ * to a permanent file instead of a temporary one.
+ */
+ if (MySubscription->minapplydelay)
+ {
+ in_delayed_transaction = true;
+
+ create_delay_file(prepare_data.xid);
+
+ delayed_xid = prepare_data.xid;
+ }
/*
* The transaction has been serialized to file, so replay all the
- * spooled operations.
+ * spooled operations. Note that if time-delayed replication is
+ * requested, changes are written into a permanent file here.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset,
prepare_data.xid, prepare_data.prepare_lsn);
- /* Mark the transaction as prepared. */
- apply_handle_prepare_internal(&prepare_data);
+
+ /*
+ * If time-delayed replication is requested, construct a modified
+ * message and write it. This avoids writing the gid into the file.
+ * For more detail, see the comments atop ReadPreparedCommonRecord().
+ */
+ if (MySubscription->minapplydelay)
+ {
+ char old_path[MAXPGPATH];
+ char new_path[MAXPGPATH];
+
+ CloseTransientFile(delayed_fd);
+
+ delay_file_name(old_path, MyLogicalRepWorker->subid,
+ prepare_data.xid, DELAYED_TXN_UNKNOWN,
+ InvalidXLogRecPtr, InvalidXLogRecPtr, 0);
+
+ delay_file_name(new_path, MyLogicalRepWorker->subid,
+ prepare_data.xid, DELAYED_TXN_PREPARED,
+ InvalidXLogRecPtr, InvalidXLogRecPtr, 0);
+
+ if (durable_rename(old_path, new_path, PANIC))
+ abort();
+
+ /* Store flushed lsn */
+ last_flushed = prepare_data.end_lsn;
+
+ delayed_fd = -1;
+ delayed_xid = InvalidTransactionId;
+ in_delayed_transaction = false;
+ }
+ else
+ {
+ /* Mark the transaction as prepared. */
+ apply_handle_prepare_internal(&prepare_data);
+ }
CommitTransactionCommand();
@@ -1405,8 +2301,11 @@ apply_handle_stream_prepare(StringInfo s)
pgstat_report_stat(false);
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(prepare_data.end_lsn);
+ if (list_length(DelayedTxnList) == 0)
+ {
+ /* Process any tables that are being synchronized in parallel. */
+ process_syncing_tables(prepare_data.end_lsn);
+ }
/*
* Similar to prepare case, the subskiplsn could be left in a case of
@@ -2175,19 +3074,42 @@ apply_handle_stream_commit(StringInfo s)
{
case TRANS_LEADER_APPLY:
+ /*
+ * If time-delayed replication is requested, start writing changes
+ * to a permanent file instead of a temporary one.
+ */
+ if (MySubscription->minapplydelay)
+ {
+ in_delayed_transaction = true;
+
+ create_delay_file(xid);
+
+ delayed_xid = xid;
+ }
+
/*
* The transaction has been serialized to file, so replay all the
- * spooled operations.
+ * spooled operations. Note that if time-delayed replication is
+ * requested, changes are written into a permanent file here.
*/
apply_spooled_messages(MyLogicalRepWorker->stream_fileset, xid,
commit_data.commit_lsn);
- apply_handle_commit_internal(&commit_data);
+
+ /* Flush changes if time-delayed replication is requested */
+ if (MySubscription->minapplydelay)
+ {
+ handle_delayed_transaction(LOGICAL_REP_MSG_COMMIT, &original_msg);
+ flush_delayed_changes(&commit_data);
+ }
+ else
+ apply_handle_commit_internal(&commit_data);
/* Unlink the files with serialized changes and subxact info. */
stream_cleanup_files(MyLogicalRepWorker->subid, xid);
elog(DEBUG1, "finished processing the STREAM COMMIT command");
+
break;
case TRANS_LEADER_SEND_TO_PARALLEL:
@@ -2249,8 +3171,11 @@ apply_handle_stream_commit(StringInfo s)
break;
}
- /* Process any tables that are being synchronized in parallel. */
- process_syncing_tables(commit_data.end_lsn);
+ if (list_length(DelayedTxnList) == 0)
+ {
+ /* Process any tables that are being synchronized in parallel. */
+ process_syncing_tables(commit_data.end_lsn);
+ }
pgstat_report_activity(STATE_IDLE, NULL);
@@ -2325,7 +3250,8 @@ apply_handle_relation(StringInfo s)
{
LogicalRepRelation *rel;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_RELATION, s))
+ if (handle_streamed_transaction(LOGICAL_REP_MSG_RELATION, s) ||
+ handle_delayed_transaction(LOGICAL_REP_MSG_RELATION, s))
return;
rel = logicalrep_read_rel(s);
@@ -2348,7 +3274,8 @@ apply_handle_type(StringInfo s)
{
LogicalRepTyp typ;
- if (handle_streamed_transaction(LOGICAL_REP_MSG_TYPE, s))
+ if (handle_streamed_transaction(LOGICAL_REP_MSG_TYPE, s) ||
+ handle_delayed_transaction(LOGICAL_REP_MSG_TYPE, s))
return;
logicalrep_read_typ(s, &typ);
@@ -2408,7 +3335,8 @@ apply_handle_insert(StringInfo s)
* streamed transactions.
*/
if (is_skipping_changes() ||
- handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
+ handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s) ||
+ handle_delayed_transaction(LOGICAL_REP_MSG_INSERT, s))
return;
begin_replication_step();
@@ -2560,7 +3488,8 @@ apply_handle_update(StringInfo s)
* streamed transactions.
*/
if (is_skipping_changes() ||
- handle_streamed_transaction(LOGICAL_REP_MSG_UPDATE, s))
+ handle_streamed_transaction(LOGICAL_REP_MSG_UPDATE, s) ||
+ handle_delayed_transaction(LOGICAL_REP_MSG_UPDATE, s))
return;
begin_replication_step();
@@ -2741,7 +3670,8 @@ apply_handle_delete(StringInfo s)
* streamed transactions.
*/
if (is_skipping_changes() ||
- handle_streamed_transaction(LOGICAL_REP_MSG_DELETE, s))
+ handle_streamed_transaction(LOGICAL_REP_MSG_DELETE, s) ||
+ handle_delayed_transaction(LOGICAL_REP_MSG_DELETE, s))
return;
begin_replication_step();
@@ -3169,7 +4099,8 @@ apply_handle_truncate(StringInfo s)
* streamed transactions.
*/
if (is_skipping_changes() ||
- handle_streamed_transaction(LOGICAL_REP_MSG_TRUNCATE, s))
+ handle_streamed_transaction(LOGICAL_REP_MSG_TRUNCATE, s) ||
+ handle_delayed_transaction(LOGICAL_REP_MSG_TRUNCATE, s))
return;
begin_replication_step();
@@ -3431,11 +4362,14 @@ get_flush_position(XLogRecPtr *write, XLogRecPtr *flush,
pos = dlist_tail_element(FlushPosition, node,
&lsn_mapping);
*write = pos->remote_end;
- *have_pending_txes = true;
- return;
+ break;
}
}
+ /* If changes have been written to a file, report that LSN instead */
+ if (last_flushed > *flush)
+ *flush = last_flushed;
+
*have_pending_txes = !dlist_is_empty(&lsn_mapping);
}
@@ -3632,9 +4566,13 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
maybe_reread_subscription();
/* Process any table synchronization changes. */
- process_syncing_tables(last_received);
+ if (list_length(DelayedTxnList) == 0)
+ process_syncing_tables(last_received);
}
+ /* Check delayed transactions and apply them */
+ check_delayed_transaction();
+
/* Cleanup the memory. */
MemoryContextResetAndDeleteChildren(ApplyMessageContext);
MemoryContextSwitchTo(TopMemoryContext);
@@ -3776,8 +4714,14 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
/*
* No outstanding transactions to flush, we can report the latest received
* position. This is important for synchronous replication.
+ *
+ * If the logical replication subscription has unprocessed changes then do
+ * not inform the publisher that the latest received LSN is already
+ * applied and flushed, otherwise, the publisher will make a wrong
+ * assumption about the logical replication progress. Instead, just send a
+ * feedback message to avoid a replication timeout during the delay.
*/
- if (!have_pending_txes)
+ if (!have_pending_txes && (list_length(DelayedTxnList) == 0))
flushpos = writepos = recvpos;
if (writepos < last_writepos)
@@ -3937,7 +4881,8 @@ maybe_reread_subscription(void)
newsub->passwordrequired != MySubscription->passwordrequired ||
strcmp(newsub->origin, MySubscription->origin) != 0 ||
newsub->owner != MySubscription->owner ||
- !equal(newsub->publications, MySubscription->publications))
+ !equal(newsub->publications, MySubscription->publications) ||
+ (newsub->minapplydelay == 0) != (MySubscription->minapplydelay == 0))
{
if (am_parallel_apply_worker())
ereport(LOG,
@@ -4582,6 +5527,9 @@ ApplyWorkerMain(Datum main_arg)
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("subscription has no replication slot set")));
+ /* Check delayed files or initialize directory */
+ InitializeDelayedTxn();
+
/* Setup replication origin tracking. */
StartTransactionCommand();
ReplicationOriginNameForLogicalRep(MySubscription->oid, InvalidOid,
@@ -4593,6 +5541,14 @@ ApplyWorkerMain(Datum main_arg)
replorigin_session_origin = originid;
origin_startpos = replorigin_session_get_progress(false);
+ /*
+ * If last_flushed exceeds origin_startpos, it means that some
+ * transactions are being delayed. They have already been written to a
+ * permanent file, so there is no need to receive them again.
+ */
+ if (origin_startpos < last_flushed)
+ origin_startpos = last_flushed;
+
/* Is the use of a password mandatory? */
must_use_password = MySubscription->passwordrequired &&
!superuser_arg(MySubscription->owner);
@@ -4664,9 +5620,15 @@ ApplyWorkerMain(Datum main_arg)
options.proto.logical.twophase = false;
options.proto.logical.origin = pstrdup(MySubscription->origin);
+ options.proto.logical.require_schema = false;
+
if (!am_tablesync_worker())
{
+ if (server_version >= 160000)
+ options.proto.logical.require_schema =
+ MySubscription->minapplydelay > 0;
+
/*
* Even when the two_phase mode is requested by the user, it remains
* as the tri-state PENDING until all tablesyncs have reached READY
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index f88389de84..6718fe062b 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -286,11 +286,13 @@ parse_output_parameters(List *options, PGOutputData *data)
bool streaming_given = false;
bool two_phase_option_given = false;
bool origin_option_given = false;
+ bool require_schema_option_given = false;
data->binary = false;
data->streaming = LOGICALREP_STREAM_OFF;
data->messages = false;
data->two_phase = false;
+ data->require_schema = false;
foreach(lc, options)
{
@@ -397,6 +399,16 @@ parse_output_parameters(List *options, PGOutputData *data)
errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("unrecognized origin value: \"%s\"", data->origin));
}
+ else if (strcmp(defel->defname, "require_schema") == 0)
+ {
+ if (require_schema_option_given)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("conflicting or redundant options")));
+ require_schema_option_given = true;
+
+ data->require_schema = defGetBoolean(defel);
+ }
else
elog(ERROR, "unrecognized pgoutput option: %s", defel->defname);
}
@@ -677,7 +689,8 @@ pgoutput_rollback_prepared_txn(LogicalDecodingContext *ctx,
static void
maybe_send_schema(LogicalDecodingContext *ctx,
ReorderBufferChange *change,
- Relation relation, RelationSyncEntry *relentry)
+ Relation relation, RelationSyncEntry *relentry,
+ PGOutputData *data)
{
bool schema_sent;
TransactionId xid = InvalidTransactionId;
@@ -717,7 +730,7 @@ maybe_send_schema(LogicalDecodingContext *ctx,
schema_sent = relentry->schema_sent;
/* Nothing to do if we already sent the schema. */
- if (schema_sent)
+ if (!data->require_schema && schema_sent)
return;
/*
@@ -1520,7 +1533,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
* Schema should be sent using the original relation because it also sends
* the ancestor's relation.
*/
- maybe_send_schema(ctx, change, relation, relentry);
+ maybe_send_schema(ctx, change, relation, relentry, data);
OutputPluginPrepareWrite(ctx, true);
@@ -1605,7 +1618,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
if (txndata && !txndata->sent_begin_txn)
pgoutput_send_begin(ctx, txn);
- maybe_send_schema(ctx, change, relation, relentry);
+ maybe_send_schema(ctx, change, relation, relentry, data);
}
if (nrelids > 0)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 41a51ec5cd..e579e4e88e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4610,6 +4610,7 @@ getSubscriptions(Archive *fout)
int i_subpublications;
int i_subbinary;
int i_subpasswordrequired;
+ int i_subminapplydelay;
int i,
ntups;
@@ -4664,11 +4665,13 @@ getSubscriptions(Archive *fout)
if (fout->remoteVersion >= 160000)
appendPQExpBufferStr(query,
" s.suborigin,\n"
- " s.subpasswordrequired\n");
+ " s.subpasswordrequired,\n"
+ " s.subminapplydelay\n");
else
appendPQExpBuffer(query,
" '%s' AS suborigin,\n"
- " 't' AS subpasswordrequired\n",
+ " 't' AS subpasswordrequired,\n"
+ " 0 AS subminapplydelay\n",
LOGICALREP_ORIGIN_ANY);
appendPQExpBufferStr(query,
@@ -4698,6 +4701,7 @@ getSubscriptions(Archive *fout)
i_subdisableonerr = PQfnumber(res, "subdisableonerr");
i_suborigin = PQfnumber(res, "suborigin");
i_subpasswordrequired = PQfnumber(res, "subpasswordrequired");
+ i_subminapplydelay = PQfnumber(res, "subminapplydelay");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4730,6 +4734,8 @@ getSubscriptions(Archive *fout)
subinfo[i].suborigin = pg_strdup(PQgetvalue(res, i, i_suborigin));
subinfo[i].subpasswordrequired =
pg_strdup(PQgetvalue(res, i, i_subpasswordrequired));
+ subinfo[i].subminapplydelay =
+ atoi(PQgetvalue(res, i, i_subminapplydelay));
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -4814,6 +4820,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
if (strcmp(subinfo->subpasswordrequired, "t") != 0)
appendPQExpBuffer(query, ", password_required = false");
+ if (subinfo->subminapplydelay > 0)
+ appendPQExpBuffer(query, ", min_apply_delay = '%d ms'", subinfo->subminapplydelay);
+
appendPQExpBufferStr(query, ");\n");
if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index ed6ce41ad7..6bf889a00a 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -662,6 +662,7 @@ typedef struct _SubscriptionInfo
char *subdisableonerr;
char *suborigin;
char *subsynccommit;
+ int subminapplydelay;
char *subpublications;
char *subpasswordrequired;
} SubscriptionInfo;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 058e41e749..4f2498c3ad 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6493,7 +6493,8 @@ describeSubscriptions(const char *pattern, bool verbose)
PGresult *res;
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
- false, false, false, false, false, false, false, false, false, false};
+ false, false, false, false, false, false, false, false, false, false,
+ false};
if (pset.sversion < 100000)
{
@@ -6552,10 +6553,12 @@ describeSubscriptions(const char *pattern, bool verbose)
appendPQExpBuffer(&buf,
", suborigin AS \"%s\"\n"
", subpasswordrequired AS \"%s\"\n"
- ", subrunasowner AS \"%s\"\n",
+ ", subrunasowner AS \"%s\"\n"
+ ", subminapplydelay AS \"%s\"\n",
gettext_noop("Origin"),
gettext_noop("Password required"),
- gettext_noop("Run as Owner?"));
+ gettext_noop("Run as Owner?"),
+ gettext_noop("Min apply delay"));
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index bd04244969..7deea7f25c 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1925,9 +1925,9 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "origin",
- "password_required", "run_as_owner", "slot_name",
- "streaming", "synchronous_commit");
+ COMPLETE_WITH("binary", "disable_on_error", "min_apply_delay",
+ "origin", "password_required", "run_as_owner",
+ "slot_name", "streaming", "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
COMPLETE_WITH("lsn");
@@ -3269,9 +3269,10 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "origin",
- "password_required", "run_as_owner", "slot_name",
- "streaming", "synchronous_commit", "two_phase");
+ "disable_on_error", "enabled", "min_apply_delay",
+ "origin", "password_required", "run_as_owner",
+ "slot_name", "streaming", "synchronous_commit",
+ "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 91d729d62d..649e789240 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -74,6 +74,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
+ int32 subminapplydelay; /* Replication apply delay (ms) */
+
bool subenabled; /* True if the subscription is enabled (the
* worker should be running) */
@@ -127,6 +129,7 @@ typedef struct Subscription
* skipped */
char *name; /* Name of the subscription */
Oid owner; /* Oid of the subscription owner */
+ int32 minapplydelay; /* Replication apply delay (ms) */
bool enabled; /* Indicates if the subscription is enabled */
bool binary; /* Indicates if the subscription wants data in
* binary format */
diff --git a/src/include/replication/pgoutput.h b/src/include/replication/pgoutput.h
index b4a8015403..59d924084f 100644
--- a/src/include/replication/pgoutput.h
+++ b/src/include/replication/pgoutput.h
@@ -30,6 +30,7 @@ typedef struct PGOutputData
bool messages;
bool two_phase;
char *origin;
+ bool require_schema;
} PGOutputData;
#endif /* PGOUTPUT_H */
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index 281626fa6f..954d297401 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -187,6 +187,7 @@ typedef struct
* prepare time */
char *origin; /* Only publish data originating from the
* specified origin */
+ bool require_schema;
} logical;
} proto;
} WalRcvStreamOptions;
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index d736246259..54fbf53837 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -115,18 +115,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | t | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | t | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -144,10 +144,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -155,10 +155,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist2';
ALTER SUBSCRIPTION regress_testsub SET (slot_name = 'newname');
ALTER SUBSCRIPTION regress_testsub SET (password_required = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | f | f | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | f | f | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (password_required = true);
@@ -173,10 +173,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | t | f | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | t | f | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -185,10 +185,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | t | f | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+-----------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | t | f | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -220,10 +220,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | t | f | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+-----------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | t | f | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -252,19 +252,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | t | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | t | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -276,27 +276,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | t | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | t | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | t | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | t | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -311,10 +311,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | t | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | t | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -329,10 +329,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -368,10 +368,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | t | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | t | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -380,10 +380,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | t | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | t | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -393,10 +393,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | t | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | t | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -409,18 +409,18 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | t | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | t | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -470,6 +470,44 @@ ERROR: permission denied for database regression
-- ok, owning it is enough for this stuff
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+SET SESSION AUTHORIZATION regress_subscription_user;
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+ERROR: invalid value for parameter "min_apply_delay": "foo"
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+ERROR: -1 ms is outside the valid range for parameter "min_apply_delay" (0 .. 2147483647)
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+ERROR: min_apply_delay > 0 and streaming = parallel are mutually exclusive options
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+WARNING: subscription was created, but is not connected
+HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | 123 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as Owner? | Min apply delay | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+-----------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | 86400000 | off | dbname=regress_doesnotexist | 0/0
+(1 row)
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+ERROR: cannot set parallel streaming mode for subscription with min_apply_delay
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ERROR: cannot set min_apply_delay for subscription in parallel streaming mode
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
RESET SESSION AUTHORIZATION;
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 55d7dbc9ab..32d8235b6f 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -334,7 +334,34 @@ ALTER SUBSCRIPTION regress_testsub RENAME TO regress_testsub2;
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;
+SET SESSION AUTHORIZATION regress_subscription_user;
+
+-- fail -- min_apply_delay must be a non-negative integer
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = foo);
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = -1);
+
+-- fail - utilizing streaming = parallel with time-delayed replication is not supported
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, streaming = parallel, min_apply_delay = 123);
+
+-- success -- min_apply_delay value without unit is taken as milliseconds
+CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, min_apply_delay = 123);
+\dRs+
+
+-- success -- min_apply_delay value with unit is converted into ms and stored as an integer
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = '1 d');
+\dRs+
+
+-- fail - alter subscription with streaming = parallel should fail when time-delayed replication is set
+ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
+
+-- fail - alter subscription with min_apply_delay should fail when streaming = parallel is set
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 0, streaming = parallel);
+ALTER SUBSCRIPTION regress_testsub SET (min_apply_delay = 123);
+ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_testsub;
+
RESET SESSION AUTHORIZATION;
+
DROP ROLE regress_subscription_user;
DROP ROLE regress_subscription_user2;
DROP ROLE regress_subscription_user3;
diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index 91aa068c95..01f2c4284d 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -515,6 +515,37 @@ $node_publisher->poll_query_until('postgres',
or die
"Timed out while waiting for apply to restart after renaming SUBSCRIPTION";
+# Test time-delayed logical replication
+#
+# If the subscription sets min_apply_delay parameter, the logical replication
+# worker will delay the transaction apply for min_apply_delay milliseconds. We
+# verify this by looking at the time difference between a) when tuples are
+# inserted on the publisher, and b) when those changes are replicated on the
+# subscriber. Even on slow machines, this strategy will give predictable behavior.
+
+# Set min_apply_delay parameter to 3 seconds
+my $delay = 3;
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_renamed SET (min_apply_delay = '${delay}s')");
+
+# Before doing the insertion, get the current timestamp that will be
+# used as a comparison base.
+my $publisher_insert_time = time();
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO tab_ins VALUES (generate_series(1101, 1120))");
+
+$node_subscriber->poll_query_until('postgres',
+ "SELECT count(*) = 1 FROM tab_ins WHERE a = 1120;"
+ )
+ or die
+ "failed to replicate changes";
+
+# This test is successful if and only if the LSN has been applied with at least
+# the configured apply delay.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for non-streaming transaction"
+);
+
# check all the cleanup
$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_renamed");
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index b4058b88c3..782c9fd3a3 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2337,6 +2337,7 @@ ReplaceVarsFromTargetList_context
ReplaceVarsNoMatchOption
ReplicaIdentityStmt
ReplicationKind
+ReplicationMessageOnDisk
ReplicationSlot
ReplicationSlotCtlData
ReplicationSlotInvalidationCause
--
2.27.0
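The regression tests in the patch above pin down how a min_apply_delay value is interpreted: a bare number is taken as milliseconds, a value with a unit is converted to milliseconds, the result must lie in 0 .. 2147483647, and a non-zero delay cannot be combined with streaming = parallel. A minimal Python sketch of those rules follows. It only illustrates the behavior shown in the expected output; it is not the server's C implementation. The function names and the unit table (a subset of PostgreSQL's time units) are assumptions made for this example.

```python
import re

# Assumption: limits and messages mirror the regression-test output quoted
# in the patch; the real parsing is done in C by the server, not like this.
MAX_DELAY_MS = 2147483647  # upper bound shown in the test's error message

# Milliseconds per unit (illustrative subset of PostgreSQL's time units).
UNIT_MS = {"ms": 1, "s": 1000, "min": 60_000, "h": 3_600_000, "d": 86_400_000}


def parse_min_apply_delay(value):
    """Return the delay in milliseconds, or raise ValueError."""
    m = re.fullmatch(r"\s*(-?\d+)\s*([a-z]*)\s*", str(value))
    if m is None or (m.group(2) and m.group(2) not in UNIT_MS):
        raise ValueError(
            f'invalid value for parameter "min_apply_delay": "{value}"')
    ms = int(m.group(1)) * UNIT_MS.get(m.group(2), 1)  # no unit means ms
    if not 0 <= ms <= MAX_DELAY_MS:
        raise ValueError(
            f'{ms} ms is outside the valid range for parameter '
            f'"min_apply_delay" (0 .. {MAX_DELAY_MS})')
    return ms


def check_options(min_apply_delay_ms, streaming):
    """Reject the combination the tests mark as unsupported."""
    if min_apply_delay_ms > 0 and streaming == "parallel":
        raise ValueError(
            "min_apply_delay > 0 and streaming = parallel "
            "are mutually exclusive options")
```

With this sketch, parse_min_apply_delay("1 d") returns 86400000, the value shown in the "Min apply delay" column of the \dRs+ output after the ALTER SUBSCRIPTION in the test.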
On Fri, Apr 28, 2023 at 2:35 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Dear hackers,
I rebased and refined my PoC. The following are the changes:
Thanks.
Apologies for being late here. Please bear with me if I'm repeating
any of the discussed points.
I'm mainly trying to understand the production level use-case behind
this feature, and for that matter, recovery_min_apply_delay. AFAIK,
people try to keep the replication lag as small as possible, i.e.
near zero, to avoid extreme problems on production servers: WAL
file growth, blocked vacuum, crashes and downtime.
The proposed feature commit message and existing docs about
recovery_min_apply_delay justify the reason as 'offering opportunities
to correct data loss errors'. If someone wants to enable
recovery_min_apply_delay/min_apply_delay on production servers, I'm
guessing their values will be in hours, not in minutes; for the simple
reason that when data loss occurs, the people or infrastructure monitoring
postgres first need to notice it and then need time to respond with
corrective actions to recover the data. When these parameters are
set, the primary server must not generate too much WAL, or it risks
an eventual crash or downtime. Who would really want to be so defensive
against somebody who may or may not accidentally cause data loss and
enable these features on production servers (especially when these can
take down the primary server) and live happily with the induced
replication lag?
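To put a rough number on the WAL concern above, here is a back-of-the-envelope sketch in Python. It assumes, as the paragraph above does, that WAL generated during the delay window cannot be recycled until the delayed changes are applied. The 4 MiB/s rate is a hypothetical figure, and whether WAL really accumulates this way depends on the patch design.

```python
def retained_wal_bytes(wal_rate_bytes_per_s, delay_s):
    """WAL produced during the delay window, assuming none of it can be
    recycled until the delayed changes have been applied."""
    return wal_rate_bytes_per_s * delay_s


rate = 4 * 1024 * 1024   # hypothetical sustained WAL rate: 4 MiB/s
delay = 8 * 3600         # an 8-hour delay, expressed in seconds

gib = retained_wal_bytes(rate, delay) / 1024**3
print(f"{gib:.1f} GiB of WAL generated during the delay window")  # 112.5 GiB
```

Under these assumptions the retained volume grows linearly with both the WAL rate and the configured delay, which is why delays measured in hours matter for capacity planning.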
AFAIK, PITR is what people use for recovering from data loss errors in
production.
IMO, before we even go implement the apply delay feature for logical
replication, it's worth understanding whether induced replication lags have
any production level significance. We can also debate whether providing
apply delay hooks, together with simple out-of-the-box extensions, would
be better than having the core provide these features.
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Wed, May 10, 2023 at 5:35 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Fri, Apr 28, 2023 at 2:35 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Dear hackers,
I rebased and refined my PoC. The following are the changes:
Thanks.
Apologies for being late here. Please bear with me if I'm repeating
any of the discussed points.
I'm mainly trying to understand the production level use-case behind
this feature, and for that matter, recovery_min_apply_delay. AFAIK,
people try to keep the replication lag as minimum as possible i.e.
near zero to avoid the extreme problems on production servers - wal
file growth, blocked vacuum, crash and downtime.
The proposed feature commit message and existing docs about
recovery_min_apply_delay justify the reason as 'offering opportunities
to correct data loss errors'. If someone wants to enable
recovery_min_apply_delay/min_apply_delay on production servers, I'm
guessing their values will be in hours, not in minutes; for the simple
reason that when a data loss occurs, people/infrastructure monitoring
postgres need to know it first and need time to respond with
corrective actions to recover data loss. When these parameters are
set, the primary server mustn't be generating too much WAL to avoid
eventual crash/downtime. Who would really want to be so defensive
against somebody who may or may not accidentally cause data loss and
enable these features on production servers (especially when these can
take down the primary server) and live happily with the induced
replication lag?
AFAIK, PITR is what people use for recovering from data loss errors in
production.
I think PITR is not a preferred way to achieve this because it can be
quite time-consuming. See how GitLab [1] uses delayed replication in
PostgreSQL. This is one of the use cases I came across but I am sure
there will be others as well, otherwise, we would not have introduced
this feature in the first place.
Some of the other solutions like MySQL also have this feature. See
[2]; you can also read about the other use cases in that article. It seems
pglogical has this feature and there is customer demand for the same
[3].
IMO, before we even go implement the apply delay feature for logical
replication, it's worth understanding whether induced replication lags have
any production level significance.
I think the main thing here is to come up with the right design to
implement this feature. In the last release, we found some blocking
problems with the proposed patch at that time but Kuroda-San came up
with a new patch with a different design based on the discussion here.
I haven't looked at it yet though.
[1]: https://about.gitlab.com/blog/2019/02/13/delayed-replication-for-disaster-recovery-with-postgresql/
[2]: https://dev.mysql.com/doc/refman/8.0/en/replication-delayed.html
[3]: /messages/by-id/73b06a32-56ab-4056-86ff-e307f3c316f1@www.fastmail.com
--
With Regards,
Amit Kapila.
Dear Amit-san, Bharath,
Thank you for giving your opinion!
Some of the other solutions like MySQL also have this feature. See
[2], you can also read the other use cases in that article. It seems
pglogical has this feature and there is a customer demand for the same
[3]
Additionally, Db2 [1] seems to have a similar feature. If we extend this to DBaaSes,
RDS for MySQL [2] and TencentDB [3] have it too. These may indicate the need
for delayed replication.
[1]: https://www.ibm.com/docs/en/db2/11.5?topic=parameters-hadr-replay-delay-hadr-replay-delay
[2]: https://aws.amazon.com/jp/blogs/database/recover-from-a-disaster-with-delayed-replication-in-amazon-rds-for-mysql/
[3]: https://www.tencentcloud.com/document/product/236/41085
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
On Fri, Apr 28, 2023 at 2:35 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Dear hackers,
I rebased and refined my PoC. The following are the changes:
1. Is my understanding correct that this patch creates the delay files
for each transaction? If so, did you consider other approaches such as
using one file to avoid creating many files?
2. For streaming transactions, first the changes are written in the
temp file and then moved to the delay file. It seems like there is a
double work. Is it possible to unify it such that when min_apply_delay
is specified, we just use the delay file without sacrificing the
advantages like stream sub-abort can truncate the changes?
3. Ideally, there shouldn't be a performance impact of this feature on
regular transactions because the delay file is created only when
min_apply_delay is active but better to do some testing of the same.
Overall, I think such an approach can address comments by Sawada-San
[1] but not sure if Sawada-San or others have any better ideas to
achieve this feature. It would be good to see what others think of
this approach.
[1]: /messages/by-id/CAD21AoAeG2+RsUYD9+mEwr8-rrt8R1bqpe56T2D=euO-Qs-GAg@mail.gmail.com
--
With Regards,
Amit Kapila.
On Thu, May 11, 2023 at 2:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Apr 28, 2023 at 2:35 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Dear hackers,
I rebased and refined my PoC. The following are the changes:
1. Is my understanding correct that this patch creates the delay files
for each transaction? If so, did you consider other approaches such as
using one file to avoid creating many files?
2. For streaming transactions, first the changes are written in the
temp file and then moved to the delay file. It seems like there is a
double work. Is it possible to unify it such that when min_apply_delay
is specified, we just use the delay file without sacrificing the
advantages like stream sub-abort can truncate the changes?
3. Ideally, there shouldn't be a performance impact of this feature on
regular transactions because the delay file is created only when
min_apply_delay is active but better to do some testing of the same.
In addition to the points Amit raised, if the 'required_schema' option
is specified in START_REPLICATION, the publisher sends schema
information for every change. I think it leads to significant
overhead. Did you consider alternative approaches such as sending the
schema information for every transaction or the subscriber requests
the publisher to send it?
Overall, I think such an approach can address comments by Sawada-San
[1] but not sure if Sawada-San or others have any better ideas to
achieve this feature. It would be good to see what others think of
this approach.
I agree with this approach.
When it comes to the idea of writing logical changes to permanent
files, I think it would also be a good idea (and perhaps could be a
building block of this feature) that we write streamed changes to a
permanent file so that the apply worker can retry to apply them
without retrieving the same changes again from the publisher.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Dear Amit, Sawada-san,
Thank you for replying!
On Thu, May 11, 2023 at 2:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Apr 28, 2023 at 2:35 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Dear hackers,
I rebased and refined my PoC. The following are the changes:
1. Is my understanding correct that this patch creates the delay files
for each transaction? If so, did you consider other approaches such as
using one file to avoid creating many files?
2. For streaming transactions, first the changes are written in the
temp file and then moved to the delay file. It seems like there is a
double work. Is it possible to unify it such that when min_apply_delay
is specified, we just use the delay file without sacrificing the
advantages like stream sub-abort can truncate the changes?
3. Ideally, there shouldn't be a performance impact of this feature on
regular transactions because the delay file is created only when
min_apply_delay is active but better to do some testing of the same.
In addition to the points Amit raised, if the 'required_schema' option
is specified in START_REPLICATION, the publisher sends schema
information for every change. I think it leads to significant
overhead. Did you consider alternative approaches such as sending the
schema information for every transaction or the subscriber requests
the publisher to send it?
Thanks for giving your opinions. Apart from suggestion 2, I had not considered these points.
I will analyze them and share my opinion later.
About 2, I chose this style in order to simplify the source code, but I'm now planning
to follow the suggestion.
Overall, I think such an approach can address comments by Sawada-San
[1] but not sure if Sawada-San or others have any better ideas to
achieve this feature. It would be good to see what others think of
this approach.
I agree with this approach.
When it comes to the idea of writing logical changes to permanent
files, I think it would also be a good idea (and perhaps could be a
building block of this feature) that we write streamed changes to a
permanent file so that the apply worker can retry to apply them
without retrieving the same changes again from the publisher.
I'm very relieved to hear that.
One question: did you mean that serializing changes into permanent files
can be extended to the non-delay case? I think I will first handle delayed
replication, and then we can consider that later.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
On Fri, May 12, 2023 at 12:48 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Overall, I think such an approach can address comments by Sawada-San
[1] but not sure if Sawada-San or others have any better ideas to
achieve this feature. It would be good to see what others think of
this approach.
I agree with this approach.
When it comes to the idea of writing logical changes to permanent
files, I think it would also be a good idea (and perhaps could be a
building block of this feature) that we write streamed changes to a
permanent file so that the apply worker can retry to apply them
without retrieving the same changes again from the publisher.
I'm very relieved to hear that.
One question: did you mean that serializing changes into permanent files
can be extended to the non-delay case? I think I will first handle delayed
replication, and then we can consider that later.
What I was thinking of is that we implement non-delay cases (only for
streamed transactions) and then extend it to delay cases (i.e. adding
non-streamed transaction support and the delay mechanism). It might be
helpful if this patch becomes large and this approach can enable us to
reduce the complexity or divide the patch. That being said, I've not
considered this approach enough yet and it's just an idea. Extending
this feature to non-delay cases later also makes sense to me.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Fri, May 12, 2023 at 7:38 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, May 11, 2023 at 2:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Apr 28, 2023 at 2:35 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Dear hackers,
I rebased and refined my PoC. The following are the changes:
1. Is my understanding correct that this patch creates the delay files
for each transaction? If so, did you consider other approaches such as
using one file to avoid creating many files?
2. For streaming transactions, first the changes are written in the
temp file and then moved to the delay file. It seems like there is a
double work. Is it possible to unify it such that when min_apply_delay
is specified, we just use the delay file without sacrificing the
advantages like stream sub-abort can truncate the changes?
3. Ideally, there shouldn't be a performance impact of this feature on
regular transactions because the delay file is created only when
min_apply_delay is active but better to do some testing of the same.
In addition to the points Amit raised, if the 'required_schema' option
is specified in START_REPLICATION, the publisher sends schema
information for every change. I think it leads to significant
overhead. Did you consider alternative approaches such as sending the
schema information for every transaction or the subscriber requests
the publisher to send it?
Why do we need this new flag? I can't see any comments in the related
code which explain its need.
Overall, I think such an approach can address comments by Sawada-San
[1] but not sure if Sawada-San or others have any better ideas to
achieve this feature. It would be good to see what others think of
this approach.
I agree with this approach.
When it comes to the idea of writing logical changes to permanent
files, I think it would also be a good idea (and perhaps could be a
building block of this feature) that we write streamed changes to a
permanent file so that the apply worker can retry to apply them
without retrieving the same changes again from the publisher.
I think we anyway won't be able to send confirmation till we write or
process the commit. If it gets interrupted anytime in between we need
to get all the changes again. I think using Fileset with temp files
has quite a few advantages for streaming as are noted in the header
comments of worker.c. We can investigate replacing that with
permanent files but I don't see that the advantages outweigh the
change. Also, after parallel apply, I am expecting, most users would
prefer that mode for large transactions, so making changes in the
serialized path doesn't seem like a good idea to me.
Having said that, I also thought that it would be a good idea if both
streaming and time-delayed can use the same code path in some way
w.r.t writing to files but couldn't come up with any good idea without
more downsides. I see that Kuroda-San has tried to keep the code path
isolated for this feature but still see that one can question the
implementation approach.
--
With Regards,
Amit Kapila.
On Fri, May 12, 2023 at 10:33 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, May 12, 2023 at 7:38 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, May 11, 2023 at 2:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Apr 28, 2023 at 2:35 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Dear hackers,
I rebased and refined my PoC. The following are the changes:
1. Is my understanding correct that this patch creates the delay files
for each transaction? If so, did you consider other approaches such as
using one file to avoid creating many files?
2. For streaming transactions, first the changes are written in the
temp file and then moved to the delay file. It seems like there is a
double work. Is it possible to unify it such that when min_apply_delay
is specified, we just use the delay file without sacrificing the
advantages like stream sub-abort can truncate the changes?
3. Ideally, there shouldn't be a performance impact of this feature on
regular transactions because the delay file is created only when
min_apply_delay is active but better to do some testing of the same.
In addition to the points Amit raised, if the 'required_schema' option
is specified in START_REPLICATION, the publisher sends schema
information for every change. I think it leads to significant
overhead. Did you consider alternative approaches such as sending the
schema information for every transaction or the subscriber requests
the publisher to send it?
Why do we need this new flag? I can't see any comments in the related
code which explain its need.
So as per the email [1], this would be required after the subscriber
restart. I guess we ideally need it once per delay file (considering
that we have one file for all delayed xacts). In the worst case, we
can have it per transaction as suggested by Sawada-San.
[1]: /messages/by-id/TYAPR01MB5866568A5C1E71338328B20CF5629@TYAPR01MB5866.jpnprd01.prod.outlook.com
--
With Regards,
Amit Kapila.
Dear Amit,
Thank you for giving suggestions.
Dear hackers,
I rebased and refined my PoC. The following are the changes:
1. Is my understanding correct that this patch creates the delay files
for each transaction? If so, did you consider other approaches such as
using one file to avoid creating many files?
I have been analyzing the approach which uses only one file per subscription, per
your suggestion. Currently I'm not sure whether it is a good approach or not, so could
you please give me some feedback?
TL;DR: rotating segment files like WALs may be used, but there are several issues.
# Assumption
* Streamed txns are also serialized to the same permanent file, in the received order.
* No additional sorting is considered.
# Considerations
As a premise, applied txns must be removed from the files, otherwise the disk will
eventually fill up and that leads to a PANIC.
## Naive approach - serialize all the changes to one large file
If workers naively keep appending received changes to the file, it may be
difficult to purge applied txns because there does not seem to be a good way to
truncate the first part of a file. I could not find related functions in fd.h.
## Alternative approach - separate the file into segments
The alternative approach I came up with is that the file is divided into segments
- like WAL - and a segment is removed once all txns written in it are applied. It may
work well in the non-streaming, 1PC case, but may not in other cases.
### Regarding the PREPARE transactions
In this case it is more likely that the segment which contains the
actual txn differs from the segment which contains COMMIT PREPARED. Hence the worker
must check all the remaining segments to find the actual messages. Isn't
that inefficient? Another approach is that workers apply the PREPARE
immediately and spill only COMMIT PREPARED to the file, but in this case the worker
keeps holding the locks and does not release them until the delay is over.
### Regarding the streamed transactions
As for the streaming case, chunks of txns are spread across several segments.
Hence the worker must check all the remaining segments to find the chunk messages,
same as above. Isn't that inefficient too?
Additionally, segments which contain prepared or streamed transactions cannot be
removed, so even with segmentation many files may be generated and left behind.
Anyway, it may be difficult to accept streaming in-progress transactions while
delaying the application. IIUC the motivation of streaming is to reduce the lag
between nodes, which is the opposite of this feature. So it might be okay, not sure.
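To make the purge problem concrete, here is a minimal sketch in Python. It is not taken from the PoC patch: the class names, the fixed "changes per segment" limit, and the bookkeeping are all hypothetical and only model the behaviour described above. A closed segment can be removed only when every transaction that wrote into it has been applied, so one long-lived streamed (or prepared) transaction keeps an old segment alive.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One segment of a hypothetical delay file; remembers which txns wrote to it."""
    seg_no: int
    xids: set = field(default_factory=set)


class SegmentedDelayLog:
    """Toy model: changes are appended in arrival order, segments rotate by count."""

    def __init__(self, changes_per_segment):
        self.changes_per_segment = changes_per_segment
        self.segments = [Segment(0)]
        self.written_in_current = 0
        self.applied = set()

    def append(self, xid):
        """Record one change of transaction `xid`, rotating to a new segment if full."""
        if self.written_in_current == self.changes_per_segment:
            self.segments.append(Segment(self.segments[-1].seg_no + 1))
            self.written_in_current = 0
        self.segments[-1].xids.add(xid)
        self.written_in_current += 1

    def mark_applied(self, xid):
        self.applied.add(xid)

    def purge(self):
        """Drop closed segments whose txns are all applied; return their numbers."""
        current = self.segments[-1]
        removed = [s.seg_no for s in self.segments
                   if s is not current and s.xids <= self.applied]
        self.segments = [s for s in self.segments if s.seg_no not in removed]
        return removed


log = SegmentedDelayLog(changes_per_segment=2)
for xid in (100, 101, 102, 102, 100):   # txn 100 is streamed in two chunks
    log.append(xid)
log.mark_applied(101)
log.mark_applied(102)
first_purge = log.purge()                # segment 0 is pinned by txn 100
remaining = [s.seg_no for s in log.segments]
log.mark_applied(100)
second_purge = log.purge()               # now segment 0 can go
```

In this run only segment 1 is removable at first, because segment 0 still holds an unapplied chunk of txn 100, which is the "segments cannot be removed" effect mentioned above.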
### Regarding the publisher - timing to send schema may be fuzzy
Another issue is that the timing at which the publisher sends the schema information
cannot be determined on the publisher itself. As discussed on hackers, the publisher
must send schema information once per segment file, but segment switching is controlled on
the subscriber side.
I think the walsender cannot recognize when segments change, so it cannot
know when to send the schema.
That's it. I would be very happy to get ideas.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Dear hackers,
At PGCon and other places we have discussed time-delayed logical replication,
but we have now understood that there is no easy way. The following is our analysis.
# Abstract
To implement time-delayed logical replication with a more proper approach,
the worker must serialize all the received messages into permanent files.
But PostgreSQL does not have good infrastructure for this purpose, so a huge engineering effort is needed.
## Review: problem of without-file approach
In the without-file approach, the apply worker process sleeps while delaying the application.
This approach was chosen in earlier versions like [1], but it has problems which were
pointed out by Sawada-san [2]. They can lead to a PANIC error due to a full disk.
A) WALs cannot be recycled on publisher because they are not flushed on subscriber.
B) Moreover, vacuuming cannot remove dead tuples on publisher.
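As a rough illustration of problem A), the amount of WAL the publisher has to retain can be estimated as WAL generation rate times the delay. The sketch below is only back-of-the-envelope arithmetic in Python; the 5 MiB/s rate is a made-up workload figure, and the 8-hour delay is the value used as an example earlier in this thread. It assumes a steady rate and gives a lower bound, since it ignores anything else holding WAL back.

```python
MiB = 1024 ** 2
GiB = 1024 ** 3


def retained_wal_bytes(wal_rate_bytes_per_sec, delay_seconds):
    """Lower bound on WAL kept on the publisher while the subscriber delays."""
    return wal_rate_bytes_per_sec * delay_seconds


# Hypothetical workload: 5 MiB/s of WAL with an 8-hour apply delay.
rate = 5 * MiB
delay = 8 * 3600
retained = retained_wal_bytes(rate, delay)
print(f"{retained / GiB:.3f} GiB must stay on the publisher")
```

Even a moderate write rate therefore needs well over a hundred GiB of spare space on the publisher for a delay measured in hours, which is why the without-file approach risks filling the disk.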
## Alternative approach: serializing messages to files
To prevent these issues, the worker should serialize all incoming messages
to a permanent file, like what the physical walreceiver does.
Here, messages are first written into files at the beginning of transactions and then flushed at the end.
This approach could solve problems A) and B), but it still has many considerations and difficulties.
### How to separate messages into files?
There are two possibilities for dividing messages into files, but neither of them is ideal.
1. Create a file per received transaction.
In this case a file is removed after the delay period has elapsed and the transaction is applied.
This is the simplest approach, but the number of files can bloat.
2. Use one large file or a segmented file (like WAL).
This can reduce the number of files, but we must consider further things:
A) Purge - We must purge applied transactions, but we do not have a good way
to remove one transaction from the large file.
B) 2PC - It is more likely that the segment which contains the actual
transaction differs from the segment which contains COMMIT PREPARED.
Hence the worker must check all the segments to find the actual messages.
C) Streamed in-progress transactions - chunks of transactions are spread
across several segments. Hence the worker must check all the segments to find
the chunk messages, same as above.
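Option 1 can be sketched as follows. This is a simplified Python model, not the PoC code: the class name, the ".delay" file suffix, and the in-memory commit-time map are assumptions made for illustration (a real worker would have to persist the commit times as well). Changes are appended to a per-transaction file, the file is fsynced at commit, and the transaction is applied and its file removed once commit time plus the delay has passed.

```python
import os
import tempfile


class PerTxnDelayStore:
    """Hypothetical store that keeps one file per received transaction."""

    def __init__(self, directory, min_apply_delay):
        self.directory = directory
        self.min_apply_delay = min_apply_delay  # seconds
        # Assumption: kept in memory for brevity, so it would be lost on restart.
        self.commit_times = {}

    def _path(self, xid):
        return os.path.join(self.directory, f"{xid}.delay")

    def write_change(self, xid, change):
        """Append one serialized change to the transaction's file."""
        with open(self._path(xid), "ab") as f:
            f.write(change + b"\n")

    def commit(self, xid, commit_time):
        """Force the file to disk at end of transaction and remember its commit time."""
        with open(self._path(xid), "ab") as f:
            os.fsync(f.fileno())
        self.commit_times[xid] = commit_time

    def apply_due(self, now, apply_fn):
        """Apply, in commit order, every txn whose delay has elapsed; remove its file."""
        due = sorted((t, xid) for xid, t in self.commit_times.items()
                     if t + self.min_apply_delay <= now)
        applied = []
        for _, xid in due:
            with open(self._path(xid), "rb") as f:
                apply_fn(xid, f.read().splitlines())
            os.remove(self._path(xid))
            del self.commit_times[xid]
            applied.append(xid)
        return applied


with tempfile.TemporaryDirectory() as d:
    store = PerTxnDelayStore(d, min_apply_delay=8 * 3600)
    store.write_change(700, b"INSERT 1")
    store.write_change(700, b"INSERT 2")
    store.commit(700, commit_time=0)
    seen = []
    early = store.apply_due(now=3600, apply_fn=lambda x, c: seen.append((x, c)))
    late = store.apply_due(now=8 * 3600, apply_fn=lambda x, c: seen.append((x, c)))
    files_left = os.listdir(d)
```

Purging is trivial here because each file belongs to exactly one transaction, but every transaction received during the delay window leaves a file on disk, which is the file-count bloat mentioned in option 1.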
### Handling the case where the file exceeds the size limit
Regardless of which of the above options is chosen, there is a possibility
that the file size could exceed the file system's limit. This can occur because the
publisher can send transactions of any length.
PostgreSQL provides a mechanism for working with such large files - the BufFile data structure -
but it cannot be used as-is for several reasons:
A) It only supports buffered I/O. A read or write of the low-level File
occurs only when the buffer is filled or emptied. So, we cannot control when it is persisted.
B) It can be used only for temporary purposes. Internally the BufFile creates
some physical files in the $PGDATA/base/pgsql_tmp directory, and files in that
directory will be removed when the postmaster restarts.
C) It does not have mechanisms for restoring information after a restart.
BufFile contains virtual positions such as file index and offset, but these
fields are stored in a memory structure, so the BufFile will forget the ordering
of files and its initial/final position after restarts.
D) It cannot remove a part of a virtual file. Even if a large file is separated
into multiple physical files and all transactions in a physical file are already
applied, BufFile cannot remove only that part.
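Points A) and C) describe what a replacement infrastructure would need: control over when data reaches disk, and positions that survive a restart. The sketch below shows the general idea with ordinary file I/O in Python. It does not use or imitate PostgreSQL's BufFile or fd.c APIs, and the function names, file names, and JSON metadata format are all hypothetical.

```python
import json
import os
import tempfile


def durable_write(path, data):
    """Append data and force it to stable storage; return the new end offset."""
    with open(path, "ab") as f:
        f.write(data)
        f.flush()              # push Python's buffer to the OS
        os.fsync(f.fileno())   # ask the OS to persist it
        return f.tell()


def save_position(meta_path, segment, offset):
    """Persist the (segment, offset) position via write-then-rename."""
    tmp = meta_path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"segment": segment, "offset": offset}, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, meta_path)  # readers see either the old or the new file


def load_position(meta_path):
    """Recover the saved position, e.g. after a restart."""
    with open(meta_path) as f:
        return json.load(f)


with tempfile.TemporaryDirectory() as d:
    seg = os.path.join(d, "delay.0")
    meta = os.path.join(d, "delay.meta")
    end = durable_write(seg, b"BEGIN;change;COMMIT")
    save_position(meta, segment=0, offset=end)
    restored = load_position(meta)
```

The write-then-rename step is a common way to avoid a half-written metadata file; whether something similar would fit into the apply worker is exactly the open engineering question raised here.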
[1]: /messages/by-id/f026292b-c9ee-472e-beaa-d32c5c3a2ced@www.fastmail.com
[2]: /messages/by-id/CAD21AoAeG2+RsUYD9+mEwr8-rrt8R1bqpe56T2D=euO-Qs-GAg@mail.gmail.com
Acknowledgement:
Amit, Peter, Sawada-san
Thank you for discussing with me off-list.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Dear hackers,
At PGCon and other places we have discussed time-delayed logical replication,
but we have now understood that there is no easy way. The following is our
analysis.
At this point, I do not plan to develop the PoC any further unless a better idea
or infrastructure comes along.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED