Interval for launching the table sync worker
Hi all,
While testing the table sync worker for logical replication I noticed that
if the table sync worker fails to insert the data for whatever reason, the
worker process exits with an error, and then the main apply worker launches
the table sync worker again immediately, with no interval. This cycle
repeats at a very high frequency.
Should we put an interval (wal_retrieve_retry_interval, or a new GUC
parameter?) between launches of the table sync worker?
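For illustration, here is a minimal sketch of the kind of relaunch throttle
being asked about, built on the backend's existing timestamp helpers
(GetCurrentTimestamp, TimestampDifferenceExceeds); the function and variable
names are made up for this sketch and are not from any posted patch:

#include "postgres.h"
#include "utils/timestamp.h"

/* Hypothetical helper: allow a relaunch only every retry_interval_ms. */
static TimestampTz last_launch_time = 0;

static bool
may_relaunch_sync_worker(int retry_interval_ms)
{
	TimestampTz now = GetCurrentTimestamp();

	/* Too soon since the previous launch attempt? Then skip this cycle. */
	if (!TimestampDifferenceExceeds(last_launch_time, now, retry_interval_ms))
		return false;

	last_launch_time = now;
	return true;
}

The patch posted later in this thread implements essentially this check,
keyed per relation.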
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
I was thinking the same.
At Thu, 6 Apr 2017 11:33:22 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoDCnyRJDUY=ESVVe68AukvOP2dFomTeBFpAd1TiFbjsGg@mail.gmail.com>
> Should we put an interval (wal_retrieve_retry_interval, or a new GUC
> parameter?) between launches of the table sync worker?

After introducing encoding conversion, untranslatable characters
in a published table cause this situation. An interval can reduce
the frequency of reconnecting, but I think that walreceiver
should refrain from reconnecting at all on an unrecoverable (or
repeating) error in walsender.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Thu, Apr 6, 2017 at 3:45 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> I was thinking the same.
>
>> Should we put an interval (wal_retrieve_retry_interval, or a new GUC
>> parameter?) between launches of the table sync worker?
>
> After introducing encoding conversion, untranslatable characters
> in a published table cause this situation.

I think it's better to make a new GUC parameter for the table sync
worker. Having multiple behaviors in wal_retrieve_retry_interval is
not a good idea. Thoughts?

> An interval can reduce the frequency of reconnecting, but I think
> that walreceiver should refrain from reconnecting at all on an
> unrecoverable (or repeating) error in walsender.

Yeah, that's also an issue worth considering.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
At Thu, 6 Apr 2017 16:15:33 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoCrcwi3SwKKOW_efwW0finxyycLgsbw09n5uy2sxneO_A@mail.gmail.com>
> I think it's better to make a new GUC parameter for the table sync
> worker. Having multiple behaviors in wal_retrieve_retry_interval is
> not a good idea. Thoughts?

I prefer a subscription option to a GUC. Something like the following:

CREATE SUBSCRIPTION s1 CONNECTION 'blah'
    PUBLICATION p1 WITH (noreconnect = true);

Stored in pg_subscription?

>> An interval can reduce the frequency of reconnecting, but I think
>> that walreceiver should refrain from reconnecting at all on an
>> unrecoverable (or repeating) error in walsender.
>
> Yeah, that's also an issue worth considering.

But that is not something to do now.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
At Thu, 06 Apr 2017 17:02:14 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in <20170406.170214.263553093.horiguchi.kyotaro@lab.ntt.co.jp>
> At Thu, 6 Apr 2017 11:33:22 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>> Should we put an interval (wal_retrieve_retry_interval, or a new GUC
>> parameter?) between launches of the table sync worker?

Hmm. I was playing with something wrong. Now I see the interval of
5 seconds. No problem.

>> I think it's better to make a new GUC parameter for the table sync
>> worker. Having multiple behaviors in wal_retrieve_retry_interval is
>> not a good idea. Thoughts?

So, this is working, sorry.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Thu, Apr 6, 2017 at 9:06 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> Hmm. I was playing with something wrong. Now I see the interval of
> 5 seconds. No problem.

Yeah, this issue happens only in the table sync worker.

I've added this as an open item, and sent a patch for this.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On 06/04/17 16:44, Masahiko Sawada wrote:
> On Thu, Apr 6, 2017 at 9:06 PM, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>> I prefer a subscription option to a GUC. Something like the following:
>> CREATE SUBSCRIPTION s1 CONNECTION 'blah'
>>     PUBLICATION p1 WITH (noreconnect = true);
>> Stored in pg_subscription?

I don't think that's a very good solution; you'd lose replication on
every network glitch, upstream server restart, etc.

> I've added this as an open item, and sent a patch for this.

I am not exactly sure what the open item from this thread is. To use
wal_retrieve_retry_interval to limit table sync restarts?
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
At Thu, 6 Apr 2017 18:42:37 +0200, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote in <8c7afb37-be73-c6bd-80bc-e87522f0461a@2ndquadrant.com>
>> I prefer a subscription option to a GUC. Something like the following:
>> CREATE SUBSCRIPTION s1 CONNECTION 'blah'
>>     PUBLICATION p1 WITH (noreconnect = true);
>
> I don't think that's a very good solution; you'd lose replication on
> every network glitch, upstream server restart, etc.

Yes, you're right. This would work if the apply worker could
distinguish permanent errors, but that is overkill for now.

>> I've added this as an open item, and sent a patch for this.
>
> I am not exactly sure what the open item from this thread is. To use
> wal_retrieve_retry_interval to limit table sync restarts?

That wasn't me. I also don't think this is critical.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Fri, Apr 7, 2017 at 1:23 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>> I am not exactly sure what the open item from this thread is. To use
>> wal_retrieve_retry_interval to limit table sync restarts?
>
> That wasn't me. I also don't think this is critical.

Thank you for the comment.
It's not critical, but it could be a problem, so I thought we should
fix it before the PostgreSQL 10 release. If it's not appropriate as an
open item, I'll remove it.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On 4/7/17 01:10, Masahiko Sawada wrote:
> It's not critical, but it could be a problem, so I thought we should
> fix it before the PostgreSQL 10 release. If it's not appropriate as
> an open item, I'll remove it.

You wrote that you "sent" a patch, but I don't see a patch anywhere.
I think a nonintrusive patch for this could be considered.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sat, Apr 8, 2017 at 8:13 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> You wrote that you "sent" a patch, but I don't see a patch anywhere.
> I think a nonintrusive patch for this could be considered.

Oops, I made a mistake. I'll send a patch tomorrow.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Sun, Apr 9, 2017 at 6:25 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> Oops, I made a mistake. I'll send a patch tomorrow.

I've attached the patch. It introduces a GUC parameter,
table_sync_retry_interval, which controls the interval between
launches of the table sync worker process.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Attachment: table_sync_retry_interval.patch
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 744c5e8..5a0e467 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3474,6 +3474,21 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
</listitem>
</varlistentry>
+ <varlistentry id="guc-table-sync-retry-interval" xreflabel="table_sync_retry_interval">
+ <term><varname>table_sync_retry_interval</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>table_sync_retry_interval</> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Specify how long the subscriber should wait before retrying to copy the initial
+ data after a failed attempt. The default is 5 seconds. Units
+ are milliseconds if not specified.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c
index fecff93..0ec57a4 100644
--- a/src/backend/replication/logical/launcher.c
+++ b/src/backend/replication/logical/launcher.c
@@ -58,6 +58,7 @@
int max_logical_replication_workers = 4;
int max_sync_workers_per_subscription = 2;
+int table_sync_retry_interval = 5000;
LogicalRepWorker *MyLogicalRepWorker = NULL;
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index a067fe3..c8637f9 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -395,11 +395,21 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
*/
else if (!syncworker && nsyncworkers < max_sync_workers_per_subscription)
{
- logicalrep_worker_launch(MyLogicalRepWorker->dbid,
- MySubscription->oid,
- MySubscription->name,
- MyLogicalRepWorker->userid,
- rstate->relid);
+ static TimestampTz last_start_time = 0;
+ TimestampTz now;
+
+ now = GetCurrentTimestamp();
+
+ if (TimestampDifferenceExceeds(last_start_time, now,
+ table_sync_retry_interval))
+ {
+ logicalrep_worker_launch(MyLogicalRepWorker->dbid,
+ MySubscription->oid,
+ MySubscription->name,
+ MyLogicalRepWorker->userid,
+ rstate->relid);
+ last_start_time = now;
+ }
}
}
}
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 19d258d..7e32d57 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -2621,6 +2621,18 @@ static struct config_int ConfigureNamesInt[] =
},
{
+ {"table_sync_retry_interval", PGC_SIGHUP, REPLICATION_STANDBY,
+ gettext_noop("Sets the time to wait before retrying to launch table "
+ "synchronization worker after a failed attempt."),
+ NULL,
+ GUC_UNIT_MS
+ },
+ &table_sync_retry_interval,
+ 5000, 1, INT_MAX,
+ NULL, NULL, NULL
+ },
+
+ {
{"wal_retrieve_retry_interval", PGC_SIGHUP, REPLICATION_STANDBY,
gettext_noop("Sets the time to wait before retrying to retrieve WAL "
"after a failed attempt."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 512be0a..bf2af18 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -272,6 +272,8 @@
# in milliseconds; 0 disables
#wal_retrieve_retry_interval = 5s # time to wait before retrying to
# retrieve WAL after a failed attempt
+#table_sync_retry_interval = 5s # time to wait before retrying to
+ # synchronize table after a failed attempt
#------------------------------------------------------------------------------
diff --git a/src/include/replication/logicallauncher.h b/src/include/replication/logicallauncher.h
index 060946a..e9ea7da 100644
--- a/src/include/replication/logicallauncher.h
+++ b/src/include/replication/logicallauncher.h
@@ -14,6 +14,7 @@
extern int max_logical_replication_workers;
extern int max_sync_workers_per_subscription;
+extern int table_sync_retry_interval;
extern void ApplyLauncherRegister(void);
extern void ApplyLauncherMain(Datum main_arg);
On 10/04/17 11:02, Masahiko Sawada wrote:
> I've attached the patch. It introduces a GUC parameter,
> table_sync_retry_interval, which controls the interval between
> launches of the table sync worker process.

I don't think the solution is quite this simple. This will cause all
table sync workers to be delayed, which means concurrency will suffer
and the initial sync of all tables will take much longer, especially
if there is little data. We need a way to either detect whether we are
launching the same worker that was already launched before, or
alternatively whether we are launching a crashed worker, and only then
apply the delay.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 4/10/17 08:10, Petr Jelinek wrote:
> I don't think the solution is quite this simple. This will cause all
> table sync workers to be delayed, which means concurrency will suffer
> and the initial sync of all tables will take much longer, especially
> if there is little data. We need a way to either detect whether we
> are launching the same worker that was already launched before, or
> alternatively whether we are launching a crashed worker, and only
> then apply the delay.

Perhaps instead of a global last_start_time, we could store a
per-relation last_start_time in SubscriptionRelState?
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Apr 12, 2017 at 1:28 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> Perhaps instead of a global last_start_time, we could store a
> per-relation last_start_time in SubscriptionRelState?

I was thinking the same. But a problem is that the list of
SubscriptionRelState entries is refreshed whenever the syncing table
state becomes invalid (table_states_valid = false). I guess we need to
improve this logic, including GetSubscriptionNotReadyRelations().
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On 4/12/17 00:48, Masahiko Sawada wrote:
>> Perhaps instead of a global last_start_time, we could store a
>> per-relation last_start_time in SubscriptionRelState?
>
> I was thinking the same. But a problem is that the list of
> SubscriptionRelState entries is refreshed whenever the syncing table
> state becomes invalid (table_states_valid = false). I guess we need
> to improve this logic, including GetSubscriptionNotReadyRelations().

The table states are invalidated on a syscache callback from
pg_subscription_rel, which happens, roughly speaking, when a table
finishes the initial sync. So if we're worried about failing tablesync
workers relaunching too quickly, this would only be a problem if a
tablesync of another table finishes right in that restart window. That
doesn't seem like a terrible issue to me.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Apr 12, 2017 at 11:46 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> The table states are invalidated on a syscache callback from
> pg_subscription_rel, which happens, roughly speaking, when a table
> finishes the initial sync. So if we're worried about failing
> tablesync workers relaunching too quickly, this would only be a
> problem if a tablesync of another table finishes right in that
> restart window. That doesn't seem like a terrible issue to me.

I think the table states are invalidated whenever the table sync
worker starts, because the table sync worker updates its status in
pg_subscription_rel and commits it before starting the actual copy, so
we cannot rely on that. I thought we could store last_start_time in
pg_subscription_rel, but that might be overkill. I'm now thinking of
changing GetSubscriptionNotReadyRelations so that last_start_time in
SubscriptionRelState is carried over to the new list.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Thu, Apr 13, 2017 at 11:53 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> I'm now thinking of changing GetSubscriptionNotReadyRelations so that
> last_start_time in SubscriptionRelState is carried over to the new
> list.

Attached is the latest patch. It turned out not to be necessary to
change GetSubscriptionNotReadyRelations; I just changed the logic that
refreshes the sync table state list.
Please give me feedback.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Attachment: table_sync_retry_interval_v2.patch
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 744c5e8..5a0e467 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3474,6 +3474,21 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
</listitem>
</varlistentry>
+ <varlistentry id="guc-table-sync-retry-interval" xreflabel="table_sync_retry_interval">
+ <term><varname>table_sync_retry_interval</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>table_sync_retry_interval</> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Specify how long the subscriber should wait before retrying to copy the initial
+ data after a failed attempt. The default is 5 seconds. Units
+ are milliseconds if not specified.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index f5ba9f6..573e23f 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -480,6 +480,7 @@ GetSubscriptionNotReadyRelations(Oid subid)
relstate->relid = subrel->srrelid;
relstate->state = subrel->srsubstate;
relstate->lsn = subrel->srsublsn;
+ relstate->last_start_time = 0;
res = lappend(res, relstate);
}
diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c
index 7ba239c..1c260d9 100644
--- a/src/backend/replication/logical/launcher.c
+++ b/src/backend/replication/logical/launcher.c
@@ -58,6 +58,7 @@
int max_logical_replication_workers = 4;
int max_sync_workers_per_subscription = 2;
+int table_sync_retry_interval = 5000;
LogicalRepWorker *MyLogicalRepWorker = NULL;
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index d1f2734..e7019fb 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -237,7 +237,10 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
*
* If there are tables that need synchronizing and are not being synchronized
* yet, start sync workers for them (if there are free slots for sync
- * workers).
+ * workers). To prevent starting the sync worker for the same relation at a
+ * high frequency after failures, we store its last start time in each sync
+ * state info entry. We start the sync worker for the same relation at intervals
+ * of table_sync_retry_interval.
*
* For tables that are being synchronized already, check if sync workers
* either need action from the apply worker or have finished.
@@ -265,11 +268,12 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
{
MemoryContext oldctx;
List *rstates;
+ List *old_table_states;
ListCell *lc;
SubscriptionRelState *rstate;
- /* Clean the old list. */
- list_free_deep(table_states);
+ /* Save old sync state info list, clean the old list later */
+ old_table_states = table_states;
table_states = NIL;
StartTransactionCommand();
@@ -281,14 +285,43 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
oldctx = MemoryContextSwitchTo(CacheMemoryContext);
foreach(lc, rstates)
{
+ ListCell *old_lc;
+ ListCell *old_prev = NULL;
+ ListCell *old_next = NULL;
+
rstate = palloc(sizeof(SubscriptionRelState));
memcpy(rstate, lfirst(lc), sizeof(SubscriptionRelState));
+
+ /*
+ * Iterate old sync state info to take over last_start_time
+ * to new entries.
+ */
+ for (old_lc = list_head(old_table_states); old_lc != NULL; old_lc = old_next)
+ {
+ SubscriptionRelState *s = (SubscriptionRelState *) lfirst(old_lc);
+
+ old_next = lnext(old_lc);
+
+ /* Found the sync state info, take over last_start time */
+ if (rstate->relid == s->relid)
+ {
+ rstate->last_start_time = s->last_start_time;
+ old_table_states = list_delete_cell(old_table_states, old_lc, old_prev);
+ break;
+ }
+
+ old_prev = old_lc;
+ }
+
table_states = lappend(table_states, rstate);
}
MemoryContextSwitchTo(oldctx);
CommitTransactionCommand();
+ /* Clean old list */
+ list_free_deep(old_table_states);
+
table_states_valid = true;
}
@@ -395,11 +428,20 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
*/
else if (!syncworker && nsyncworkers < max_sync_workers_per_subscription)
{
- logicalrep_worker_launch(MyLogicalRepWorker->dbid,
- MySubscription->oid,
- MySubscription->name,
- MyLogicalRepWorker->userid,
- rstate->relid);
+ TimestampTz now;
+
+ now = GetCurrentTimestamp();
+
+ if (TimestampDifferenceExceeds(rstate->last_start_time, now,
+ table_sync_retry_interval))
+ {
+ logicalrep_worker_launch(MyLogicalRepWorker->dbid,
+ MySubscription->oid,
+ MySubscription->name,
+ MyLogicalRepWorker->userid,
+ rstate->relid);
+ rstate->last_start_time = now;
+ }
}
}
}
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 9ad8361..8bca324 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -2623,6 +2623,18 @@ static struct config_int ConfigureNamesInt[] =
},
{
+ {"table_sync_retry_interval", PGC_SIGHUP, REPLICATION_STANDBY,
+ gettext_noop("Sets the time to wait before retrying to launch table "
+ "synchronization worker after a failed attempt."),
+ NULL,
+ GUC_UNIT_MS
+ },
+ &table_sync_retry_interval,
+ 5000, 1, INT_MAX,
+ NULL, NULL, NULL
+ },
+
+ {
{"wal_retrieve_retry_interval", PGC_SIGHUP, REPLICATION_STANDBY,
gettext_noop("Sets the time to wait before retrying to retrieve WAL "
"after a failed attempt."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 1435d92..632aa08 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -271,6 +271,8 @@
# in milliseconds; 0 disables
#wal_retrieve_retry_interval = 5s # time to wait before retrying to
# retrieve WAL after a failed attempt
+#table_sync_retry_interval = 5s # time to wait before retrying to
+ # synchronize table after a failed attempt
# - Subscribers -
diff --git a/src/include/catalog/pg_subscription_rel.h b/src/include/catalog/pg_subscription_rel.h
index 9f4f152..d8264f8 100644
--- a/src/include/catalog/pg_subscription_rel.h
+++ b/src/include/catalog/pg_subscription_rel.h
@@ -66,6 +66,7 @@ typedef struct SubscriptionRelState
Oid relid;
XLogRecPtr lsn;
char state;
+ TimestampTz last_start_time;
} SubscriptionRelState;
extern Oid SetSubscriptionRelState(Oid subid, Oid relid, char state,
diff --git a/src/include/replication/logicallauncher.h b/src/include/replication/logicallauncher.h
index 060946a..e9ea7da 100644
--- a/src/include/replication/logicallauncher.h
+++ b/src/include/replication/logicallauncher.h
@@ -14,6 +14,7 @@
extern int max_logical_replication_workers;
extern int max_sync_workers_per_subscription;
+extern int table_sync_retry_interval;
extern void ApplyLauncherRegister(void);
extern void ApplyLauncherMain(Datum main_arg);
I confused the sync and apply workers: a sync worker failure at start
causes immediate retries.
At Thu, 13 Apr 2017 11:53:27 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoCR6eHgk0vaHShjO4Bre_VDKjHUbL9EuWHaUgRPSPPyVQ@mail.gmail.com>
> I'm now thinking of changing GetSubscriptionNotReadyRelations so that
> last_start_time in SubscriptionRelState is carried over to the new
> list.

This resolves the problem, but if I understand correctly, the many
pallocs in process_syncing_tables_for_apply() work in ApplyContext,
and that context is reset before the next visit here (in
LogicalRepApplyLoop). Although this is not a problem with this patch,
it is a problem in general.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Ouch! I replied to the wrong mail.
At Thu, 13 Apr 2017 19:55:04 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in <20170413.195504.89348773.horiguchi.kyotaro@lab.ntt.co.jp>
> I confused the sync and apply workers: a sync worker failure at start
> causes immediate retries.

The right target of "This" below is the message at the following URL:
/messages/by-id/CAD21AoBt_XUdppddFak661_LBM2t3CfK52aLKHG+ekd7SkzLmg@mail.gmail.com

> This resolves the problem, but if I understand correctly, the many
> pallocs in process_syncing_tables_for_apply() work in ApplyContext,
> and that context is reset before the next visit here (in
> LogicalRepApplyLoop). Although this is not a problem with this patch,
> it is a problem in general.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
On 13/04/17 13:01, Kyotaro HORIGUCHI wrote:
>> This resolves the problem, but if I understand correctly, the many
>> pallocs in process_syncing_tables_for_apply() work in ApplyContext,
>> and that context is reset before the next visit here (in
>> LogicalRepApplyLoop). Although this is not a problem with this patch,
>> it is a problem in general.

Huh? We explicitly switch to CacheMemoryContext before pallocing
anything that should survive long term.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 13/04/17 12:23, Masahiko Sawada wrote:
> Attached is the latest patch. It turned out not to be necessary to
> change GetSubscriptionNotReadyRelations; I just changed the logic
> that refreshes the sync table state list.
> Please give me feedback.

Hmm, this might work. Although I was actually wondering if we could
store the last start timestamp in the worker shared memory and do some
magic with that (i.e., not clearing subid and relid, and then doing
rate limiting based on the subid+relid+timestamp stored in shmem).
That would then work the same way for the main apply workers as well.
It would have the disadvantage that if some tables were consistently
failing, no other tables could get synchronized, as the number of
workers is limited. OTOH, the approach chosen in the patch will first
try all tables and only then start rate limiting; I'm not quite sure
which is better.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
At Thu, 13 Apr 2017 20:02:33 +0200, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote in <a06d19af-a63a-c546-873c-818b26f4ef10@2ndquadrant.com>
>> Although this is not a problem with this patch, it is a problem in
>> general.
>
> Huh? We explicitly switch to CacheMemoryContext before pallocing
> anything that should survive long term.

Ah, yes, sorry for the noise.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Fri, Apr 14, 2017 at 7:09 AM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:
> Hmm, this might work. Although I was actually wondering if we could
> store the last start timestamp in the worker shared memory and do
> some magic with that (i.e., not clearing subid and relid, and then
> doing rate limiting based on the subid+relid+timestamp stored in
> shmem). That would then work the same way for the main apply workers
> as well. It would have the disadvantage that if some tables were
> consistently failing, no other tables could get synchronized, as the
> number of workers is limited.

Hmm, I guess it's not a good design for a table sync worker and an
apply worker for a relation to take sole possession of a worker slot
until they succeed. They would get in each other's way.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On 14/04/17 12:18, Masahiko Sawada wrote:
> Hmm, I guess it's not a good design for a table sync worker and an
> apply worker for a relation to take sole possession of a worker slot
> until they succeed. They would get in each other's way.

I'm not sure what you mean by an apply worker for a relation; I meant
the main subscription apply worker, and the "per-relation apply
worker" is the same as the tablesync worker.

Thinking about the possible failure scenarios more, I guess you are
right. If there is a "global" event for the publisher/subscriber
connection (i.e., the network goes down or something), it will affect
the main worker as well, so it won't launch anything. If there is a
table-specific issue, it will affect only that table.

The corner cases with doing it per table, where we would launch
tablesync needlessly, are pretty much just a) the connection limit on
the publisher being reached, and b) the slot limit on the publisher
being reached; that's probably okay. We might be able to catch these
two specifically and do some global limiting based on them (something
like your original patch, except the start-time global variable would
only be set on a specific error and stored in shmem), but that does
not need to be solved immediately.
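To make the classification concrete, a very rough sketch follows; how the
launching side would actually learn the failed worker's error code is left
open here, the function name is hypothetical, and only the error codes
themselves are standard PostgreSQL ones:

#include "postgres.h"

/*
 * Sketch: only these "global" failures would arm a shared, cluster-wide
 * backoff; anything else stays a per-table problem.
 */
static bool
failure_warrants_global_backoff(int sqlerrcode)
{
	switch (sqlerrcode)
	{
		case ERRCODE_TOO_MANY_CONNECTIONS:			/* connection limit on the publisher */
		case ERRCODE_CONFIGURATION_LIMIT_EXCEEDED:	/* e.g. no free replication slots */
			return true;
		default:
			return false;
	}
}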
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Apr 14, 2017 at 9:14 PM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:
> I'm not sure what you mean by an apply worker for a relation; I meant
> the main subscription apply worker, and the "per-relation apply
> worker" is the same as the tablesync worker.

Oops, I made a mistake. I meant just the apply worker.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On 4/13/17 06:23, Masahiko Sawada wrote:
> Attached is the latest patch. It turned out not to be necessary to
> change GetSubscriptionNotReadyRelations; I just changed the logic
> that refreshes the sync table state list.
> Please give me feedback.

I think this is a reasonable direction, but do we really need
table_sync_retry_interval separate from wal_retrieve_retry_interval?
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 4/13/17 18:09, Petr Jelinek wrote:
It would have the
disadvantage that if some tables were consistently failing, no other
tables could get synchronized as the amount of workers is limited. OTOH
the approach chosen in the patch will first try all tables and only then
start rate limiting, not quite sure which is better.
I think the latter is better.
One of the scenarios cited originally somewhere was one table failing
because of an encoding issue. So it's useful to allow other tables to
continue.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Apr 06, 2017 at 11:33:22AM +0900, Masahiko Sawada wrote:
While testing table sync worker for logical replication I noticed that
if the table sync worker of logical replication failed to insert the
data for whatever reason, the table sync worker process exits with
error. And then the main apply worker launches the table sync worker
again soon without interval. This routine is executed at very high
frequency without interval.

Should we do put a interval (wal_retrieve_interval or make a new GUC
parameter?) for launching the table sync worker?
[Action required within three days. This is a generic notification.]
The above-described topic is currently a PostgreSQL 10 open item. Peter,
since you committed the patch believed to have created it, you own this open
item. If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know. Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message. Include a date for your subsequent status update. Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10. Consequently, I will appreciate your efforts
toward speedy resolution. Thanks.
[1]: /messages/by-id/20170404140717.GA2675809@tornado.leadboat.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 4/13/17 06:23, Masahiko Sawada wrote:
Attached the latest patch. It didn't actually necessary to change
GetSubscriptionNotReadyRelations. I just changed the logic refreshing
the sync table state list.
I think this was the right direction, but then I got worried about
having a loop within a loop to copy over the last start times. If you
have very many tables, that could be a big nested loop.
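For concreteness, the "loop within a loop" being described would look
roughly like the sketch below: carrying last_start_time over from the old
table_states list into the newly built one. This is only an illustration
using the last_start_time field proposed upthread, not the code from the
posted patch.

    ListCell   *lc;

    foreach(lc, new_states)             /* O(n) over the new list */
    {
        SubscriptionRelState *new_rstate = (SubscriptionRelState *) lfirst(lc);
        ListCell   *olc;

        foreach(olc, old_states)        /* O(n) again for every new entry */
        {
            SubscriptionRelState *old_rstate = (SubscriptionRelState *) lfirst(olc);

            if (old_rstate->relid == new_rstate->relid)
            {
                new_rstate->last_start_time = old_rstate->last_start_time;
                break;
            }
        }
    }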
Here is an alternative proposal to store the last start times in a hash
table.
(We also might want to lower the interval for the test suite, because
there are intentional failures in there, and this patch or one like it
will cause the tests to run a few seconds longer.)
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
0001-Wait-between-tablesync-worker-restarts.patch
From a6eb4a795c6e1e9347e0f33ea426635bd9a8afac Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter_e@gmx.net>
Date: Tue, 18 Apr 2017 10:18:04 -0400
Subject: [PATCH] Wait between tablesync worker restarts
Before restarting a tablesync worker for the same relation, wait
wal_retrieve_retry_interval. This avoids restarting failing workers in
a tight loop.
---
src/backend/replication/logical/tablesync.c | 44 +++++++++++++++++++++++++----
1 file changed, 38 insertions(+), 6 deletions(-)
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 108326bef1..88d01adbd2 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -245,7 +245,10 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
*
* If there are tables that need synchronizing and are not being synchronized
* yet, start sync workers for them (if there are free slots for sync
- * workers).
+ * workers). To prevent starting the sync worker for the same relation at a
+ * high frequency after a failure, we store its last start time with each sync
+ * state info. We start the sync worker for the same relation after waiting
+ * at least wal_retrieve_retry_interval.
*
* For tables that are being synchronized already, check if sync workers
* either need action from the apply worker or have finished.
@@ -263,11 +266,28 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
static void
process_syncing_tables_for_apply(XLogRecPtr current_lsn)
{
+ struct tablesync_start_time_mapping
+ {
+ Oid relid;
+ TimestampTz last_start_time;
+ };
static List *table_states = NIL;
+ static HTAB *last_start_times = NULL;
ListCell *lc;
Assert(!IsTransactionState());
+ if (!last_start_times)
+ {
+ HASHCTL ctl;
+
+ memset(&ctl, 0, sizeof(ctl));
+ ctl.keysize = sizeof(Oid);
+ ctl.entrysize = sizeof(struct tablesync_start_time_mapping);
+ last_start_times = hash_create("Logical replication table sync worker start times",
+ 256, &ctl, HASH_ELEM | HASH_BLOBS);
+ }
+
/* We need up to date sync state info for subscription tables here. */
if (!table_states_valid)
{
@@ -403,11 +423,23 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
*/
else if (!syncworker && nsyncworkers < max_sync_workers_per_subscription)
{
- logicalrep_worker_launch(MyLogicalRepWorker->dbid,
- MySubscription->oid,
- MySubscription->name,
- MyLogicalRepWorker->userid,
- rstate->relid);
+ TimestampTz now = GetCurrentTimestamp();
+ struct tablesync_start_time_mapping *hentry;
+ bool found;
+
+ hentry = hash_search(last_start_times, &rstate->relid, HASH_ENTER, &found);
+
+ if (!found ||
+ TimestampDifferenceExceeds(hentry->last_start_time, now,
+ wal_retrieve_retry_interval))
+ {
+ logicalrep_worker_launch(MyLogicalRepWorker->dbid,
+ MySubscription->oid,
+ MySubscription->name,
+ MyLogicalRepWorker->userid,
+ rstate->relid);
+ hentry->last_start_time = now;
+ }
}
}
}
--
2.12.2
On 18/04/17 16:22, Peter Eisentraut wrote:
On 4/13/17 06:23, Masahiko Sawada wrote:
Attached the latest patch. It didn't actually necessary to change
GetSubscriptionNotReadyRelations. I just changed the logic refreshing
the sync table state list.

I think this was the right direction, but then I got worried about
having a loop within a loop to copy over the last start times. If you
have very many tables, that could be a big nested loop.

Here is an alternative proposal to store the last start times in a hash
table.
Hmm if we create hashtable for this, I'd say create hashtable for the
whole table_states then. The reason why it's list now was that it seemed
unnecessary to have hashtable when it will be empty almost always but
there is no need to have both hashtable + list IMHO.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Apr 18, 2017 at 11:22 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
On 4/13/17 06:23, Masahiko Sawada wrote:
Attached the latest patch. It didn't actually necessary to change
GetSubscriptionNotReadyRelations. I just changed the logic refreshing
the sync table state list.

I think this was the right direction, but then I got worried about
having a loop within a loop to copy over the last start times. If you
have very many tables, that could be a big nested loop.

Here is an alternative proposal to store the last start times in a hash
table.
If we use wal_retrieve_retry_interval for the table sync worker, I
think we need to update the documentation as well. Currently the
documentation mentions it briefly, but since
wal_retrieve_retry_interval will be used in three different places for
different reasons, it could confuse the user.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 4/18/17 11:59, Petr Jelinek wrote:
Hmm if we create hashtable for this, I'd say create hashtable for the
whole table_states then. The reason why it's list now was that it seemed
unnecessary to have hashtable when it will be empty almost always but
there is no need to have both hashtable + list IMHO.
The difference is that we blow away the list of states when the catalog
changes, but we keep the hash table with the start times around. We
need two things with different life times.
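To spell out the two lifetimes, simplified from the patch upthread:

    static List *table_states = NIL;        /* rebuilt from pg_subscription_rel
                                             * whenever table_states_valid is
                                             * cleared by the syscache callback */
    static HTAB *last_start_times = NULL;   /* created once, keyed by relid;
                                             * survives catalog invalidations,
                                             * so the restart rate limiting
                                             * keeps its history */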
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 18/04/17 18:14, Peter Eisentraut wrote:
On 4/18/17 11:59, Petr Jelinek wrote:
Hmm if we create hashtable for this, I'd say create hashtable for the
whole table_states then. The reason why it's list now was that it seemed
unnecessary to have hashtable when it will be empty almost always but
there is no need to have both hashtable + list IMHO.

The difference is that we blow away the list of states when the catalog
changes, but we keep the hash table with the start times around. We
need two things with different life times.
Why can't we just update the hashtable based on the catalog? I mean once
the record is not needed in the list, the table has been synced so there
is no need for the timestamp either since we'll not try to start the
worker again.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
At Tue, 18 Apr 2017 18:40:56 +0200, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote in <f64d87d1-bef3-5e3e-a999-ba302816a0ee@2ndquadrant.com>
On 18/04/17 18:14, Peter Eisentraut wrote:
On 4/18/17 11:59, Petr Jelinek wrote:
Hmm if we create hashtable for this, I'd say create hashtable for the
whole table_states then. The reason why it's list now was that it seemed
unnecessary to have hashtable when it will be empty almost always but
there is no need to have both hashtable + list IMHO.
I understand that, but I also don't like the frequent palloc/pfree
in a long-lasting context, nor the double loop, like Peter.
The difference is that we blow away the list of states when the catalog
changes, but we keep the hash table with the start times around. We
need two things with different life times.
On the other hand, a hash seems overdone. In addition to that, the
hash version leaks stale entries while subscriptions are
modified, and vacuuming them out is costly.
Why can't we just update the hashtable based on the catalog? I mean once
the record is not needed in the list, the table has been synced so there
is no need for the timestamp either since we'll not try to start the
worker again.
Considering the anticipated complexity when many subscriptions
exist in the list, and the complexity of removing stale entries from the
hash, I'm inclined to propose just adding 'worker_start'
to pg_subscription_rel. We already have similar columns in
pg_stat_activity and pg_stat_replication.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Apr 19, 2017 at 5:12 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
At Tue, 18 Apr 2017 18:40:56 +0200, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote in <f64d87d1-bef3-5e3e-a999-ba302816a0ee@2ndquadrant.com>
On 18/04/17 18:14, Peter Eisentraut wrote:
On 4/18/17 11:59, Petr Jelinek wrote:
Hmm if we create hashtable for this, I'd say create hashtable for the
whole table_states then. The reason why it's list now was that it seemed
unnecessary to have hashtable when it will be empty almost always but
there is no need to have both hashtable + list IMHO.I understant that but I also don't like the frequent palloc/pfree
in long-lasting context and double loop like Peter.The difference is that we blow away the list of states when the catalog
changes, but we keep the hash table with the start times around. We
need two things with different life times.On the other hand, hash seems overdone. Addition to that, the
hash-version leaks stale entries while subscriptions are
modified. But vacuuming them costs high.Why can't we just update the hashtable based on the catalog? I mean once
the record is not needed in the list, the table has been synced so there
is no need for the timestamp either since we'll not try to start the
worker again.
I guess the table sync worker for the same table could need to be
started again. For example, please imagine a case where a table
belonging to the publication is removed from it and the corresponding
subscription is refreshed, and then the table is added to it again. We
would still have the record of the table with its timestamp in the hash
table from the first table sync, so the table sync after the refresh
could have to wait for the interval.
Considering the anticipated complexity when many subscriptions
exist in the list, and complexity to remove stale entries in the
hash, I'm inclining toward poroposing just to add 'worker_start'
in pg_subscription_rel. We already have the similars in
pg_stat_activity and pg_stat_replication.
I was thinking the same. But I'm concerned that the last start time of
the table sync worker is not particularly useful information for the
user, and we already have a similar value in pg_stat_activity
(pg_stat_replication.backend_start is actually taken from
pg_stat_activity.backend_start). I'm not sure whether it's a good idea
to show slightly shifted timestamps having the same meaning in two
system views.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 19/04/17 14:42, Masahiko Sawada wrote:
On Wed, Apr 19, 2017 at 5:12 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:At Tue, 18 Apr 2017 18:40:56 +0200, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote in <f64d87d1-bef3-5e3e-a999-ba302816a0ee@2ndquadrant.com>
On 18/04/17 18:14, Peter Eisentraut wrote:
On 4/18/17 11:59, Petr Jelinek wrote:
Hmm if we create hashtable for this, I'd say create hashtable for the
whole table_states then. The reason why it's list now was that it seemed
unnecessary to have hashtable when it will be empty almost always but
there is no need to have both hashtable + list IMHO.I understant that but I also don't like the frequent palloc/pfree
in long-lasting context and double loop like Peter.The difference is that we blow away the list of states when the catalog
changes, but we keep the hash table with the start times around. We
need two things with different life times.On the other hand, hash seems overdone. Addition to that, the
hash-version leaks stale entries while subscriptions are
modified. But vacuuming them costs high.Why can't we just update the hashtable based on the catalog? I mean once
the record is not needed in the list, the table has been synced so there
is no need for the timestamp either since we'll not try to start the
worker again.I guess the table sync worker for the same table could need to be
started again. For example, please image a case where the table
belonging to the publication is removed from it and the corresponding
subscription is refreshed, and then the table is added to it again. We
have the record of the table with timestamp in the hash table when the
table sync in the first time, but the table sync after refreshed could
have to wait for the interval.
But why do we want to wait in such a case, where the user has explicitly
requested a refresh?
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Apr 19, 2017 at 10:07 PM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:
On 19/04/17 14:42, Masahiko Sawada wrote:
On Wed, Apr 19, 2017 at 5:12 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:At Tue, 18 Apr 2017 18:40:56 +0200, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote in <f64d87d1-bef3-5e3e-a999-ba302816a0ee@2ndquadrant.com>
On 18/04/17 18:14, Peter Eisentraut wrote:
On 4/18/17 11:59, Petr Jelinek wrote:
Hmm if we create hashtable for this, I'd say create hashtable for the
whole table_states then. The reason why it's list now was that it seemed
unnecessary to have hashtable when it will be empty almost always but
there is no need to have both hashtable + list IMHO.I understant that but I also don't like the frequent palloc/pfree
in long-lasting context and double loop like Peter.The difference is that we blow away the list of states when the catalog
changes, but we keep the hash table with the start times around. We
need two things with different life times.On the other hand, hash seems overdone. Addition to that, the
hash-version leaks stale entries while subscriptions are
modified. But vacuuming them costs high.Why can't we just update the hashtable based on the catalog? I mean once
the record is not needed in the list, the table has been synced so there
is no need for the timestamp either since we'll not try to start the
worker again.I guess the table sync worker for the same table could need to be
started again. For example, please image a case where the table
belonging to the publication is removed from it and the corresponding
subscription is refreshed, and then the table is added to it again. We
have the record of the table with timestamp in the hash table when the
table sync in the first time, but the table sync after refreshed could
have to wait for the interval.But why do we want to wait in such case where user has explicitly
requested refresh?
Yeah, sorry, I meant that we don't want to wait but cannot launch the
tablesync worker in such a case.
But after more thought, the latest patch Peter proposed has the same
problem. Perhaps we need to always scan the whole pg_subscription_rel and
remove the entry once the corresponding table gets synced.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 19/04/17 15:57, Masahiko Sawada wrote:
On Wed, Apr 19, 2017 at 10:07 PM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:On 19/04/17 14:42, Masahiko Sawada wrote:
On Wed, Apr 19, 2017 at 5:12 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:At Tue, 18 Apr 2017 18:40:56 +0200, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote in <f64d87d1-bef3-5e3e-a999-ba302816a0ee@2ndquadrant.com>
On 18/04/17 18:14, Peter Eisentraut wrote:
On 4/18/17 11:59, Petr Jelinek wrote:
Hmm if we create hashtable for this, I'd say create hashtable for the
whole table_states then. The reason why it's list now was that it seemed
unnecessary to have hashtable when it will be empty almost always but
there is no need to have both hashtable + list IMHO.I understant that but I also don't like the frequent palloc/pfree
in long-lasting context and double loop like Peter.The difference is that we blow away the list of states when the catalog
changes, but we keep the hash table with the start times around. We
need two things with different life times.On the other hand, hash seems overdone. Addition to that, the
hash-version leaks stale entries while subscriptions are
modified. But vacuuming them costs high.Why can't we just update the hashtable based on the catalog? I mean once
the record is not needed in the list, the table has been synced so there
is no need for the timestamp either since we'll not try to start the
worker again.I guess the table sync worker for the same table could need to be
started again. For example, please image a case where the table
belonging to the publication is removed from it and the corresponding
subscription is refreshed, and then the table is added to it again. We
have the record of the table with timestamp in the hash table when the
table sync in the first time, but the table sync after refreshed could
have to wait for the interval.But why do we want to wait in such case where user has explicitly
requested refresh?Yeah, sorry, I meant that we don't want to wait but cannot launch the
tablesync worker in such case.But after more thought, the latest patch Peter proposed has the same
problem. Perhaps we need to scan always whole pg_subscription_rel and
remove the entry if the corresponding table get synced.
Yes, that's what I mean by "Why can't we just update the hashtable based
on the catalog". And if we do that, then I don't understand why we
need both a hashtable and a linked list if we need to update both based on
catalog reads anyway.
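For concreteness, "update the hashtable based on the catalog" could be
sketched as a mark-and-sweep over a hash keyed by relid. The StateEntry
struct and variable names below are made up for illustration; the patch
posted later in this thread takes a similar approach.

    typedef struct StateEntry
    {
        Oid         relid;              /* hash key */
        TimestampTz last_start_time;    /* for restart rate limiting */
        bool        seen;               /* found in the latest catalog scan */
    } StateEntry;

    HASH_SEQ_STATUS hseq;
    StateEntry *entry;
    ListCell   *lc;
    bool        found;

    /* clear marks */
    hash_seq_init(&hseq, state_hash);
    while ((entry = hash_seq_search(&hseq)) != NULL)
        entry->seen = false;

    /* re-mark everything still not ready according to the catalog */
    foreach(lc, GetSubscriptionNotReadyRelations(MySubscription->oid))
    {
        SubscriptionRelState *rstate = (SubscriptionRelState *) lfirst(lc);

        entry = hash_search(state_hash, &rstate->relid, HASH_ENTER, &found);
        if (!found)
            entry->last_start_time = 0; /* new table, never launched */
        entry->seen = true;
    }

    /* sweep out tables that are now synced or no longer subscribed */
    hash_seq_init(&hseq, state_hash);
    while ((entry = hash_seq_search(&hseq)) != NULL)
        if (!entry->seen)
            hash_search(state_hash, &entry->relid, HASH_REMOVE, NULL);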
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Sun, Apr 16, 2017 at 06:08:41AM +0000, Noah Misch wrote:
On Thu, Apr 06, 2017 at 11:33:22AM +0900, Masahiko Sawada wrote:
While testing table sync worker for logical replication I noticed that
if the table sync worker of logical replication failed to insert the
data for whatever reason, the table sync worker process exits with
error. And then the main apply worker launches the table sync worker
again soon without interval. This routine is executed at very high
frequency without interval.

Should we do put a interval (wal_retrieve_interval or make a new GUC
parameter?) for launching the table sync worker?

[Action required within three days. This is a generic notification.]
The above-described topic is currently a PostgreSQL 10 open item. Peter,
since you committed the patch believed to have created it, you own this open
item. If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know. Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message. Include a date for your subsequent status update. Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10. Consequently, I will appreciate your efforts
toward speedy resolution. Thanks.

[1] /messages/by-id/20170404140717.GA2675809@tornado.leadboat.com
This PostgreSQL 10 open item is past due for your status update. Kindly send
a status update within 24 hours, and include a date for your subsequent status
update. Refer to the policy on open item ownership:
/messages/by-id/20170404140717.GA2675809@tornado.leadboat.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Apr 20, 2017 at 12:30 AM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:
On 19/04/17 15:57, Masahiko Sawada wrote:
On Wed, Apr 19, 2017 at 10:07 PM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:On 19/04/17 14:42, Masahiko Sawada wrote:
On Wed, Apr 19, 2017 at 5:12 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:At Tue, 18 Apr 2017 18:40:56 +0200, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote in <f64d87d1-bef3-5e3e-a999-ba302816a0ee@2ndquadrant.com>
On 18/04/17 18:14, Peter Eisentraut wrote:
On 4/18/17 11:59, Petr Jelinek wrote:
Hmm if we create hashtable for this, I'd say create hashtable for the
whole table_states then. The reason why it's list now was that it seemed
unnecessary to have hashtable when it will be empty almost always but
there is no need to have both hashtable + list IMHO.I understant that but I also don't like the frequent palloc/pfree
in long-lasting context and double loop like Peter.The difference is that we blow away the list of states when the catalog
changes, but we keep the hash table with the start times around. We
need two things with different life times.On the other hand, hash seems overdone. Addition to that, the
hash-version leaks stale entries while subscriptions are
modified. But vacuuming them costs high.Why can't we just update the hashtable based on the catalog? I mean once
the record is not needed in the list, the table has been synced so there
is no need for the timestamp either since we'll not try to start the
worker again.I guess the table sync worker for the same table could need to be
started again. For example, please image a case where the table
belonging to the publication is removed from it and the corresponding
subscription is refreshed, and then the table is added to it again. We
have the record of the table with timestamp in the hash table when the
table sync in the first time, but the table sync after refreshed could
have to wait for the interval.But why do we want to wait in such case where user has explicitly
requested refresh?Yeah, sorry, I meant that we don't want to wait but cannot launch the
tablesync worker in such case.But after more thought, the latest patch Peter proposed has the same
problem. Perhaps we need to scan always whole pg_subscription_rel and
remove the entry if the corresponding table get synced.Yes that's what I mean by "Why can't we just update the hashtable based
on the catalog". And if we do that then I don't understand why do we
need both hastable and linked list if we need to update both based on
catalog reads anyway.
Thanks, I've now understood correctly. Yes, I think you're right. If
we update the hash table based on the catalog whenever the table state is
invalidated, we don't need to have both a hash table and a list.
BTW, in current HEAD, SUBREL_STATE_SYNCWAIT is never stored in the
pg_subscription_rel catalog. So the following condition does not seem
correct. We should use "syncworker->relstate == SUBREL_STATE_SYNCWAIT"
instead?
/*
* There is a worker synchronizing the relation and waiting for
* apply to do something.
*/
if (syncworker && rstate->state == SUBREL_STATE_SYNCWAIT)
{
/*
* There are three possible synchronization situations here.
*
* a) Apply is in front of the table sync: We tell the table
* sync to CATCHUP.
*
* b) Apply is behind the table sync: We tell the table sync
* to mark the table as SYNCDONE and finish.
* c) Apply and table sync are at the same position: We tell
* table sync to mark the table as READY and finish.
*
* In any case we'll need to wait for table sync to change
* the state in catalog and only then continue ourselves.
*/
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 20/04/17 06:21, Masahiko Sawada wrote:
On Thu, Apr 20, 2017 at 12:30 AM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:On 19/04/17 15:57, Masahiko Sawada wrote:
On Wed, Apr 19, 2017 at 10:07 PM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:On 19/04/17 14:42, Masahiko Sawada wrote:
On Wed, Apr 19, 2017 at 5:12 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:At Tue, 18 Apr 2017 18:40:56 +0200, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote in <f64d87d1-bef3-5e3e-a999-ba302816a0ee@2ndquadrant.com>
On 18/04/17 18:14, Peter Eisentraut wrote:
On 4/18/17 11:59, Petr Jelinek wrote:
Hmm if we create hashtable for this, I'd say create hashtable for the
whole table_states then. The reason why it's list now was that it seemed
unnecessary to have hashtable when it will be empty almost always but
there is no need to have both hashtable + list IMHO.I understant that but I also don't like the frequent palloc/pfree
in long-lasting context and double loop like Peter.The difference is that we blow away the list of states when the catalog
changes, but we keep the hash table with the start times around. We
need two things with different life times.On the other hand, hash seems overdone. Addition to that, the
hash-version leaks stale entries while subscriptions are
modified. But vacuuming them costs high.Why can't we just update the hashtable based on the catalog? I mean once
the record is not needed in the list, the table has been synced so there
is no need for the timestamp either since we'll not try to start the
worker again.I guess the table sync worker for the same table could need to be
started again. For example, please image a case where the table
belonging to the publication is removed from it and the corresponding
subscription is refreshed, and then the table is added to it again. We
have the record of the table with timestamp in the hash table when the
table sync in the first time, but the table sync after refreshed could
have to wait for the interval.But why do we want to wait in such case where user has explicitly
requested refresh?Yeah, sorry, I meant that we don't want to wait but cannot launch the
tablesync worker in such case.But after more thought, the latest patch Peter proposed has the same
problem. Perhaps we need to scan always whole pg_subscription_rel and
remove the entry if the corresponding table get synced.Yes that's what I mean by "Why can't we just update the hashtable based
on the catalog". And if we do that then I don't understand why do we
need both hastable and linked list if we need to update both based on
catalog reads anyway.Thanks, I've now understood correctly. Yes, I think you're right. If
we update the hash table based on the catalog whenever table state is
invalidated, we don't need to have both hash table and list.BTW, in current HEAD the SUBREL_STATE_SYNCWAIT is not stored in the
pg_subscription_catalog. So the following condition seems not correct.
We should use "syncworker->relstate == SUBSCRIPTION_STATE_SYNCWAIT"
instead?/*
* There is a worker synchronizing the relation and waiting for
* apply to do something.
*/
if (syncworker && rstate->state == SUBREL_STATE_SYNCWAIT)
{
/*
* There are three possible synchronization situations here.
*
* a) Apply is in front of the table sync: We tell the table
* sync to CATCHUP.
*
* b) Apply is behind the table sync: We tell the table sync
* to mark the table as SYNCDONE and finish.* c) Apply and table sync are at the same position: We tell
* table sync to mark the table as READY and finish.
*
* In any case we'll need to wait for table sync to change
* the state in catalog and only then continue ourselves.
*/
Good catch. Although it's not the comment that's wrong, it's the if. We
should not compare rstate->state but syncworker->relstate.
The reason why SUBREL_STATE_SYNCWAIT is not stored in the catalog is that
we'd have to commit from the sync worker, which could leave the table in a
partially synchronized state on a crash, without the ability to resume
afterwards (since the slot on the other side will be gone).
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 4/19/17 23:02, Noah Misch wrote:
This PostgreSQL 10 open item is past due for your status update. Kindly send
a status update within 24 hours, and include a date for your subsequent status
update. Refer to the policy on open item ownership:
/messages/by-id/20170404140717.GA2675809@tornado.leadboat.com
This is currently on the back burner relative to other issues. Next
check-in from me by Monday.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Apr 20, 2017 at 8:43 PM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:
On 20/04/17 06:21, Masahiko Sawada wrote:
On Thu, Apr 20, 2017 at 12:30 AM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:On 19/04/17 15:57, Masahiko Sawada wrote:
On Wed, Apr 19, 2017 at 10:07 PM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:On 19/04/17 14:42, Masahiko Sawada wrote:
On Wed, Apr 19, 2017 at 5:12 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:At Tue, 18 Apr 2017 18:40:56 +0200, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote in <f64d87d1-bef3-5e3e-a999-ba302816a0ee@2ndquadrant.com>
On 18/04/17 18:14, Peter Eisentraut wrote:
On 4/18/17 11:59, Petr Jelinek wrote:
Hmm if we create hashtable for this, I'd say create hashtable for the
whole table_states then. The reason why it's list now was that it seemed
unnecessary to have hashtable when it will be empty almost always but
there is no need to have both hashtable + list IMHO.I understant that but I also don't like the frequent palloc/pfree
in long-lasting context and double loop like Peter.The difference is that we blow away the list of states when the catalog
changes, but we keep the hash table with the start times around. We
need two things with different life times.On the other hand, hash seems overdone. Addition to that, the
hash-version leaks stale entries while subscriptions are
modified. But vacuuming them costs high.Why can't we just update the hashtable based on the catalog? I mean once
the record is not needed in the list, the table has been synced so there
is no need for the timestamp either since we'll not try to start the
worker again.I guess the table sync worker for the same table could need to be
started again. For example, please image a case where the table
belonging to the publication is removed from it and the corresponding
subscription is refreshed, and then the table is added to it again. We
have the record of the table with timestamp in the hash table when the
table sync in the first time, but the table sync after refreshed could
have to wait for the interval.But why do we want to wait in such case where user has explicitly
requested refresh?Yeah, sorry, I meant that we don't want to wait but cannot launch the
tablesync worker in such case.But after more thought, the latest patch Peter proposed has the same
problem. Perhaps we need to scan always whole pg_subscription_rel and
remove the entry if the corresponding table get synced.Yes that's what I mean by "Why can't we just update the hashtable based
on the catalog". And if we do that then I don't understand why do we
need both hastable and linked list if we need to update both based on
catalog reads anyway.Thanks, I've now understood correctly. Yes, I think you're right. If
we update the hash table based on the catalog whenever table state is
invalidated, we don't need to have both hash table and list.BTW, in current HEAD the SUBREL_STATE_SYNCWAIT is not stored in the
pg_subscription_catalog. So the following condition seems not correct.
We should use "syncworker->relstate == SUBSCRIPTION_STATE_SYNCWAIT"
instead?/*
* There is a worker synchronizing the relation and waiting for
* apply to do something.
*/
if (syncworker && rstate->state == SUBREL_STATE_SYNCWAIT)
{
/*
* There are three possible synchronization situations here.
*
* a) Apply is in front of the table sync: We tell the table
* sync to CATCHUP.
*
* b) Apply is behind the table sync: We tell the table sync
* to mark the table as SYNCDONE and finish.* c) Apply and table sync are at the same position: We tell
* table sync to mark the table as READY and finish.
*
* In any case we'll need to wait for table sync to change
* the state in catalog and only then continue ourselves.
*/Good catch. Although it's not comment that's wrong, it's the if. We
should not compare rstate->state but syncworker->relstate.
I've attached a patch to fix this bug.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Attachments:
bug_fix.patch
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index b4b48d9..30da0fb 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -351,7 +351,7 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
* There is a worker synchronizing the relation and waiting for
* apply to do something.
*/
- if (syncworker && rstate->state == SUBREL_STATE_SYNCWAIT)
+ if (syncworker && syncworker->relstate == SUBREL_STATE_SYNCWAIT)
{
/*
* There are three possible synchronization situations here.
Hello,
At Thu, 20 Apr 2017 13:21:14 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoDrw0OaHE=oVRRhQX248kjJ7W+1ViM3K76aP46HnHJsnQ@mail.gmail.com>
On Thu, Apr 20, 2017 at 12:30 AM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:On 19/04/17 15:57, Masahiko Sawada wrote:
On Wed, Apr 19, 2017 at 10:07 PM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:On 19/04/17 14:42, Masahiko Sawada wrote:
On Wed, Apr 19, 2017 at 5:12 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:At Tue, 18 Apr 2017 18:40:56 +0200, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote in <f64d87d1-bef3-5e3e-a999-ba302816a0ee@2ndquadrant.com>
On 18/04/17 18:14, Peter Eisentraut wrote:
On 4/18/17 11:59, Petr Jelinek wrote:
Hmm if we create hashtable for this, I'd say create hashtable for the
whole table_states then. The reason why it's list now was that it seemed
unnecessary to have hashtable when it will be empty almost always but
there is no need to have both hashtable + list IMHO.I understant that but I also don't like the frequent palloc/pfree
in long-lasting context and double loop like Peter.The difference is that we blow away the list of states when the catalog
changes, but we keep the hash table with the start times around. We
need two things with different life times.On the other hand, hash seems overdone. Addition to that, the
hash-version leaks stale entries while subscriptions are
modified. But vacuuming them costs high.Why can't we just update the hashtable based on the catalog? I mean once
the record is not needed in the list, the table has been synced so there
is no need for the timestamp either since we'll not try to start the
worker again.I guess the table sync worker for the same table could need to be
started again. For example, please image a case where the table
belonging to the publication is removed from it and the corresponding
subscription is refreshed, and then the table is added to it again. We
have the record of the table with timestamp in the hash table when the
table sync in the first time, but the table sync after refreshed could
have to wait for the interval.But why do we want to wait in such case where user has explicitly
requested refresh?Yeah, sorry, I meant that we don't want to wait but cannot launch the
tablesync worker in such case.But after more thought, the latest patch Peter proposed has the same
problem. Perhaps we need to scan always whole pg_subscription_rel and
remove the entry if the corresponding table get synced.Yes that's what I mean by "Why can't we just update the hashtable based
on the catalog". And if we do that then I don't understand why do we
need both hastable and linked list if we need to update both based on
catalog reads anyway.Thanks, I've now understood correctly. Yes, I think you're right. If
we update the hash table based on the catalog whenever table state is
invalidated, we don't need to have both hash table and list.
Ah, ok. The patch from Peter is still generating and replacing the
content of the list. The attached patch stores everything in
SubscriptionRelState. Opposite to my anticipation, the hash can
effectively be kept small, and removed once it is no longer needed.
BTW, in current HEAD the SUBREL_STATE_SYNCWAIT is not stored in the
pg_subscription_catalog. So the following condition seems not correct.
We should use "syncworker->relstate == SUBSCRIPTION_STATE_SYNCWAIT"
instead?
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
0001-Wait-between-tablesync-worker-restarts_v2.patch
From b0ce46d4973465ba2ae7550da2295b31945a73b1 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Fri, 21 Apr 2017 17:26:09 +0900
Subject: [PATCH] Wait between tablesync worker restarts
Before restarting a tablesync worker for the same relation, wait
wal_retrieve_retry_interval. This avoids restarting failing workers in
a tight loop.
---
src/backend/catalog/pg_subscription.c | 31 +++++---
src/backend/replication/logical/tablesync.c | 106 +++++++++++++++++++---------
src/include/catalog/pg_subscription_rel.h | 6 +-
3 files changed, 97 insertions(+), 46 deletions(-)
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a183850..4300f10 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -31,11 +31,11 @@
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
+#include "utils/hsearch.h"
#include "utils/pg_lsn.h"
#include "utils/rel.h"
#include "utils/syscache.h"
-
static List *textarray_to_stringlist(ArrayType *textarray);
/*
@@ -450,17 +450,17 @@ GetSubscriptionRelations(Oid subid)
/*
* Get all relations for subscription that are not in a ready state.
*
- * Returned list is palloced in current memory context.
+ * Returns the number of live entries
*/
-List *
-GetSubscriptionNotReadyRelations(Oid subid)
+int
+GetSubscriptionNotReadyRelations(struct HTAB *localstate, Oid subid)
{
- List *res = NIL;
Relation rel;
HeapTuple tup;
int nkeys = 0;
ScanKeyData skey[2];
SysScanDesc scan;
+ int ret = 0;
rel = heap_open(SubscriptionRelRelationId, AccessShareLock);
@@ -481,20 +481,29 @@ GetSubscriptionNotReadyRelations(Oid subid)
{
Form_pg_subscription_rel subrel;
SubscriptionRelState *relstate;
+ bool found = false;
subrel = (Form_pg_subscription_rel) GETSTRUCT(tup);
- relstate = (SubscriptionRelState *)palloc(sizeof(SubscriptionRelState));
- relstate->relid = subrel->srrelid;
- relstate->state = subrel->srsubstate;
- relstate->lsn = subrel->srsublsn;
+ relstate = hash_search(localstate, (void *) &subrel->srrelid,
+ HASH_ENTER, &found);
- res = lappend(res, relstate);
+ if (!found) {
+ relstate->relid = subrel->srrelid;
+ relstate->state = subrel->srsubstate;
+ relstate->lsn = subrel->srsublsn;
+ relstate->last_worker_start = 0;
+ }
+
+ /* This is false here. Set as alive since found in the catalog. */
+ Assert(!relstate->alive);
+ relstate->alive = true;
+ ret++;
}
/* Cleanup */
systable_endscan(scan);
heap_close(rel, AccessShareLock);
- return res;
+ return ret;
}
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index d287e95..09c75a5 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -243,7 +243,10 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
*
* If there are tables that need synchronizing and are not being synchronized
* yet, start sync workers for them (if there are free slots for sync
- * workers).
+ * workers). To prevent starting the sync worker for the same relation at a
+ * high frequency after a failure, we store its last start time with each sync
+ * state info. We start the sync worker for the same relation after waiting
+ * at least wal_retrieve_retry_interval.
*
* For tables that are being synchronized already, check if sync workers
* either need action from the apply worker or have finished.
@@ -258,50 +261,77 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
* If the synchronization position is reached, then the table can be marked as
* READY and is no longer tracked.
*/
+
static void
process_syncing_tables_for_apply(XLogRecPtr current_lsn)
{
- static List *table_states = NIL;
- ListCell *lc;
+ static HTAB *subrel_local_state = NULL;
+ HASH_SEQ_STATUS hash_seq;
+ SubscriptionRelState *rstate;
+ bool local_state_updated = false;
+ int nrels = 0;
Assert(!IsTransactionState());
/* We need up to date sync state info for subscription tables here. */
if (!table_states_valid)
{
- MemoryContext oldctx;
- List *rstates;
- ListCell *lc;
- SubscriptionRelState *rstate;
-
- /* Clean the old list. */
- list_free_deep(table_states);
- table_states = NIL;
+ /* Create the local state hash if not exists */
+ if (!subrel_local_state)
+ {
+ HASHCTL ctl;
+
+ memset(&ctl, 0, sizeof(ctl));
+ ctl.keysize = sizeof(Oid);
+ ctl.entrysize = sizeof(SubscriptionRelState);
+ subrel_local_state =
+ hash_create("Logical replication table sync worker start times",
+ 256, &ctl, HASH_ELEM | HASH_BLOBS);
+ }
StartTransactionCommand();
- /* Fetch all non-ready tables. */
- rstates = GetSubscriptionNotReadyRelations(MySubscription->oid);
+ /* Update local state hash with non-ready tables. */
+ nrels = GetSubscriptionNotReadyRelations(subrel_local_state,
+ MySubscription->oid);
+ local_state_updated = true;
+ CommitTransactionCommand();
- /* Allocate the tracking info in a permanent memory context. */
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- foreach(lc, rstates)
+ table_states_valid = true;
+ }
+ /* Just count the entries if the local_state is not update this time */
+ else if (subrel_local_state != NULL)
+ nrels = hash_get_num_entries(subrel_local_state);
+
+ /* No relation to be synched */
+ if (nrels == 0)
+ {
+ if (subrel_local_state != NULL)
{
- rstate = palloc(sizeof(SubscriptionRelState));
- memcpy(rstate, lfirst(lc), sizeof(SubscriptionRelState));
- table_states = lappend(table_states, rstate);
+ hash_destroy(subrel_local_state);
+ subrel_local_state = NULL;
}
- MemoryContextSwitchTo(oldctx);
-
- CommitTransactionCommand();
- table_states_valid = true;
+ return;
}
- /* Process all tables that are being synchronized. */
- foreach(lc, table_states)
+ /* Process all tables that are to be synchronized. */
+ hash_seq_init(&hash_seq, subrel_local_state);
+
+ while ((rstate = hash_seq_search(&hash_seq)) != NULL)
{
- SubscriptionRelState *rstate = (SubscriptionRelState *)lfirst(lc);
+ /*
+ * Remove entries no longer required. If subrel_local_state is not
+ * updated above, this flag is ignored.
+ */
+ if (local_state_updated && !rstate->alive)
+ {
+ hash_search(subrel_local_state, &rstate->relid, HASH_REMOVE, NULL);
+ continue;
+ }
+
+ /* Reset preparing for the next update of this hash entry */
+ rstate->alive = false;
if (rstate->state == SUBREL_STATE_SYNCDONE)
{
@@ -395,17 +425,25 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
}
/*
- * If there is no sync worker registered for the table and
- * there is some free sync worker slot, start new sync worker
- * for the table.
+ * If there is no sync worker registered for the table and there
+ * is some free sync worker slot, start new sync worker for the
+ * table.
*/
else if (!syncworker && nsyncworkers < max_sync_workers_per_subscription)
{
- logicalrep_worker_launch(MyLogicalRepWorker->dbid,
- MySubscription->oid,
- MySubscription->name,
- MyLogicalRepWorker->userid,
- rstate->relid);
+ TimestampTz now = GetCurrentTimestamp();
+
+ /* Keep moderate intervals */
+ if (TimestampDifferenceExceeds(rstate->last_worker_start, now,
+ wal_retrieve_retry_interval))
+ {
+ logicalrep_worker_launch(MyLogicalRepWorker->dbid,
+ MySubscription->oid,
+ MySubscription->name,
+ MyLogicalRepWorker->userid,
+ rstate->relid);
+ rstate->last_worker_start = now;
+ }
}
}
}
diff --git a/src/include/catalog/pg_subscription_rel.h b/src/include/catalog/pg_subscription_rel.h
index 9f4f152..49c4723 100644
--- a/src/include/catalog/pg_subscription_rel.h
+++ b/src/include/catalog/pg_subscription_rel.h
@@ -66,8 +66,12 @@ typedef struct SubscriptionRelState
Oid relid;
XLogRecPtr lsn;
char state;
+ TimestampTz last_worker_start; /* time of the last launch of worker */
+ bool alive; /* this relid is alive in the catalog */
} SubscriptionRelState;
+struct HTAB;
+
extern Oid SetSubscriptionRelState(Oid subid, Oid relid, char state,
XLogRecPtr sublsn);
extern char GetSubscriptionRelState(Oid subid, Oid relid,
@@ -75,6 +79,6 @@ extern char GetSubscriptionRelState(Oid subid, Oid relid,
extern void RemoveSubscriptionRel(Oid subid, Oid relid);
extern List *GetSubscriptionRelations(Oid subid);
-extern List *GetSubscriptionNotReadyRelations(Oid subid);
+extern int GetSubscriptionNotReadyRelations(struct HTAB *localstate, Oid subid);
#endif /* PG_SUBSCRIPTION_REL_H */
--
2.9.2
On 21/04/17 04:38, Masahiko Sawada wrote:
On Thu, Apr 20, 2017 at 8:43 PM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:On 20/04/17 06:21, Masahiko Sawada wrote:
On Thu, Apr 20, 2017 at 12:30 AM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:On 19/04/17 15:57, Masahiko Sawada wrote:
On Wed, Apr 19, 2017 at 10:07 PM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:On 19/04/17 14:42, Masahiko Sawada wrote:
On Wed, Apr 19, 2017 at 5:12 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:At Tue, 18 Apr 2017 18:40:56 +0200, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote in <f64d87d1-bef3-5e3e-a999-ba302816a0ee@2ndquadrant.com>
On 18/04/17 18:14, Peter Eisentraut wrote:
On 4/18/17 11:59, Petr Jelinek wrote:
Hmm if we create hashtable for this, I'd say create hashtable for the
whole table_states then. The reason why it's list now was that it seemed
unnecessary to have hashtable when it will be empty almost always but
there is no need to have both hashtable + list IMHO.I understant that but I also don't like the frequent palloc/pfree
in long-lasting context and double loop like Peter.The difference is that we blow away the list of states when the catalog
changes, but we keep the hash table with the start times around. We
need two things with different life times.On the other hand, hash seems overdone. Addition to that, the
hash-version leaks stale entries while subscriptions are
modified. But vacuuming them costs high.Why can't we just update the hashtable based on the catalog? I mean once
the record is not needed in the list, the table has been synced so there
is no need for the timestamp either since we'll not try to start the
worker again.I guess the table sync worker for the same table could need to be
started again. For example, please image a case where the table
belonging to the publication is removed from it and the corresponding
subscription is refreshed, and then the table is added to it again. We
have the record of the table with timestamp in the hash table when the
table sync in the first time, but the table sync after refreshed could
have to wait for the interval.But why do we want to wait in such case where user has explicitly
requested refresh?Yeah, sorry, I meant that we don't want to wait but cannot launch the
tablesync worker in such case.But after more thought, the latest patch Peter proposed has the same
problem. Perhaps we need to scan always whole pg_subscription_rel and
remove the entry if the corresponding table get synced.Yes that's what I mean by "Why can't we just update the hashtable based
on the catalog". And if we do that then I don't understand why do we
need both hastable and linked list if we need to update both based on
catalog reads anyway.Thanks, I've now understood correctly. Yes, I think you're right. If
we update the hash table based on the catalog whenever table state is
invalidated, we don't need to have both hash table and list.BTW, in current HEAD the SUBREL_STATE_SYNCWAIT is not stored in the
pg_subscription_catalog. So the following condition seems not correct.
We should use "syncworker->relstate == SUBSCRIPTION_STATE_SYNCWAIT"
instead?/*
* There is a worker synchronizing the relation and waiting for
* apply to do something.
*/
if (syncworker && rstate->state == SUBREL_STATE_SYNCWAIT)
{
/*
* There are three possible synchronization situations here.
*
* a) Apply is in front of the table sync: We tell the table
* sync to CATCHUP.
*
* b) Apply is behind the table sync: We tell the table sync
* to mark the table as SYNCDONE and finish.* c) Apply and table sync are at the same position: We tell
* table sync to mark the table as READY and finish.
*
* In any case we'll need to wait for table sync to change
* the state in catalog and only then continue ourselves.
*/Good catch. Although it's not comment that's wrong, it's the if. We
should not compare rstate->state but syncworker->relstate.I've attached a patch to fix this bug.
Rereading the code again, it's actually not a bug, as we update the rstate
to what the syncworker says, but it's obviously confusing, so it's probably
still worth committing that.
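For reference, what makes it not a live bug is that, earlier in
process_syncing_tables_for_apply, the apply worker copies the sync
worker's state into its local rstate before this check. Roughly the
following, paraphrased from memory rather than quoted from the tree:

    if (syncworker)
    {
        /* refresh our local copy from the sync worker's shared state */
        SpinLockAcquire(&syncworker->relmutex);
        rstate->state = syncworker->relstate;
        rstate->lsn = syncworker->relstate_lsn;
        SpinLockRelease(&syncworker->relmutex);
    }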
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 21/04/17 10:33, Kyotaro HORIGUCHI wrote:
Hello,
At Thu, 20 Apr 2017 13:21:14 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoDrw0OaHE=oVRRhQX248kjJ7W+1ViM3K76aP46HnHJsnQ@mail.gmail.com>
On Thu, Apr 20, 2017 at 12:30 AM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:On 19/04/17 15:57, Masahiko Sawada wrote:
On Wed, Apr 19, 2017 at 10:07 PM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:On 19/04/17 14:42, Masahiko Sawada wrote:
On Wed, Apr 19, 2017 at 5:12 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:At Tue, 18 Apr 2017 18:40:56 +0200, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote in <f64d87d1-bef3-5e3e-a999-ba302816a0ee@2ndquadrant.com>
On 18/04/17 18:14, Peter Eisentraut wrote:
On 4/18/17 11:59, Petr Jelinek wrote:
Hmm if we create hashtable for this, I'd say create hashtable for the
whole table_states then. The reason why it's list now was that it seemed
unnecessary to have hashtable when it will be empty almost always but
there is no need to have both hashtable + list IMHO.I understant that but I also don't like the frequent palloc/pfree
in long-lasting context and double loop like Peter.The difference is that we blow away the list of states when the catalog
changes, but we keep the hash table with the start times around. We
need two things with different life times.On the other hand, hash seems overdone. Addition to that, the
hash-version leaks stale entries while subscriptions are
modified. But vacuuming them costs high.Why can't we just update the hashtable based on the catalog? I mean once
the record is not needed in the list, the table has been synced so there
is no need for the timestamp either since we'll not try to start the
worker again.I guess the table sync worker for the same table could need to be
started again. For example, please image a case where the table
belonging to the publication is removed from it and the corresponding
subscription is refreshed, and then the table is added to it again. We
have the record of the table with timestamp in the hash table when the
table sync in the first time, but the table sync after refreshed could
have to wait for the interval.But why do we want to wait in such case where user has explicitly
requested refresh?Yeah, sorry, I meant that we don't want to wait but cannot launch the
tablesync worker in such case.But after more thought, the latest patch Peter proposed has the same
problem. Perhaps we need to scan always whole pg_subscription_rel and
remove the entry if the corresponding table get synced.Yes that's what I mean by "Why can't we just update the hashtable based
on the catalog". And if we do that then I don't understand why do we
need both hastable and linked list if we need to update both based on
catalog reads anyway.Thanks, I've now understood correctly. Yes, I think you're right. If
we update the hash table based on the catalog whenever table state is
invalidated, we don't need to have both hash table and list.Ah, ok. The patch from Peter still generating and replacing the
content of the list. The attached patch stores everything into
SubscriptionRelState. Oppositte to my anticiation, the hash can
be efectively kept small and removed.
Yeah, this does not look bad in general. I have some comments about the
patch though:
+    if (!found)
+    {
+        relstate->relid = subrel->srrelid;
+        relstate->state = subrel->srsubstate;
+        relstate->lsn = subrel->srsublsn;
+        relstate->last_worker_start = 0;
+    }
+
I think we should document why this is done this way - I mean it's not
very obvious that it's okay to update only when the entry is not found, and why,
and even less obvious that it would in fact be wrong to update an already
existing entry.
I also think that in its current form
GetSubscriptionNotReadyRelations should be moved to tablesync.c, as it's
no longer a generically usable function but one whose behavior depends heavily
on how it's used in tablesync.c.
I also think we can't add the new fields to SubscriptionRelState
directly as that's used by catalog access functions as well. We need to
have some struct that's local to tablesync which contains
SubscriptionRelState as one field, and the two new fields probably.
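Something along these lines is what I have in mind (just a sketch, with placeholder names):

    /* hypothetical struct local to tablesync.c; names are placeholders */
    typedef struct TableSyncWorkerEntry
    {
        SubscriptionRelState rstate;       /* catalog state; rstate.relid is the hash key */
        TimestampTz last_worker_start;     /* time of the last sync worker launch */
    } TableSyncWorkerEntry;

That keeps SubscriptionRelState usable by the catalog access code as-is.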
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Apr 21, 2017 at 5:33 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
Hello,
At Thu, 20 Apr 2017 13:21:14 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoDrw0OaHE=oVRRhQX248kjJ7W+1ViM3K76aP46HnHJsnQ@mail.gmail.com>
On Thu, Apr 20, 2017 at 12:30 AM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:On 19/04/17 15:57, Masahiko Sawada wrote:
On Wed, Apr 19, 2017 at 10:07 PM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:On 19/04/17 14:42, Masahiko Sawada wrote:
On Wed, Apr 19, 2017 at 5:12 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:At Tue, 18 Apr 2017 18:40:56 +0200, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote in <f64d87d1-bef3-5e3e-a999-ba302816a0ee@2ndquadrant.com>
On 18/04/17 18:14, Peter Eisentraut wrote:
On 4/18/17 11:59, Petr Jelinek wrote:
Hmm if we create hashtable for this, I'd say create hashtable for the
whole table_states then. The reason why it's list now was that it seemed
unnecessary to have hashtable when it will be empty almost always but
there is no need to have both hashtable + list IMHO.I understant that but I also don't like the frequent palloc/pfree
in long-lasting context and double loop like Peter.The difference is that we blow away the list of states when the catalog
changes, but we keep the hash table with the start times around. We
need two things with different life times.On the other hand, hash seems overdone. Addition to that, the
hash-version leaks stale entries while subscriptions are
modified. But vacuuming them costs high.Why can't we just update the hashtable based on the catalog? I mean once
the record is not needed in the list, the table has been synced so there
is no need for the timestamp either since we'll not try to start the
worker again.I guess the table sync worker for the same table could need to be
started again. For example, please image a case where the table
belonging to the publication is removed from it and the corresponding
subscription is refreshed, and then the table is added to it again. We
have the record of the table with timestamp in the hash table when the
table sync in the first time, but the table sync after refreshed could
have to wait for the interval.But why do we want to wait in such case where user has explicitly
requested refresh?Yeah, sorry, I meant that we don't want to wait but cannot launch the
tablesync worker in such case.But after more thought, the latest patch Peter proposed has the same
problem. Perhaps we need to scan always whole pg_subscription_rel and
remove the entry if the corresponding table get synced.Yes that's what I mean by "Why can't we just update the hashtable based
on the catalog". And if we do that then I don't understand why do we
need both hastable and linked list if we need to update both based on
catalog reads anyway.Thanks, I've now understood correctly. Yes, I think you're right. If
we update the hash table based on the catalog whenever table state is
invalidated, we don't need to have both hash table and list.Ah, ok. The patch from Peter still generating and replacing the
content of the list. The attached patch stores everything into
SubscriptionRelState. Oppositte to my anticiation, the hash can
be efectively kept small and removed.
Thank you for the patch!
Actually, I also bumped into the same situation, where we got the
following error during hash_seq_search. I guess we cannot commit a
transaction during a hash_seq_search scan, but the sequential scan loop in
process_syncing_tables_for_apply could attempt to do that.
2017-04-21 21:35:22.587 JST [7508] WARNING: leaked hash_seq_search
scan for hash table 0x1f54980
2017-04-21 21:35:22.587 JST [7508] ERROR: no hash_seq_search scan for
hash table "Logical replication table sync worker start times"
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Fri, Apr 21, 2017 at 11:19 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Apr 21, 2017 at 5:33 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:Hello,
At Thu, 20 Apr 2017 13:21:14 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoDrw0OaHE=oVRRhQX248kjJ7W+1ViM3K76aP46HnHJsnQ@mail.gmail.com>
On Thu, Apr 20, 2017 at 12:30 AM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:On 19/04/17 15:57, Masahiko Sawada wrote:
On Wed, Apr 19, 2017 at 10:07 PM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:On 19/04/17 14:42, Masahiko Sawada wrote:
On Wed, Apr 19, 2017 at 5:12 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:At Tue, 18 Apr 2017 18:40:56 +0200, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote in <f64d87d1-bef3-5e3e-a999-ba302816a0ee@2ndquadrant.com>
On 18/04/17 18:14, Peter Eisentraut wrote:
On 4/18/17 11:59, Petr Jelinek wrote:
Hmm if we create hashtable for this, I'd say create hashtable for the
whole table_states then. The reason why it's list now was that it seemed
unnecessary to have hashtable when it will be empty almost always but
there is no need to have both hashtable + list IMHO.I understant that but I also don't like the frequent palloc/pfree
in long-lasting context and double loop like Peter.The difference is that we blow away the list of states when the catalog
changes, but we keep the hash table with the start times around. We
need two things with different life times.On the other hand, hash seems overdone. Addition to that, the
hash-version leaks stale entries while subscriptions are
modified. But vacuuming them costs high.Why can't we just update the hashtable based on the catalog? I mean once
the record is not needed in the list, the table has been synced so there
is no need for the timestamp either since we'll not try to start the
worker again.I guess the table sync worker for the same table could need to be
started again. For example, please image a case where the table
belonging to the publication is removed from it and the corresponding
subscription is refreshed, and then the table is added to it again. We
have the record of the table with timestamp in the hash table when the
table sync in the first time, but the table sync after refreshed could
have to wait for the interval.But why do we want to wait in such case where user has explicitly
requested refresh?Yeah, sorry, I meant that we don't want to wait but cannot launch the
tablesync worker in such case.But after more thought, the latest patch Peter proposed has the same
problem. Perhaps we need to scan always whole pg_subscription_rel and
remove the entry if the corresponding table get synced.Yes that's what I mean by "Why can't we just update the hashtable based
on the catalog". And if we do that then I don't understand why do we
need both hastable and linked list if we need to update both based on
catalog reads anyway.Thanks, I've now understood correctly. Yes, I think you're right. If
we update the hash table based on the catalog whenever table state is
invalidated, we don't need to have both hash table and list.Ah, ok. The patch from Peter still generating and replacing the
content of the list. The attached patch stores everything into
SubscriptionRelState. Oppositte to my anticiation, the hash can
be efectively kept small and removed.Thank you for the patch!
Actually, I also bumped into the same the situation where we got the
following error during hash_seq_search. I guess we cannot commit a
transaction during hash_seq_scan but the sequential scan loop in
process_syncing_tables_for_apply could attempt to do that.
So I guess we should commit the status change to SUBREL_STATE_READY
only after the hash_seq_search scan has finished.
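For example, something along these lines (only a rough sketch; the hash name
is made up and the actual catalog update is left as a comment):

    HASH_SEQ_STATUS seq;
    SubscriptionRelState *rstate;
    List       *ready_rels = NIL;
    ListCell   *lc;

    hash_seq_init(&seq, table_states_hash);
    while ((rstate = hash_seq_search(&seq)) != NULL)
    {
        if (rstate->state == SUBREL_STATE_SYNCDONE)
            ready_rels = lappend_oid(ready_rels, rstate->relid);
    }

    /* the scan has ended, so committing a transaction is safe again */
    foreach(lc, ready_rels)
    {
        StartTransactionCommand();
        /* set SUBREL_STATE_READY for lfirst_oid(lc) in pg_subscription_rel */
        CommitTransactionCommand();
    }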
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Hello,
At Sun, 23 Apr 2017 00:51:52 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoApBU+nz7FYaWX6gjyB9r6WWrTujbL4rrZiocHLc=pEbA@mail.gmail.com>
On Fri, Apr 21, 2017 at 11:19 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Apr 21, 2017 at 5:33 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:Hello,
At Thu, 20 Apr 2017 13:21:14 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoDrw0OaHE=oVRRhQX248kjJ7W+1ViM3K76aP46HnHJsnQ@mail.gmail.com>
On Thu, Apr 20, 2017 at 12:30 AM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:On 19/04/17 15:57, Masahiko Sawada wrote:
On Wed, Apr 19, 2017 at 10:07 PM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:
Yeah, sorry, I meant that we don't want to wait but cannot launch the
tablesync worker in such case.But after more thought, the latest patch Peter proposed has the same
problem. Perhaps we need to scan always whole pg_subscription_rel and
remove the entry if the corresponding table get synced.Yes that's what I mean by "Why can't we just update the hashtable based
on the catalog". And if we do that then I don't understand why do we
need both hastable and linked list if we need to update both based on
catalog reads anyway.Thanks, I've now understood correctly. Yes, I think you're right. If
we update the hash table based on the catalog whenever table state is
invalidated, we don't need to have both hash table and list.Ah, ok. The patch from Peter still generating and replacing the
content of the list. The attached patch stores everything into
SubscriptionRelState. Oppositte to my anticiation, the hash can
be efectively kept small and removed.Thank you for the patch!
Actually, I also bumped into the same the situation where we got the
following error during hash_seq_search. I guess we cannot commit a
transaction during hash_seq_scan but the sequential scan loop in
process_syncing_tables_for_apply could attempt to do that.
Ah, thanks. I forgot that I had seen the same error once. The hash has
nothing to do with any transaction. The scan now runs after
freezing the hash.
Petr:
I think we should document why this is done this way - I mean it's not
very obvious that it's okay to update just when not found and why and
even less obvious that it would be in fact wrong to update already
existing entry.
It was not prudently designed. I have changed it there; it seems more
reasonable now, maybe.
Petr:
I also think that in it's current form the
GetSubscriptionNotReadyRelations should be moved to tablesync.c
Removed it from pg_subscription.c and moved it to tablesync.c.
Petr:
I also think we can't add the new fields to
SubscriptionRelState directly as that's used by catalog access
functions as well.
Yeah, that was my laziness. A new struct SubscriptionWorkerState is
added in tablesync.c, and the interval logic runs on it.
At Sun, 23 Apr 2017 00:51:52 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoApBU+nz7FYaWX6gjyB9r6WWrTujbL4rrZiocHLc=pEbA@mail.gmail.com>
So I guess we should commit the changing status to SUBREL_STATE_READY
after finished hash_seq_scan.
We could do that as well, but I thought that a hash could be explicitly
"unfrozen" without side effects. The first attached patch adds that
function to dynahash.c. I don't think that just allowing an
unregistered scan on an unfrozen hash is worse than this.
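To be concrete, the calling pattern the attached 0002 patch aims at is roughly
this sketch (hash_unfreeze exists only with the attached 0001 patch applied):

    hash_freeze(subrel_local_state);      /* no insertions while scanning */

    hash_seq_init(&hash_seq, subrel_local_state);
    while ((wstate = hash_seq_search(&hash_seq)) != NULL)
    {
        /* ... may StartTransactionCommand()/CommitTransactionCommand() ... */
    }

    hash_unfreeze(subrel_local_state);    /* re-enable insertions */

Since the hash is frozen, the scan is not registered and survives the
transaction boundaries inside the loop.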
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
0001-Add-unfrozen-feature-to-dynahash.patch (text/x-patch; charset=us-ascii)
From dcd8ad43f3dceac4018bfb0819d37190a1fd8ef8 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Mon, 24 Apr 2017 16:06:10 +0900
Subject: [PATCH 1/2] Add unfrozen feature to dynahash
Scans on unfrozen hash cannot live beyond a transaction lifetime. A
hash can be frozen to allow modify-free sequential scan which can live
beyond transactions, but there's no means to release that status. This
patch adds a new function hash_unfreeze to allow us to reset frozen
hashes to insertable state.
---
src/backend/utils/hash/dynahash.c | 20 ++++++++++++++++++++
src/include/utils/hsearch.h | 1 +
2 files changed, 21 insertions(+)
diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index 12b1658..073e2bc 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -1334,6 +1334,9 @@ hash_get_num_entries(HTAB *hashp)
* wherein it is inconvenient to track whether a scan is still open, and
* there's no possibility of further insertions after readout has begun.
*
+ * NOTE: it is possible to unfreeze a frozen hash. All running sequential
+ * scans must be abandoned by the scanners' responsibility.
+ *
* NOTE: to use this with a partitioned hashtable, caller had better hold
* at least shared lock on all partitions of the table throughout the scan!
* We can cope with insertions or deletions by our own backend, but *not*
@@ -1456,6 +1459,23 @@ hash_freeze(HTAB *hashp)
hashp->frozen = true;
}
+/*
+ * hash_unfreeze
+ * Reset the freeze state of a hashtable
+ *
+ * This re-enables insertions into the hash. Sequential scans that were
+ * started while it was frozen must not be continued after this.
+ */
+void
+hash_unfreeze(HTAB *hashp)
+{
+ Assert (!hashp->isshared);
+ if (!hashp->frozen)
+ elog(ERROR, "this hash hashtable \"%s\" is not frozen",
+ hashp->tabname);
+ hashp->frozen = false;
+}
+
/********************************* UTILITIES ************************/
diff --git a/src/include/utils/hsearch.h b/src/include/utils/hsearch.h
index 7964087..60920d4 100644
--- a/src/include/utils/hsearch.h
+++ b/src/include/utils/hsearch.h
@@ -136,6 +136,7 @@ extern void hash_seq_init(HASH_SEQ_STATUS *status, HTAB *hashp);
extern void *hash_seq_search(HASH_SEQ_STATUS *status);
extern void hash_seq_term(HASH_SEQ_STATUS *status);
extern void hash_freeze(HTAB *hashp);
+extern void hash_unfreeze(HTAB *hashp);
extern Size hash_estimate_size(long num_entries, Size entrysize);
extern long hash_select_dirsize(long num_entries);
extern Size hash_get_shared_size(HASHCTL *info, int flags);
--
2.9.2
0002-Wait-between-tablesync-worker-restarts.patch (text/x-patch; charset=us-ascii)
From b1c86c37d442f9a90330df2f47a0680fa478ccde Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Mon, 24 Apr 2017 13:39:17 +0900
Subject: [PATCH 2/2] Wait between tablesync worker restarts
Before restarting a tablesync worker for the same relation, wait
wal_retrieve_retry_interval. This avoids restarting failing workers in
a tight loop.
---
src/backend/catalog/pg_subscription.c | 52 -------
src/backend/replication/logical/tablesync.c | 205 +++++++++++++++++++++++-----
src/include/catalog/pg_subscription_rel.h | 1 -
3 files changed, 172 insertions(+), 86 deletions(-)
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a183850..ff1ed62 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -446,55 +446,3 @@ GetSubscriptionRelations(Oid subid)
return res;
}
-
-/*
- * Get all relations for subscription that are not in a ready state.
- *
- * Returned list is palloced in current memory context.
- */
-List *
-GetSubscriptionNotReadyRelations(Oid subid)
-{
- List *res = NIL;
- Relation rel;
- HeapTuple tup;
- int nkeys = 0;
- ScanKeyData skey[2];
- SysScanDesc scan;
-
- rel = heap_open(SubscriptionRelRelationId, AccessShareLock);
-
- ScanKeyInit(&skey[nkeys++],
- Anum_pg_subscription_rel_srsubid,
- BTEqualStrategyNumber, F_OIDEQ,
- ObjectIdGetDatum(subid));
-
- ScanKeyInit(&skey[nkeys++],
- Anum_pg_subscription_rel_srsubstate,
- BTEqualStrategyNumber, F_CHARNE,
- CharGetDatum(SUBREL_STATE_READY));
-
- scan = systable_beginscan(rel, InvalidOid, false,
- NULL, nkeys, skey);
-
- while (HeapTupleIsValid(tup = systable_getnext(scan)))
- {
- Form_pg_subscription_rel subrel;
- SubscriptionRelState *relstate;
-
- subrel = (Form_pg_subscription_rel) GETSTRUCT(tup);
-
- relstate = (SubscriptionRelState *)palloc(sizeof(SubscriptionRelState));
- relstate->relid = subrel->srrelid;
- relstate->state = subrel->srsubstate;
- relstate->lsn = subrel->srsublsn;
-
- res = lappend(res, relstate);
- }
-
- /* Cleanup */
- systable_endscan(scan);
- heap_close(rel, AccessShareLock);
-
- return res;
-}
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index b4b48d9..e10b258 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -103,9 +103,18 @@
#include "storage/ipc.h"
#include "utils/builtins.h"
+#include "utils/fmgroids.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+/* Struct to track subscription worker startup */
+typedef struct SubscriptionWorkerState
+{
+ SubscriptionRelState rs; /* key is rs.relid */
+ TimestampTz last_worker_start; /* time of the last launch of worker */
+ bool alive; /* this relid is alive in the catalog */
+} SubscriptionWorkerState;
+
static bool table_states_valid = false;
StringInfo copybuf = NULL;
@@ -237,6 +246,80 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
}
/*
+ * Get all relations for subscription that are not in a ready state.
+ *
+ * Updates the given hash and returns the number of live entries.
+ *
+ * Live entries are the hash entries that a corresponding tuple is found in
+ * pg_subscription_rel.
+ *
+ */
+static int
+GetSubscriptionNotReadyRelations(struct HTAB *localstate, Oid subid)
+{
+ Relation rel;
+ HeapTuple tup;
+ int nkeys = 0;
+ ScanKeyData skey[2];
+ SysScanDesc scan;
+ int ret = 0;
+
+ rel = heap_open(SubscriptionRelRelationId, AccessShareLock);
+
+ ScanKeyInit(&skey[nkeys++],
+ Anum_pg_subscription_rel_srsubid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(subid));
+
+ ScanKeyInit(&skey[nkeys++],
+ Anum_pg_subscription_rel_srsubstate,
+ BTEqualStrategyNumber, F_CHARNE,
+ CharGetDatum(SUBREL_STATE_READY));
+
+ scan = systable_beginscan(rel, InvalidOid, false,
+ NULL, nkeys, skey);
+
+ /* Update entries in the given hash */
+ while (HeapTupleIsValid(tup = systable_getnext(scan)))
+ {
+ Form_pg_subscription_rel subrel;
+ SubscriptionWorkerState *wstate;
+ bool found = false;
+
+ subrel = (Form_pg_subscription_rel) GETSTRUCT(tup);
+
+ wstate = hash_search(localstate, (void *) &subrel->srrelid,
+ HASH_ENTER, &found);
+
+ /*
+ * rs.relid is the key of this entry, which won't be changed. The rest
+ * fields should be initialized only at the first time.
+ */
+ if (!found)
+ {
+ wstate->rs.relid = subrel->srrelid;
+ wstate->last_worker_start = 0;
+ wstate->alive = false;
+ }
+
+ /* These two fields should track the status in catalog */
+ wstate->rs.lsn = subrel->srsublsn;
+ wstate->rs.state = subrel->srsubstate;
+
+ /* Set as alive since found in the catalog. */
+ Assert(!wstate->alive);
+ wstate->alive = true;
+ ret++;
+ }
+
+ /* Cleanup */
+ systable_endscan(scan);
+ heap_close(rel, AccessShareLock);
+
+ return ret;
+}
+
+/*
* Handle table synchronization cooperation from the apply worker.
*
* Walk over all subscription tables that are individually tracked by the
@@ -245,7 +328,10 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
*
* If there are tables that need synchronizing and are not being synchronized
* yet, start sync workers for them (if there are free slots for sync
- * workers).
+ * workers). To prevent starting the sync worker for the same relation at a
+ * high frequency after a failure, we store its last start time with each sync
+ * state info. We start the sync worker for the same relation after waiting
+ * at least wal_retrieve_retry_interval.
*
* For tables that are being synchronized already, check if sync workers
* either need action from the apply worker or have finished.
@@ -263,47 +349,87 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
static void
process_syncing_tables_for_apply(XLogRecPtr current_lsn)
{
- static List *table_states = NIL;
- ListCell *lc;
+ static HTAB *subrel_local_state = NULL;
+ HASH_SEQ_STATUS hash_seq;
+ SubscriptionWorkerState *wstate;
+ bool local_state_updated = false;
+ int nrels = 0;
Assert(!IsTransactionState());
/* We need up to date sync state info for subscription tables here. */
if (!table_states_valid)
{
- MemoryContext oldctx;
- List *rstates;
- ListCell *lc;
- SubscriptionRelState *rstate;
-
- /* Clean the old list. */
- list_free_deep(table_states);
- table_states = NIL;
+ /* Create the local state hash if not exists */
+ if (!subrel_local_state)
+ {
+ HASHCTL ctl;
+
+ memset(&ctl, 0, sizeof(ctl));
+ ctl.keysize = sizeof(Oid);
+ ctl.entrysize = sizeof(SubscriptionWorkerState);
+ subrel_local_state =
+ hash_create("Logical replication table sync worker state",
+ 256, &ctl, HASH_ELEM | HASH_BLOBS);
+ }
StartTransactionCommand();
- /* Fetch all non-ready tables. */
- rstates = GetSubscriptionNotReadyRelations(MySubscription->oid);
+ /* Update local state hash with non-ready tables. */
+ nrels = GetSubscriptionNotReadyRelations(subrel_local_state,
+ MySubscription->oid);
+ local_state_updated = true;
+ CommitTransactionCommand();
- /* Allocate the tracking info in a permanent memory context. */
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- foreach(lc, rstates)
+ table_states_valid = true;
+ }
+ /* Just count the entries if the local state was not updated this time */
+ else if (subrel_local_state != NULL)
+ nrels = hash_get_num_entries(subrel_local_state);
+
+ /* No relation to be synched */
+ if (nrels == 0)
+ {
+ if (subrel_local_state != NULL)
{
- rstate = palloc(sizeof(SubscriptionRelState));
- memcpy(rstate, lfirst(lc), sizeof(SubscriptionRelState));
- table_states = lappend(table_states, rstate);
+ hash_destroy(subrel_local_state);
+ subrel_local_state = NULL;
}
- MemoryContextSwitchTo(oldctx);
- CommitTransactionCommand();
-
- table_states_valid = true;
+ return;
}
- /* Process all tables that are being synchronized. */
- foreach(lc, table_states)
+ /*
+ * Freeze hash table. This guarantees that the hash won't be broken for
+ * the scan below. Unfreezing on error leavs the hash freezed but any
+ * errors within this loop blows away the worker process involving the
+ * hash.
+ */
+ hash_freeze(subrel_local_state);
+
+ /* Process all tables that are to be synchronized. */
+ hash_seq_init(&hash_seq, subrel_local_state);
+
+ while ((wstate = hash_seq_search(&hash_seq)) != NULL)
{
- SubscriptionRelState *rstate = (SubscriptionRelState *)lfirst(lc);
+ SubscriptionRelState *rstate;
+
+ /*
+ * Remove entries no longer necessary. The flag signals nothing if
+ * subrel_local_state is not updated above. We can remove entries in
+ * frozen hash safely.
+ */
+ if (local_state_updated && !wstate->alive)
+ {
+ hash_search(subrel_local_state, &wstate->rs.relid,
+ HASH_REMOVE, NULL);
+ continue;
+ }
+
+ rstate = &wstate->rs;
+
+ /* This will be true in the next rurn only for live entries */
+ wstate->alive = false;
if (rstate->state == SUBREL_STATE_SYNCDONE)
{
@@ -344,7 +470,8 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
* workers for this subscription, while we have the lock, for
* later.
*/
- nsyncworkers = logicalrep_sync_worker_count(MyLogicalRepWorker->subid);
+ nsyncworkers =
+ logicalrep_sync_worker_count(MyLogicalRepWorker->subid);
LWLockRelease(LogicalRepWorkerLock);
/*
@@ -401,16 +528,28 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
* there is some free sync worker slot, start new sync worker
* for the table.
*/
- else if (!syncworker && nsyncworkers < max_sync_workers_per_subscription)
+ else if (!syncworker &&
+ nsyncworkers < max_sync_workers_per_subscription)
{
- logicalrep_worker_launch(MyLogicalRepWorker->dbid,
- MySubscription->oid,
- MySubscription->name,
- MyLogicalRepWorker->userid,
- rstate->relid);
+ TimestampTz now = GetCurrentTimestamp();
+
+ /* Keep moderate intervals from the previous launch */
+ if (TimestampDifferenceExceeds(wstate->last_worker_start, now,
+ wal_retrieve_retry_interval))
+ {
+ logicalrep_worker_launch(MyLogicalRepWorker->dbid,
+ MySubscription->oid,
+ MySubscription->name,
+ MyLogicalRepWorker->userid,
+ rstate->relid);
+ wstate->last_worker_start = now;
+ }
}
}
}
+
+ /* sequential scan ended. Allow insertions again. */
+ hash_unfreeze(subrel_local_state);
}
/*
diff --git a/src/include/catalog/pg_subscription_rel.h b/src/include/catalog/pg_subscription_rel.h
index 9f4f152..8f7337c 100644
--- a/src/include/catalog/pg_subscription_rel.h
+++ b/src/include/catalog/pg_subscription_rel.h
@@ -75,6 +75,5 @@ extern char GetSubscriptionRelState(Oid subid, Oid relid,
extern void RemoveSubscriptionRel(Oid subid, Oid relid);
extern List *GetSubscriptionRelations(Oid subid);
-extern List *GetSubscriptionNotReadyRelations(Oid subid);
#endif /* PG_SUBSCRIPTION_REL_H */
--
2.9.2
On Mon, Apr 24, 2017 at 4:41 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
Hello,
At Sun, 23 Apr 2017 00:51:52 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoApBU+nz7FYaWX6gjyB9r6WWrTujbL4rrZiocHLc=pEbA@mail.gmail.com>
On Fri, Apr 21, 2017 at 11:19 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Apr 21, 2017 at 5:33 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:Hello,
At Thu, 20 Apr 2017 13:21:14 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoDrw0OaHE=oVRRhQX248kjJ7W+1ViM3K76aP46HnHJsnQ@mail.gmail.com>
On Thu, Apr 20, 2017 at 12:30 AM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:On 19/04/17 15:57, Masahiko Sawada wrote:
On Wed, Apr 19, 2017 at 10:07 PM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:
Yeah, sorry, I meant that we don't want to wait but cannot launch the
tablesync worker in such case.But after more thought, the latest patch Peter proposed has the same
problem. Perhaps we need to scan always whole pg_subscription_rel and
remove the entry if the corresponding table get synced.Yes that's what I mean by "Why can't we just update the hashtable based
on the catalog". And if we do that then I don't understand why do we
need both hastable and linked list if we need to update both based on
catalog reads anyway.Thanks, I've now understood correctly. Yes, I think you're right. If
we update the hash table based on the catalog whenever table state is
invalidated, we don't need to have both hash table and list.Ah, ok. The patch from Peter still generating and replacing the
content of the list. The attached patch stores everything into
SubscriptionRelState. Oppositte to my anticiation, the hash can
be efectively kept small and removed.Thank you for the patch!
Actually, I also bumped into the same the situation where we got the
following error during hash_seq_search. I guess we cannot commit a
transaction during hash_seq_scan but the sequential scan loop in
process_syncing_tables_for_apply could attempt to do that.Ah. Thanks. I forgot that I saw the same error once. The hash has
nothing to do with any transactions. The scan now runs after
freezing of the hash.Petr:
I think we should document why this is done this way - I mean it's not
very obvious that it's okay to update just when not found and why and
even less obvious that it would be in fact wrong to update already
existing entry.It is not prudently designed. I changed there.. seemingly more
reasonable, maybe.Petr:
I also think that in it's current form the
GetSubscriptionNotReadyRelations should be moved to tablesync.cRemoved then moved to tablesync.c.
Petr:
I also think we can't add the new fields to
SubscriptionRelState directly as that's used by catalog access
functions as well.Yeah, it was my lazyness. A new struct SubscriptionWorkerState is
added in tablesync.c and the interval stuff runs on it.At Sun, 23 Apr 2017 00:51:52 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoApBU+nz7FYaWX6gjyB9r6WWrTujbL4rrZiocHLc=pEbA@mail.gmail.com>
So I guess we should commit the changing status to SUBREL_STATE_READY
after finished hash_seq_scan.We could so either, but I thought that a hash can explicitly
"unfreeze" without a side effect. The attached first one adds the
function in dynahash.c. I don't think that just allowing
unregsiterd scan on unfrozen hash is worse that this.
Thank you for updating the patch.
+ elog(ERROR, "this hash hashtable \"%s\" is not frozen",
+ hashp->tabname);
Maybe the error message should be "this hashtable \"%s\" is not frozen".
+ /*
+ * Freeze hash table. This guarantees that the hash won't be broken for
+ * the scan below. Unfreezing on error leavs the hash freezed but any
+ * errors within this loop blows away the worker process involving the
+ * hash.
+ */
s/leavs/leaves/
s/freezed/frozen/
+ /* This will be true in the next rurn only for live entries */
+ wstate->alive = false;
s/rurn/run/
+ /*
+ * Remove entries no longer necessary. The flag signals nothing if
+ * subrel_local_state is not updated above. We can remove entries in
+ * frozen hash safely.
+ */
+ if (local_state_updated && !wstate->alive)
+ {
+ hash_search(subrel_local_state, &wstate->rs.relid,
+ HASH_REMOVE, NULL);
+ continue;
+ }
IIUC, since the apply worker can change the status from
SUBREL_STATE_SYNCWAIT to SUBREL_STATE_READY inside a hash_seq_search loop,
the entry for a table that the apply worker changed to SUBREL_STATE_READY
before subrel_local_state was updated could remain
in the hash table. I think that we should scan pg_subscription_rel
using only the "subid" condition.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On 24/04/17 17:52, Masahiko Sawada wrote:
On Mon, Apr 24, 2017 at 4:41 PM, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
+    /*
+     * Remove entries no longer necessary. The flag signals nothing if
+     * subrel_local_state is not updated above. We can remove entries in
+     * frozen hash safely.
+     */
+    if (local_state_updated && !wstate->alive)
+    {
+        hash_search(subrel_local_state, &wstate->rs.relid,
+                    HASH_REMOVE, NULL);
+        continue;
+    }
IIUC since the apply worker can change the status from
SUBREL_STATE_SYNCWAIT to SUBREL_STATE_READY in a hash_seq_search loop
the table sync worker which is changed to SUBREL_STATE_READY by the
apply worker before updating the subrel_local_state could be remained
in the hash table. I think that we should scan pg_subscription_rel by
using only a condition "subid".
I don't follow this.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Apr 25, 2017 at 1:42 AM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:
On 24/04/17 17:52, Masahiko Sawada wrote:
On Mon, Apr 24, 2017 at 4:41 PM, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
+    /*
+     * Remove entries no longer necessary. The flag signals nothing if
+     * subrel_local_state is not updated above. We can remove entries in
+     * frozen hash safely.
+     */
+    if (local_state_updated && !wstate->alive)
+    {
+        hash_search(subrel_local_state, &wstate->rs.relid,
+                    HASH_REMOVE, NULL);
+        continue;
+    }
IIUC since the apply worker can change the status from
SUBREL_STATE_SYNCWAIT to SUBREL_STATE_READY in a hash_seq_search loop
the table sync worker which is changed to SUBREL_STATE_READY by the
apply worker before updating the subrel_local_state could be remained
in the hash table. I think that we should scan pg_subscription_rel by
using only a condition "subid".I don't follow this.
Hmm, I'd misunderstood something. It should work fine. Sorry for the noise.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Hello,
At Tue, 25 Apr 2017 00:52:09 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoCHPfBqKRiFSMmZSXM5qbB5rD_HZMrniL4E19T2r8WpkA@mail.gmail.com>
+ elog(ERROR, "this hash hashtable \"%s\" is not frozen", + hashp->tabname);Maybe the error message should be "this hashtable \"%s\" is not frozen".
Both of "hashtable" and "hash table" appear there but hash_freeze
uses the former. I followed that.
s/leavs/leaves/
s/freezed/frozen/
s/rurn/run/
Thanks! But the "rurn" was a typo of "turn".
On Tue, Apr 25, 2017 at 1:42 AM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:On 24/04/17 17:52, Masahiko Sawada wrote:
On Mon, Apr 24, 2017 at 4:41 PM, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
+    /*
+     * Remove entries no longer necessary. The flag signals nothing if
+     * subrel_local_state is not updated above. We can remove entries in
+     * frozen hash safely.
+     */
+    if (local_state_updated && !wstate->alive)
+    {
+        hash_search(subrel_local_state, &wstate->rs.relid,
+                    HASH_REMOVE, NULL);
+        continue;
+    }
IIUC since the apply worker can change the status from
SUBREL_STATE_SYNCWAIT to SUBREL_STATE_READY in a hash_seq_search loop
the table sync worker which is changed to SUBREL_STATE_READY by the
apply worker before updating the subrel_local_state could be remained
in the hash table.
Every time after pg_subscription_rel is updated, the hash entries
are marked alive only when the corresponding not-ready relations are
found in pg_subscription_rel. If any live entry remains, nrels
becomes a positive number and the dead entries are removed in the
loop just after. If no entry survives, the hash is
destroyed immediately. Some dead entries can survive under
certain conditions, but one of the above will happen shortly.
If it is hard to understand, I should perhaps add some additional
comments.
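In outline, the life cycle is the following (simplified from the attached
0002 patch; see the real code below):

    /* on catalog invalidation, rescan pg_subscription_rel */
    nrels = GetSubscriptionNotReadyRelations(subrel_local_state,
                                             MySubscription->oid);
    /* entries found in the catalog get wstate->alive = true */

    if (nrels == 0)
    {
        hash_destroy(subrel_local_state);   /* nothing survived: drop the whole hash */
        subrel_local_state = NULL;
        return;
    }

    hash_freeze(subrel_local_state);
    hash_seq_init(&hash_seq, subrel_local_state);
    while ((wstate = hash_seq_search(&hash_seq)) != NULL)
    {
        if (local_state_updated && !wstate->alive)
        {
            /* dead entry: no non-ready row for it in the catalog any more */
            hash_search(subrel_local_state, &wstate->rs.relid,
                        HASH_REMOVE, NULL);
            continue;
        }
        wstate->alive = false;              /* revived only by the next catalog scan */
        /* ... launch or coordinate the sync worker for wstate->rs ... */
    }
    hash_unfreeze(subrel_local_state);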
I think that we should scan pg_subscription_rel by
using only a condition "subid".I don't follow this.
Hmm, I'd misunderstood something. It should work fine. Sorry for the noise.
Anyway, the typo-fixed version is attached.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
0001-Add-unfrozen-feature-to-dynahash.patch (text/x-patch; charset=us-ascii)
From c79c2c47a325e7b03e217cfed9cd4ac50ec22423 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Mon, 24 Apr 2017 16:06:10 +0900
Subject: [PATCH 1/2] Add unfrozen feature to dynahash
Scans on unfrozen hash cannot live beyond a transaction lifetime. A
hash can be frozen to allow modify-free sequential scan which can live
beyond transactions, but there's no means to release that status. This
patch adds a new function hash_unfreeze to allow us to reset frozen
hashes to insertable state.
---
src/backend/utils/hash/dynahash.c | 20 ++++++++++++++++++++
src/include/utils/hsearch.h | 1 +
2 files changed, 21 insertions(+)
diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index 12b1658..e253ab2 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -1334,6 +1334,9 @@ hash_get_num_entries(HTAB *hashp)
* wherein it is inconvenient to track whether a scan is still open, and
* there's no possibility of further insertions after readout has begun.
*
+ * NOTE: it is possible to unfreeze a frozen hash. All running sequential
+ * scans must be abandoned by the scanners' responsibility.
+ *
* NOTE: to use this with a partitioned hashtable, caller had better hold
* at least shared lock on all partitions of the table throughout the scan!
* We can cope with insertions or deletions by our own backend, but *not*
@@ -1456,6 +1459,23 @@ hash_freeze(HTAB *hashp)
hashp->frozen = true;
}
+/*
+ * hash_unfreeze
+ * Reset the freeze state of a hashtable
+ *
+ * This re-enables insertions into the hash. Sequential scans that were
+ * started while it was frozen must not be continued after this.
+ */
+void
+hash_unfreeze(HTAB *hashp)
+{
+ Assert (!hashp->isshared);
+ if (!hashp->frozen)
+ elog(ERROR, "this hashtable \"%s\" is not frozen",
+ hashp->tabname);
+ hashp->frozen = false;
+}
+
/********************************* UTILITIES ************************/
diff --git a/src/include/utils/hsearch.h b/src/include/utils/hsearch.h
index 7964087..60920d4 100644
--- a/src/include/utils/hsearch.h
+++ b/src/include/utils/hsearch.h
@@ -136,6 +136,7 @@ extern void hash_seq_init(HASH_SEQ_STATUS *status, HTAB *hashp);
extern void *hash_seq_search(HASH_SEQ_STATUS *status);
extern void hash_seq_term(HASH_SEQ_STATUS *status);
extern void hash_freeze(HTAB *hashp);
+extern void hash_unfreeze(HTAB *hashp);
extern Size hash_estimate_size(long num_entries, Size entrysize);
extern long hash_select_dirsize(long num_entries);
extern Size hash_get_shared_size(HASHCTL *info, int flags);
--
2.9.2
0002-Wait-between-tablesync-worker-restarts.patch (text/x-patch; charset=us-ascii)
From a422d7b936eb50c885e5480bc21c7bae636a559a Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Mon, 24 Apr 2017 13:39:17 +0900
Subject: [PATCH 2/2] Wait between tablesync worker restarts
Before restarting a tablesync worker for the same relation, wait
wal_retrieve_retry_interval. This avoids restarting failing workers in
a tight loop.
---
src/backend/catalog/pg_subscription.c | 52 -------
src/backend/replication/logical/tablesync.c | 205 +++++++++++++++++++++++-----
src/include/catalog/pg_subscription_rel.h | 1 -
3 files changed, 172 insertions(+), 86 deletions(-)
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index a183850..ff1ed62 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -446,55 +446,3 @@ GetSubscriptionRelations(Oid subid)
return res;
}
-
-/*
- * Get all relations for subscription that are not in a ready state.
- *
- * Returned list is palloced in current memory context.
- */
-List *
-GetSubscriptionNotReadyRelations(Oid subid)
-{
- List *res = NIL;
- Relation rel;
- HeapTuple tup;
- int nkeys = 0;
- ScanKeyData skey[2];
- SysScanDesc scan;
-
- rel = heap_open(SubscriptionRelRelationId, AccessShareLock);
-
- ScanKeyInit(&skey[nkeys++],
- Anum_pg_subscription_rel_srsubid,
- BTEqualStrategyNumber, F_OIDEQ,
- ObjectIdGetDatum(subid));
-
- ScanKeyInit(&skey[nkeys++],
- Anum_pg_subscription_rel_srsubstate,
- BTEqualStrategyNumber, F_CHARNE,
- CharGetDatum(SUBREL_STATE_READY));
-
- scan = systable_beginscan(rel, InvalidOid, false,
- NULL, nkeys, skey);
-
- while (HeapTupleIsValid(tup = systable_getnext(scan)))
- {
- Form_pg_subscription_rel subrel;
- SubscriptionRelState *relstate;
-
- subrel = (Form_pg_subscription_rel) GETSTRUCT(tup);
-
- relstate = (SubscriptionRelState *)palloc(sizeof(SubscriptionRelState));
- relstate->relid = subrel->srrelid;
- relstate->state = subrel->srsubstate;
- relstate->lsn = subrel->srsublsn;
-
- res = lappend(res, relstate);
- }
-
- /* Cleanup */
- systable_endscan(scan);
- heap_close(rel, AccessShareLock);
-
- return res;
-}
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index b4b48d9..f1c2d71 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -103,9 +103,18 @@
#include "storage/ipc.h"
#include "utils/builtins.h"
+#include "utils/fmgroids.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+/* Struct to track subscription worker startup */
+typedef struct SubscriptionWorkerState
+{
+ SubscriptionRelState rs; /* key is rs.relid */
+ TimestampTz last_worker_start; /* time of the last launch of worker */
+ bool alive; /* this relid is alive in the catalog */
+} SubscriptionWorkerState;
+
static bool table_states_valid = false;
StringInfo copybuf = NULL;
@@ -237,6 +246,80 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
}
/*
+ * Get all relations for subscription that are not in a ready state.
+ *
+ * Updates the given hash and returns the number of live entries.
+ *
+ * Live entries are the hash entries that a corresponding tuple is found in
+ * pg_subscription_rel.
+ *
+ */
+static int
+GetSubscriptionNotReadyRelations(struct HTAB *localstate, Oid subid)
+{
+ Relation rel;
+ HeapTuple tup;
+ int nkeys = 0;
+ ScanKeyData skey[2];
+ SysScanDesc scan;
+ int ret = 0;
+
+ rel = heap_open(SubscriptionRelRelationId, AccessShareLock);
+
+ ScanKeyInit(&skey[nkeys++],
+ Anum_pg_subscription_rel_srsubid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(subid));
+
+ ScanKeyInit(&skey[nkeys++],
+ Anum_pg_subscription_rel_srsubstate,
+ BTEqualStrategyNumber, F_CHARNE,
+ CharGetDatum(SUBREL_STATE_READY));
+
+ scan = systable_beginscan(rel, InvalidOid, false,
+ NULL, nkeys, skey);
+
+ /* Update entries in the given hash */
+ while (HeapTupleIsValid(tup = systable_getnext(scan)))
+ {
+ Form_pg_subscription_rel subrel;
+ SubscriptionWorkerState *wstate;
+ bool found = false;
+
+ subrel = (Form_pg_subscription_rel) GETSTRUCT(tup);
+
+ wstate = hash_search(localstate, (void *) &subrel->srrelid,
+ HASH_ENTER, &found);
+
+ /*
+ * rs.relid is the key of this entry, which won't be changed. The rest
+ * fields should be initialized only at the first time.
+ */
+ if (!found)
+ {
+ wstate->rs.relid = subrel->srrelid;
+ wstate->last_worker_start = 0;
+ wstate->alive = false;
+ }
+
+ /* These two fields should track the status in catalog */
+ wstate->rs.lsn = subrel->srsublsn;
+ wstate->rs.state = subrel->srsubstate;
+
+ /* Set as alive since found in the catalog. */
+ Assert(!wstate->alive);
+ wstate->alive = true;
+ ret++;
+ }
+
+ /* Cleanup */
+ systable_endscan(scan);
+ heap_close(rel, AccessShareLock);
+
+ return ret;
+}
+
+/*
* Handle table synchronization cooperation from the apply worker.
*
* Walk over all subscription tables that are individually tracked by the
@@ -245,7 +328,10 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
*
* If there are tables that need synchronizing and are not being synchronized
* yet, start sync workers for them (if there are free slots for sync
- * workers).
+ * workers). To prevent starting the sync worker for the same relation at a
+ * high frequency after a failure, we store its last start time with each sync
+ * state info. We start the sync worker for the same relation after waiting
+ * at least wal_retrieve_retry_interval.
*
* For tables that are being synchronized already, check if sync workers
* either need action from the apply worker or have finished.
@@ -263,47 +349,87 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
static void
process_syncing_tables_for_apply(XLogRecPtr current_lsn)
{
- static List *table_states = NIL;
- ListCell *lc;
+ static HTAB *subrel_local_state = NULL;
+ HASH_SEQ_STATUS hash_seq;
+ SubscriptionWorkerState *wstate;
+ bool local_state_updated = false;
+ int nrels = 0;
Assert(!IsTransactionState());
/* We need up to date sync state info for subscription tables here. */
if (!table_states_valid)
{
- MemoryContext oldctx;
- List *rstates;
- ListCell *lc;
- SubscriptionRelState *rstate;
-
- /* Clean the old list. */
- list_free_deep(table_states);
- table_states = NIL;
+ /* Create the local state hash if not exists */
+ if (!subrel_local_state)
+ {
+ HASHCTL ctl;
+
+ memset(&ctl, 0, sizeof(ctl));
+ ctl.keysize = sizeof(Oid);
+ ctl.entrysize = sizeof(SubscriptionWorkerState);
+ subrel_local_state =
+ hash_create("Logical replication table sync worker state",
+ 256, &ctl, HASH_ELEM | HASH_BLOBS);
+ }
StartTransactionCommand();
- /* Fetch all non-ready tables. */
- rstates = GetSubscriptionNotReadyRelations(MySubscription->oid);
+ /* Update local state hash with non-ready tables. */
+ nrels = GetSubscriptionNotReadyRelations(subrel_local_state,
+ MySubscription->oid);
+ local_state_updated = true;
+ CommitTransactionCommand();
- /* Allocate the tracking info in a permanent memory context. */
- oldctx = MemoryContextSwitchTo(CacheMemoryContext);
- foreach(lc, rstates)
+ table_states_valid = true;
+ }
+ /* Just count the entries if the local state was not updated this time */
+ else if (subrel_local_state != NULL)
+ nrels = hash_get_num_entries(subrel_local_state);
+
+ /* No relation to be synched */
+ if (nrels == 0)
+ {
+ if (subrel_local_state != NULL)
{
- rstate = palloc(sizeof(SubscriptionRelState));
- memcpy(rstate, lfirst(lc), sizeof(SubscriptionRelState));
- table_states = lappend(table_states, rstate);
+ hash_destroy(subrel_local_state);
+ subrel_local_state = NULL;
}
- MemoryContextSwitchTo(oldctx);
- CommitTransactionCommand();
-
- table_states_valid = true;
+ return;
}
- /* Process all tables that are being synchronized. */
- foreach(lc, table_states)
+ /*
+ * Freeze hash table. This guarantees that the hash won't be broken for
+ * the scan below. Unfreezing on error leaves the hash frozen but any
+ * errors within this loop blows away the worker process involving the
+ * hash.
+ */
+ hash_freeze(subrel_local_state);
+
+ /* Process all tables that are to be synchronized. */
+ hash_seq_init(&hash_seq, subrel_local_state);
+
+ while ((wstate = hash_seq_search(&hash_seq)) != NULL)
{
- SubscriptionRelState *rstate = (SubscriptionRelState *)lfirst(lc);
+ SubscriptionRelState *rstate;
+
+ /*
+ * Remove entries no longer necessary. The flag signals nothing if
+ * subrel_local_state is not updated above. We can remove entries in
+ * frozen hash safely.
+ */
+ if (local_state_updated && !wstate->alive)
+ {
+ hash_search(subrel_local_state, &wstate->rs.relid,
+ HASH_REMOVE, NULL);
+ continue;
+ }
+
+ rstate = &wstate->rs;
+
+ /* This will be true in the next turn only for live entries */
+ wstate->alive = false;
if (rstate->state == SUBREL_STATE_SYNCDONE)
{
@@ -344,7 +470,8 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
* workers for this subscription, while we have the lock, for
* later.
*/
- nsyncworkers = logicalrep_sync_worker_count(MyLogicalRepWorker->subid);
+ nsyncworkers =
+ logicalrep_sync_worker_count(MyLogicalRepWorker->subid);
LWLockRelease(LogicalRepWorkerLock);
/*
@@ -401,16 +528,28 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
* there is some free sync worker slot, start new sync worker
* for the table.
*/
- else if (!syncworker && nsyncworkers < max_sync_workers_per_subscription)
+ else if (!syncworker &&
+ nsyncworkers < max_sync_workers_per_subscription)
{
- logicalrep_worker_launch(MyLogicalRepWorker->dbid,
- MySubscription->oid,
- MySubscription->name,
- MyLogicalRepWorker->userid,
- rstate->relid);
+ TimestampTz now = GetCurrentTimestamp();
+
+ /* Keep moderate intervals from the previous launch */
+ if (TimestampDifferenceExceeds(wstate->last_worker_start, now,
+ wal_retrieve_retry_interval))
+ {
+ logicalrep_worker_launch(MyLogicalRepWorker->dbid,
+ MySubscription->oid,
+ MySubscription->name,
+ MyLogicalRepWorker->userid,
+ rstate->relid);
+ wstate->last_worker_start = now;
+ }
}
}
}
+
+ /* sequential scan ended. Allow insertions again. */
+ hash_unfreeze(subrel_local_state);
}
/*
diff --git a/src/include/catalog/pg_subscription_rel.h b/src/include/catalog/pg_subscription_rel.h
index 9f4f152..8f7337c 100644
--- a/src/include/catalog/pg_subscription_rel.h
+++ b/src/include/catalog/pg_subscription_rel.h
@@ -75,6 +75,5 @@ extern char GetSubscriptionRelState(Oid subid, Oid relid,
extern void RemoveSubscriptionRel(Oid subid, Oid relid);
extern List *GetSubscriptionRelations(Oid subid);
-extern List *GetSubscriptionNotReadyRelations(Oid subid);
#endif /* PG_SUBSCRIPTION_REL_H */
--
2.9.2
On 4/21/17 09:59, Petr Jelinek wrote:
Rereading the code again, it's actually not bug as we update the rstate
to what syncworker says, but it's obviously confusing so probably still
worth to commit that.
We don't have the syncworker->relmutex at that point, so it's probably
better to read the previously-copied value in rstate->state.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
I feel it's getting a bit late for reworkings of this extent, also
considering the marginal nature of the problem we are trying to fix. My
patch from April 18 is very localized and gets the job done.
I think this is still a good direction to investigate, but if we have to
extend the hash table API to get it done, this might not be the best time.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 4/20/17 15:36, Peter Eisentraut wrote:
On 4/19/17 23:02, Noah Misch wrote:
This PostgreSQL 10 open item is past due for your status update. Kindly send
a status update within 24 hours, and include a date for your subsequent status
update. Refer to the policy on open item ownership:
/messages/by-id/20170404140717.GA2675809@tornado.leadboat.com
This is currently on the back burner relative to other issues. Next
check-in from me by Monday.
Update: I think there are some light disagreements about whether to go
for a simple localized patch or a more extensive rework. If the latter
camp doesn't convince me, I'll commit the former solution by Friday.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 25/04/17 19:54, Peter Eisentraut wrote:
I feel it's getting a bit late for reworkings of this extent, also
considering the marginal nature of the problem we are trying to fix. My
patch from April 18 is very localized and gets the job done.I think this is still a good direction to investigate, but if we have to
extend the hash table API to get it done, this might not be the best time.
Yeah, the hash API change needed is a bummer at this stage.
One thing I am missing in your patch, however, is cleanup of entries for
relations that have finished sync. I wonder if it would be enough to just
destroy the hash when we get to an empty list.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 4/27/17 06:47, Petr Jelinek wrote:
One thing I am missing in your patch however is cleanup of entries for
relations that finished sync. I wonder if it would be enough to just
destroy the hash when we get to empty list.
I had omitted that because the amount of memory "leaked" is not much,
but I guess it wouldn't hurt to clean it up.
How about the attached?
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
v2-0001-Wait-between-tablesync-worker-restarts.patch (invalid/octet-stream)
From 13f510d26387fdf91e98451461882b4c005fac13 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter_e@gmx.net>
Date: Thu, 27 Apr 2017 14:57:26 -0400
Subject: [PATCH v2] Wait between tablesync worker restarts
Before restarting a tablesync worker for the same relation, wait
wal_retrieve_retry_interval (currently 5s by default). This avoids
restarting failing workers in a tight loop.
We keep the last start times in a hash table last_start_times that is
separate from the table_states list, because that list is cleared out on
syscache invalidation, which happens whenever a table finishes syncing.
The hash table is kept until all tables have finished syncing.
A future project might be to unify these two and keep everything in one
data structure, but for now this is a less invasive change to accomplish
the original purpose.
For the test suite, set wal_retrieve_retry_interval to its minimum
value, to not increase the test suite run time.
Reported-by: Masahiko Sawada <sawada.mshk@gmail.com>
---
src/backend/replication/logical/tablesync.c | 58 ++++++++++++++++++++++++++---
src/test/subscription/t/004_sync.pl | 1 +
2 files changed, 53 insertions(+), 6 deletions(-)
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index e63d26b0bc..0823000f00 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -245,7 +245,10 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
*
* If there are tables that need synchronizing and are not being synchronized
* yet, start sync workers for them (if there are free slots for sync
- * workers).
+ * workers). To prevent starting the sync worker for the same relation at a
+ * high frequency after a failure, we store its last start time with each sync
+ * state info. We start the sync worker for the same relation after waiting
+ * at least wal_retrieve_retry_interval.
*
* For tables that are being synchronized already, check if sync workers
* either need action from the apply worker or have finished.
@@ -263,7 +266,13 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
static void
process_syncing_tables_for_apply(XLogRecPtr current_lsn)
{
+ struct tablesync_start_time_mapping
+ {
+ Oid relid;
+ TimestampTz last_start_time;
+ };
static List *table_states = NIL;
+ static HTAB *last_start_times = NULL;
ListCell *lc;
Assert(!IsTransactionState());
@@ -300,6 +309,31 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
table_states_valid = true;
}
+ /*
+ * Prepare hash table for tracking last start times of workers, to avoid
+ * immediate restarts. We don't need it if there are no tables that need
+ * syncing.
+ */
+ if (table_states && !last_start_times)
+ {
+ HASHCTL ctl;
+
+ memset(&ctl, 0, sizeof(ctl));
+ ctl.keysize = sizeof(Oid);
+ ctl.entrysize = sizeof(struct tablesync_start_time_mapping);
+ last_start_times = hash_create("Logical replication table sync worker start times",
+ 256, &ctl, HASH_ELEM | HASH_BLOBS);
+ }
+ /*
+ * Clean up the hash table when we're done with all tables (just to
+ * release the bit of memory).
+ */
+ else if (!table_states && last_start_times)
+ {
+ hash_destroy(last_start_times);
+ last_start_times = NULL;
+ }
+
/* Process all tables that are being synchronized. */
foreach(lc, table_states)
{
@@ -403,11 +437,23 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
*/
else if (!syncworker && nsyncworkers < max_sync_workers_per_subscription)
{
- logicalrep_worker_launch(MyLogicalRepWorker->dbid,
- MySubscription->oid,
- MySubscription->name,
- MyLogicalRepWorker->userid,
- rstate->relid);
+ TimestampTz now = GetCurrentTimestamp();
+ struct tablesync_start_time_mapping *hentry;
+ bool found;
+
+ hentry = hash_search(last_start_times, &rstate->relid, HASH_ENTER, &found);
+
+ if (!found ||
+ TimestampDifferenceExceeds(hentry->last_start_time, now,
+ wal_retrieve_retry_interval))
+ {
+ logicalrep_worker_launch(MyLogicalRepWorker->dbid,
+ MySubscription->oid,
+ MySubscription->name,
+ MyLogicalRepWorker->userid,
+ rstate->relid);
+ hentry->last_start_time = now;
+ }
}
}
}
diff --git a/src/test/subscription/t/004_sync.pl b/src/test/subscription/t/004_sync.pl
index 87541a0e6e..fa0bf7f49f 100644
--- a/src/test/subscription/t/004_sync.pl
+++ b/src/test/subscription/t/004_sync.pl
@@ -13,6 +13,7 @@
# Create subscriber node
my $node_subscriber = get_new_node('subscriber');
$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->append_conf('postgresql.conf', "wal_retrieve_retry_interval = 1ms");
$node_subscriber->start;
# Create some preexisting content on publisher
--
2.12.2
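The throttle in this patch reuses the existing wal_retrieve_retry_interval GUC (default 5s) rather than introducing a new parameter. As a purely illustrative example (the value below is arbitrary, not a recommendation), a subscriber that wants failing table sync workers relaunched less aggressively could raise it in postgresql.conf:

# postgresql.conf on the subscriber -- example setting only
# With the patch above, a sync worker for a given relation is relaunched
# at most once per wal_retrieve_retry_interval after it exits with an error.
wal_retrieve_retry_interval = 10s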
On 27/04/17 21:00, Peter Eisentraut wrote:
On 4/27/17 06:47, Petr Jelinek wrote:
One thing I am missing in your patch however is cleanup of entries for
relations that finished sync. I wonder if it would be enough to just
destroy the hash when we get to empty list.
I had omitted that because the amount of memory "leaked" is not much,
but I guess it wouldn't hurt to clean it up. How about the attached?
Yes that's more or less what I meant. All good.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Apr 28, 2017 at 4:00 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
On 4/27/17 06:47, Petr Jelinek wrote:
One thing I am missing in your patch however is cleanup of entries for
relations that finished sync. I wonder if it would be enough to just
destroy the hash when we get to empty list.
I had omitted that because the amount of memory "leaked" is not much,
but I guess it wouldn't hurt to clean it up. How about the attached?
Thank you for updating the patch!
+ /*
+ * Clean up the hash table when we're done with all tables (just to
+ * release the bit of memory).
+ */
+ else if (!table_states && last_start_times)
+ {
Isn't it better to use != NIL instead as follows?
else if (table_state != NIL && last_start_times)
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
At Fri, 28 Apr 2017 10:20:48 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoBY9UvS9QLrmaENGBGfQKOfGkGaLm=uYH24gmf-6CAoiw@mail.gmail.com>
On Fri, Apr 28, 2017 at 4:00 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
On 4/27/17 06:47, Petr Jelinek wrote:
One thing I am missing in your patch however is cleanup of entries for
relations that finished sync. I wonder if it would be enough to just
destroy the hash when we get to empty list.
I had omitted that because the amount of memory "leaked" is not much,
but I guess it wouldn't hurt to clean it up. How about the attached?
This seems reasonable enough.
Thank you for updating the patch!
+ /*
+ * Clean up the hash table when we're done with all tables (just to
+ * release the bit of memory).
+ */
+ else if (!table_states && last_start_times)
+ {
Isn't it better to use != NIL instead as follows?
else if (table_state != NIL && last_start_times)
Definitely! But maybe it should be the reverse condition.
- if (table_states && !last_start_times)
+ if (table_states != NIL && !last_start_times)
===
- else if (!table_states && last_start_times)
+ else if (table_states == NIL && last_start_times)
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Fri, Apr 28, 2017 at 5:26 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
At Fri, 28 Apr 2017 10:20:48 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoBY9UvS9QLrmaENGBGfQKOfGkGaLm=uYH24gmf-6CAoiw@mail.gmail.com>
On Fri, Apr 28, 2017 at 4:00 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
On 4/27/17 06:47, Petr Jelinek wrote:
One thing I am missing in your patch however is cleanup of entries for
relations that finished sync. I wonder if it would be enough to just
destroy the hash when we get to empty list.
I had omitted that because the amount of memory "leaked" is not much,
but I guess it wouldn't hurt to clean it up. How about the attached?
This seems reasonable enough.
Thank you for updating the patch!
+ /*
+ * Clean up the hash table when we're done with all tables (just to
+ * release the bit of memory).
+ */
+ else if (!table_states && last_start_times)
+ {
Isn't it better to use != NIL instead as follows?
else if (table_state != NIL && last_start_times)
Definitely! But maybe it should be the reverse condition.
Yeah, that's right.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On 4/27/17 15:33, Petr Jelinek wrote:
On 27/04/17 21:00, Peter Eisentraut wrote:
On 4/27/17 06:47, Petr Jelinek wrote:
One thing I am missing in your patch however is cleanup of entries for
relations that finished sync. I wonder if it would be enough to just
destroy the hash when we get to empty list.
I had omitted that because the amount of memory "leaked" is not much,
but I guess it wouldn't hurt to clean it up. How about the attached?
Yes that's more or less what I meant. All good.
committed
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 4/27/17 21:20, Masahiko Sawada wrote:
Isn't it better to use != NIL instead as follows?
else if (table_state != NIL && last_start_times)
I'm not a fan of that in general, and it doesn't really add any clarity
here.
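For context, a minimal sketch (the function name below is made up for illustration, not taken from the patch): NIL is defined in src/include/nodes/pg_list.h as ((List *) NULL), so both spellings test exactly the same condition and the choice is purely a matter of style.

#include "postgres.h"
#include "nodes/pg_list.h"

/* Hypothetical example only: both tests are equivalent for a List pointer. */
static void
example_style(List *table_states)
{
    if (table_states)                /* implicit NULL test, as committed */
        elog(DEBUG1, "tables still need syncing");

    if (table_states != NIL)         /* explicit comparison, as suggested in review */
        elog(DEBUG1, "tables still need syncing");
}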
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services