pg_basebackup throttling doesn't throttle as promised
The "-r" option to pg_basebackup is supposed to throttle the rate of the
backup. But it only works properly if the server is mostly idle.
Every non-trivial call to XLogFlush or XLogBackgroundFlush will wake up the
wal sender (the one which is not really sending wal, but base files), and
the throttling routine doesn't go back to sleep after being awoke early.
Rather, it releases another 32kb of data.
Should the basebackup.c throttle sleep in a loop until its time has
expired?
Or should walsender.c WalSndWakeup not wake a wal sender whose status
is WALSNDSTATE_BACKUP?
Or both?
Cheers,
Jeff
On Fri, Sep 1, 2017 at 1:32 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
The "-r" option to pg_basebackup is supposed to throttle the rate of the
backup. But it only works properly if the server is mostly idle.Every non-trivial call to XLogFlush or XLogBackgroundFlush will wake up
the wal sender (the one which is not really sending wal, but base files),
and the throttling routine doesn't go back to sleep after being awoke
early. Rather, it releases another 32kb of data.Should the basebackup.c throttle sleep in a loop until its time has
expired?Or should walsender.c WalSndWakeup not wake a wal sender whose status
is WALSNDSTATE_BACKUP?Or both?
I'm attaching a patch for each option. Each one independently solves the
problem. But I think we should do both. There is no point in issuing
unnecessary kill system calls, and there may also be more spurious wake-ups
than just these ones.
Cheers,
Jeff
Attachments:
pg_basebackup_throttle_1_v1.patchapplication/octet-stream; name=pg_basebackup_throttle_1_v1.patchDownload
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
new file mode 100644
index 9776858..3657a08
*** a/src/backend/replication/basebackup.c
--- b/src/backend/replication/basebackup.c
*************** throttle(size_t increment)
*** 1348,1361 ****
if (throttling_counter < throttling_sample)
return;
! /* Time elapsed since the last measurement (and possible wake up). */
! elapsed = GetCurrentTimestamp() - throttled_last;
! /* How much should have elapsed at minimum? */
! elapsed_min = elapsed_min_unit * (throttling_counter / throttling_sample);
! sleep = elapsed_min - elapsed;
! /* Only sleep if the transfer is faster than it should be. */
! if (sleep > 0)
{
ResetLatch(MyLatch);
/* We're eating a potentially set latch, so check for interrupts */
--- 1348,1364 ----
if (throttling_counter < throttling_sample)
return;
! for (;;)
{
+ /* Time elapsed since the last measurement (and possible wake up). */
+ elapsed = GetCurrentTimestamp() - throttled_last;
+ /* How much should have elapsed at minimum? */
+ elapsed_min = elapsed_min_unit * (throttling_counter / throttling_sample);
+ sleep = elapsed_min - elapsed;
+ /* Only sleep if the transfer is faster than it should be. */
+ if (sleep <= 0)
+ break;
+
ResetLatch(MyLatch);
/* We're eating a potentially set latch, so check for interrupts */
pg_basebackup_throttle_2_v1.patchapplication/octet-stream; name=pg_basebackup_throttle_2_v1.patchDownload
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
new file mode 100644
index db346e6..db01d06
*** a/src/backend/replication/walsender.c
--- b/src/backend/replication/walsender.c
*************** WalSndWakeup(void)
*** 3008,3013 ****
--- 3008,3014 ----
for (i = 0; i < max_wal_senders; i++)
{
Latch *latch;
+ WalSndState state;
WalSnd *walsnd = &WalSndCtl->walsnds[i];
/*
*************** WalSndWakeup(void)
*** 3016,3024 ****
*/
SpinLockAcquire(&walsnd->mutex);
latch = walsnd->latch;
SpinLockRelease(&walsnd->mutex);
! if (latch != NULL)
SetLatch(latch);
}
}
--- 3017,3026 ----
*/
SpinLockAcquire(&walsnd->mutex);
latch = walsnd->latch;
+ state = walsnd->state;
SpinLockRelease(&walsnd->mutex);
! if (latch != NULL && state != WALSNDSTATE_BACKUP)
SetLatch(latch);
}
}
Jeff Janes <jeff.janes@gmail.com> wrote:
On Fri, Sep 1, 2017 at 1:32 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
The "-r" option to pg_basebackup is supposed to throttle the rate of the
backup. But it only works properly if the server is mostly idle.Every non-trivial call to XLogFlush or XLogBackgroundFlush will wake up the
wal sender (the one which is not really sending wal, but base files), and
the throttling routine doesn't go back to sleep after being awoke
early. Rather, it releases another 32kb of data.
Sorry, I missed this fact when coding the feature.
Should the basebackup.c throttle sleep in a loop until its time has
expired?
I think this is the correct approach because latch can be set for unrelated
reasons.
The patch makes sense to me. I just recommend moving this part in front of the
loop because elapsed_min does not have to be re-computed each time:
/* How much should have elapsed at minimum? */
elapsed_min = elapsed_min_unit * (throttling_counter / throttling_sample);
And also a comment explaining the purpose of the loop would be
appreciated. Thanks.
--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Sat, Sep 2, 2017 at 6:42 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
I'm attaching a patch for each option. Each one independently solves the
problem. But I think we should do both. There is no point in issuing
unnecessary kill system calls, and there may also be more spurious wake-ups
than just these ones.
I agree that 1) and 2) are sensible things to do now. At some point I
think that we will want do_pg_stop_backup() to use a wait event
instead of a pg_usleep call when waiting for a segment to be archived,
in which case we could use WalSndWakeup() to set the latch and
accelerate things. So it would be nice to keep track of this
possibility in the code. We could as well do that with a new API only
signalling WAL senders in backup state though.
SpinLockAcquire(&walsnd->mutex);
latch = walsnd->latch;
+ state = walsnd->state;
SpinLockRelease(&walsnd->mutex);
- if (latch != NULL)
+ if (latch != NULL && state != WALSNDSTATE_BACKUP)
SetLatch(latch);
This surely meritates a comment.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Jeff Janes wrote:
I'm attaching a patch for each option. Each one independently solves the
problem. But I think we should do both. There is no point in issuing
unnecessary kill system calls, and there may also be more spurious wake-ups
than just these ones.
I modified patch 1 a bit -- result attached. I would like to back-patch
this all the way back to 9.4, but there are a few conflicts due to
c29aff959dc so I think I'm going to do pg10 only unless I get some votes
that it should go further back. I think it should be fairly trivial to
resolve.
The other patch should be applied to master only IMO.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
0001-fix-basebackup-throttle.patchtext/plain; charset=us-asciiDownload
From e482f52f792df3685f4f9d955d54e04570db5cc5 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: Mon, 4 Sep 2017 13:18:44 +0200
Subject: [PATCH] fix basebackup throttle
---
src/backend/replication/basebackup.c | 34 +++++++++++++++++++++++-----------
1 file changed, 23 insertions(+), 11 deletions(-)
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 9776858f03..4db79d733c 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -1336,10 +1336,7 @@ _tarWriteDir(const char *pathbuf, int basepathlen, struct stat *statbuf,
static void
throttle(size_t increment)
{
- TimeOffset elapsed,
- elapsed_min,
- sleep;
- int wait_result;
+ TimeOffset elapsed_min;
if (throttling_counter < 0)
return;
@@ -1348,14 +1345,29 @@ throttle(size_t increment)
if (throttling_counter < throttling_sample)
return;
- /* Time elapsed since the last measurement (and possible wake up). */
- elapsed = GetCurrentTimestamp() - throttled_last;
- /* How much should have elapsed at minimum? */
- elapsed_min = elapsed_min_unit * (throttling_counter / throttling_sample);
- sleep = elapsed_min - elapsed;
- /* Only sleep if the transfer is faster than it should be. */
- if (sleep > 0)
+ /* How much time should have elapsed at minimum? */
+ elapsed_min = elapsed_min_unit *
+ (throttling_counter / throttling_sample);
+
+ /*
+ * Since the latch could be set repeatedly because of concurrently WAL
+ * activity, sleep enough times in a loop to ensure enough time has
+ * passed.
+ */
+ for (;;)
{
+ TimeOffset elapsed,
+ sleep;
+ int wait_result;
+
+ /* Time elapsed since the last measurement (and possible wake up). */
+ elapsed = GetCurrentTimestamp() - throttled_last;
+
+ /* sleep if the transfer is faster than it should be */
+ sleep = elapsed_min - elapsed;
+ if (sleep <= 0)
+ break;
+
ResetLatch(MyLatch);
/* We're eating a potentially set latch, so check for interrupts */
--
2.11.0
Alvaro Herrera wrote:
Jeff Janes wrote:
I'm attaching a patch for each option. Each one independently solves the
problem. But I think we should do both. There is no point in issuing
unnecessary kill system calls, and there may also be more spurious wake-ups
than just these ones.I modified patch 1 a bit -- result attached. I would like to back-patch
this all the way back to 9.4, but there are a few conflicts due to
c29aff959dc so I think I'm going to do pg10 only unless I get some votes
that it should go further back. I think it should be fairly trivial to
resolve.
Pushed patch 1 to master and pg10.
Please add the other one to commitfest.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers