WalSndWakeup() and synchronous_commit=off
Hi all,
I noticed that when synchronous_commit=off were not waking up the wal sender
latch in xact.c:RecordTransactionCommit which leads to ugly delays of approx 7
seconds (1 + replication_timeout/10) with default settings.
Given that were flushing the wal to disk much sooner this appears to be a bad
idea - especially as this may happen even under load if we ever reach the
'coughtup' state.
I wonder why the WalSndWakeup isn't done like:
diff --git a/src/backend/access/transam/xlog.c
b/src/backend/access/transam/xlog.c
index ecb71b6..7a3224b 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1906,6 +1906,10 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible, bool
xlog_switch)
xlogctl->LogwrtRqst.Flush = LogwrtResult.Flush;
SpinLockRelease(&xlogctl->info_lck);
}
+
+ /* the walsender wasn't woken up in xact.c */
+ if(max_wal_senders > 1 && synchronous_commit == SYNCHRONOUS_COMMIT_OFF)
+ WalSndWakeup();
}
Doing that for the synchronous_commit=off case can imo be considered a bugfix,
but I wonder why we ever wake the senders somewhere else?
The only argument I can see for doing it at places like StartTransactionCommit
is that thats the place after which the data will be visible on the client. I
think thats a non-argument though because if wal is flushed to disk outside of
a commit there normally is enough data to make it worthwile.
Doing the above results in a very noticeable reduction in lagginess and even a
noticeable reduction in cpu-usage spikes on a busy replication test setup.
Greetings,
Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 10 May 2012 20:51, Andres Freund <andres@2ndquadrant.com> wrote:
I noticed that when synchronous_commit=off were not waking up the wal sender
latch in xact.c:RecordTransactionCommit which leads to ugly delays of approx 7
seconds (1 + replication_timeout/10) with default settings.
Given that were flushing the wal to disk much sooner this appears to be a bad
idea - especially as this may happen even under load if we ever reach the
'coughtup' state.
Sounds like a problem. I'll have a look.
I wonder why the WalSndWakeup isn't done like:
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index ecb71b6..7a3224b 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -1906,6 +1906,10 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible, bool xlog_switch) xlogctl->LogwrtRqst.Flush = LogwrtResult.Flush; SpinLockRelease(&xlogctl->info_lck); } + + /* the walsender wasn't woken up in xact.c */ + if(max_wal_senders > 1 && synchronous_commit == SYNCHRONOUS_COMMIT_OFF) + WalSndWakeup(); }Doing that for the synchronous_commit=off case can imo be considered a bugfix,
but I wonder why we ever wake the senders somewhere else?
The only argument I can see for doing it at places like StartTransactionCommit
is that thats the place after which the data will be visible on the client. I
think thats a non-argument though because if wal is flushed to disk outside of
a commit there normally is enough data to make it worthwile.Doing the above results in a very noticeable reduction in lagginess and even a
noticeable reduction in cpu-usage spikes on a busy replication test setup.Greetings,
Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Fri, May 11, 2012 at 4:51 AM, Andres Freund <andres@2ndquadrant.com> wrote:
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index ecb71b6..7a3224b 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -1906,6 +1906,10 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible, bool xlog_switch) xlogctl->LogwrtRqst.Flush = LogwrtResult.Flush; SpinLockRelease(&xlogctl->info_lck); } + + /* the walsender wasn't woken up in xact.c */ + if(max_wal_senders > 1 && synchronous_commit == SYNCHRONOUS_COMMIT_OFF) + WalSndWakeup(); }
Calling WalSndWakeup() while WALWriteLock is being held might cause another
performance degradation. No?
Regards,
--
Fujii Masao
On Fri, May 11, 2012 at 1:09 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
Calling WalSndWakeup() while WALWriteLock is being held might cause another
performance degradation. No?
That definitely doesn't seem ideal - a lot of things can pile up
behind WALWriteLock. I'm not sure how big a problem it would be in
practice, but we generally make a practice of avoiding sending signals
while holding LWLocks whenever possible...
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Friday, May 11, 2012 07:20:26 PM Robert Haas wrote:
On Fri, May 11, 2012 at 1:09 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
Calling WalSndWakeup() while WALWriteLock is being held might cause
another performance degradation. No?That definitely doesn't seem ideal - a lot of things can pile up
behind WALWriteLock. I'm not sure how big a problem it would be in
practice, but we generally make a practice of avoiding sending signals
while holding LWLocks whenever possible...
In my measurements on moderately powerful hardware I couldn't see any
degradation on the primary - in fact the contrary, but the improvements were
around 0.4% and I only tested 10min so its not exactly hard evidence. But I
aggree its not ideal.
Its the only place though which knows whether its actually sensible to wakeup
the walsender. We could make it return whether it wrote anything and do the
wakeup at the callers. I count 4 different callsites which would be an
annoying duplication but I don't really see anything better right now.
Better Ideas?
Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Robert Haas <robertmhaas@gmail.com> writes:
That definitely doesn't seem ideal - a lot of things can pile up
behind WALWriteLock. I'm not sure how big a problem it would be in
practice, but we generally make a practice of avoiding sending signals
while holding LWLocks whenever possible...
There's a good reason for that, which is that the scheduler might well
decide to go run the wakened process instead of you. Admittedly this
tends to not be a problem on machines with $bignum CPUs, but on
single-CPU machines I've seen it happen a lot.
Refactoring so that the signal is sent only after lock release seems
like a good idea to me.
regards, tom lane
On Friday, May 11, 2012 08:36:24 PM Tom Lane wrote:
Robert Haas <robertmhaas@gmail.com> writes:
That definitely doesn't seem ideal - a lot of things can pile up
behind WALWriteLock. I'm not sure how big a problem it would be in
practice, but we generally make a practice of avoiding sending signals
while holding LWLocks whenever possible...There's a good reason for that, which is that the scheduler might well
decide to go run the wakened process instead of you. Admittedly this
tends to not be a problem on machines with $bignum CPUs, but on
single-CPU machines I've seen it happen a lot.Refactoring so that the signal is sent only after lock release seems
like a good idea to me.
Will send a patch lateron, duplication seems to be manageable.
Andres
Andres Freund <andres@2ndquadrant.com> writes:
Its the only place though which knows whether its actually sensible to wakeup
the walsender. We could make it return whether it wrote anything and do the
wakeup at the callers. I count 4 different callsites which would be an
annoying duplication but I don't really see anything better right now.
Another point here is that XLogWrite is not only normally called with
the lock held, but inside a critical section. I see no reason to take
the risk of doing signal sending inside critical sections.
BTW, a depressingly large fraction of the existing calls to WalSndWakeup
are also inside critical sections, generally for no good reason that I
can see. For example, in EndPrepare(), why was the call placed where
it is and not down beside SyncRepWaitForLSN?
regards, tom lane
On 11 May 2012 19:45, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andres Freund <andres@2ndquadrant.com> writes:
Its the only place though which knows whether its actually sensible to wakeup
the walsender. We could make it return whether it wrote anything and do the
wakeup at the callers. I count 4 different callsites which would be an
annoying duplication but I don't really see anything better right now.Another point here is that XLogWrite is not only normally called with
the lock held, but inside a critical section. I see no reason to take
the risk of doing signal sending inside critical sections.BTW, a depressingly large fraction of the existing calls to WalSndWakeup
are also inside critical sections, generally for no good reason that I
can see. For example, in EndPrepare(), why was the call placed where
it is and not down beside SyncRepWaitForLSN?
I think because nobody thought of that.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Friday, May 11, 2012 08:45:23 PM Tom Lane wrote:
Andres Freund <andres@2ndquadrant.com> writes:
Its the only place though which knows whether its actually sensible to
wakeup the walsender. We could make it return whether it wrote anything
and do the wakeup at the callers. I count 4 different callsites which
would be an annoying duplication but I don't really see anything better
right now.Another point here is that XLogWrite is not only normally called with
the lock held, but inside a critical section. I see no reason to take
the risk of doing signal sending inside critical sections.
BTW, a depressingly large fraction of the existing calls to WalSndWakeup
are also inside critical sections, generally for no good reason that I
can see. For example, in EndPrepare(), why was the call placed where
it is and not down beside SyncRepWaitForLSN?
Hm. While I see no real problem moving it out of the lock I don't really see a
way to cleanly outside critical sections everywhere. The impact of doing so
seems to be rather big to me. The only externally visible place where it
actually is known whether we write out data and thus do the wakeup is
XLogInsert, XLogFlush and XLogBackgroundFlush. The first two of those are
routinely called inside a critical section.
Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Mon, May 14, 2012 at 6:32 PM, Andres Freund <andres@2ndquadrant.com> wrote:
On Friday, May 11, 2012 08:45:23 PM Tom Lane wrote:
Andres Freund <andres@2ndquadrant.com> writes:
Its the only place though which knows whether its actually sensible to
wakeup the walsender. We could make it return whether it wrote anything
and do the wakeup at the callers. I count 4 different callsites which
would be an annoying duplication but I don't really see anything better
right now.Another point here is that XLogWrite is not only normally called with
the lock held, but inside a critical section. I see no reason to take
the risk of doing signal sending inside critical sections.BTW, a depressingly large fraction of the existing calls to WalSndWakeup
are also inside critical sections, generally for no good reason that I
can see. For example, in EndPrepare(), why was the call placed where
it is and not down beside SyncRepWaitForLSN?Hm. While I see no real problem moving it out of the lock I don't really see a
way to cleanly outside critical sections everywhere. The impact of doing so
seems to be rather big to me. The only externally visible place where it
actually is known whether we write out data and thus do the wakeup is
XLogInsert, XLogFlush and XLogBackgroundFlush. The first two of those are
routinely called inside a critical section.
So what about moving the existing calls of WalSndWakeup() out of a critical
section and adding new call of WalSndWakeup() into XLogBackgroundFlush()?
Then all WalSndWakeup()s are called outside a critical section and after
releasing WALWriteLock. I attached the patch.
Regards,
--
Fujii Masao
Attachments:
move_walsndwakeup_v1.patchapplication/octet-stream; name=move_walsndwakeup_v1.patchDownload+44-29
On Monday, May 14, 2012 07:55:32 PM Fujii Masao wrote:
On Mon, May 14, 2012 at 6:32 PM, Andres Freund <andres@2ndquadrant.com>
wrote:
On Friday, May 11, 2012 08:45:23 PM Tom Lane wrote:
Andres Freund <andres@2ndquadrant.com> writes:
Its the only place though which knows whether its actually sensible to
wakeup the walsender. We could make it return whether it wrote
anything and do the wakeup at the callers. I count 4 different
callsites which would be an annoying duplication but I don't really
see anything better right now.Another point here is that XLogWrite is not only normally called with
the lock held, but inside a critical section. I see no reason to take
the risk of doing signal sending inside critical sections.BTW, a depressingly large fraction of the existing calls to WalSndWakeup
are also inside critical sections, generally for no good reason that I
can see. For example, in EndPrepare(), why was the call placed where
it is and not down beside SyncRepWaitForLSN?Hm. While I see no real problem moving it out of the lock I don't really
see a way to cleanly outside critical sections everywhere. The impact of
doing so seems to be rather big to me. The only externally visible place
where it actually is known whether we write out data and thus do the
wakeup is XLogInsert, XLogFlush and XLogBackgroundFlush. The first two
of those are routinely called inside a critical section.So what about moving the existing calls of WalSndWakeup() out of a critical
section and adding new call of WalSndWakeup() into XLogBackgroundFlush()?
Then all WalSndWakeup()s are called outside a critical section and after
releasing WALWriteLock. I attached the patch.
Imo its simply not sensible to call WalSndWakeup at *any* of the current
locations. They simply don't have the necessary information. They will wakeup
too often (because with concurrency commits often won't require additional wal
writes) and too seldom (because a flush caused by XLogInsert wont cause a
wakeup).
Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Tuesday, May 15, 2012 05:30:27 PM Andres Freund wrote:
On Monday, May 14, 2012 07:55:32 PM Fujii Masao wrote:
On Mon, May 14, 2012 at 6:32 PM, Andres Freund <andres@2ndquadrant.com>
wrote:
On Friday, May 11, 2012 08:45:23 PM Tom Lane wrote:
Andres Freund <andres@2ndquadrant.com> writes:
Its the only place though which knows whether its actually sensible
to wakeup the walsender. We could make it return whether it wrote
anything and do the wakeup at the callers. I count 4 different
callsites which would be an annoying duplication but I don't really
see anything better right now.Another point here is that XLogWrite is not only normally called with
the lock held, but inside a critical section. I see no reason to take
the risk of doing signal sending inside critical sections.BTW, a depressingly large fraction of the existing calls to
WalSndWakeup are also inside critical sections, generally for no good
reason that I can see. For example, in EndPrepare(), why was the
call placed where it is and not down beside SyncRepWaitForLSN?Hm. While I see no real problem moving it out of the lock I don't
really see a way to cleanly outside critical sections everywhere. The
impact of doing so seems to be rather big to me. The only externally
visible place where it actually is known whether we write out data and
thus do the wakeup is XLogInsert, XLogFlush and XLogBackgroundFlush.
The first two of those are routinely called inside a critical section.So what about moving the existing calls of WalSndWakeup() out of a
critical section and adding new call of WalSndWakeup() into
XLogBackgroundFlush()? Then all WalSndWakeup()s are called outside a
critical section and after releasing WALWriteLock. I attached the patch.Imo its simply not sensible to call WalSndWakeup at *any* of the current
locations. They simply don't have the necessary information. They will
wakeup too often (because with concurrency commits often won't require
additional wal writes) and too seldom (because a flush caused by
XLogInsert wont cause a wakeup).
Does anybody have a better idea than to either call WalSndWakeup() at
essentially the wrong places or calling it inside a critical section?
Tom, what danger do you see from calling it in a critical section?
Greetings,
Andres
Andres Freund <andres@2ndquadrant.com> writes:
Does anybody have a better idea than to either call WalSndWakeup() at
essentially the wrong places or calling it inside a critical section?
Tom, what danger do you see from calling it in a critical section?
My concern was basically that it might throw an error. Looking at the
current implementation of SetLatch, it seems that's not possible, but
I wonder whether we want to lock ourselves into that assumption.
Still, if the alternatives are worse, maybe that's the best answer.
If we do that, though, let's add comments to WalSndWakeup and SetLatch
mentioning that they mustn't throw error.
regards, tom lane
Hi,
On Monday, May 28, 2012 07:11:53 PM Tom Lane wrote:
Andres Freund <andres@2ndquadrant.com> writes:
Does anybody have a better idea than to either call WalSndWakeup() at
essentially the wrong places or calling it inside a critical section?Tom, what danger do you see from calling it in a critical section?
My concern was basically that it might throw an error. Looking at the
current implementation of SetLatch, it seems that's not possible, but
I wonder whether we want to lock ourselves into that assumption.
The assumption is already made at several other places I think.
XLogSetAsyncXactLSN does a SetLatch and is called from critical sections;
several signal handlers call it without any attention to the context.
Requiring it to be called outside would make its usage considerably less
convenient and I don't really see what could change that would require to
throw non-panic errors.
Still, if the alternatives are worse, maybe that's the best answer.
If we do that, though, let's add comments to WalSndWakeup and SetLatch
mentioning that they mustn't throw error.
Patch attached.
Greetings,
Andres
PS: Sorry for dropping the CC list before...
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachments:
0001-Fix-walsender-wakeup-handling.patchtext/x-patch; charset=UTF-8; name=0001-Fix-walsender-wakeup-handling.patchDownload+68-30
On Tuesday, May 29, 2012 08:42:43 PM Andres Freund wrote:
Patch attached.
Imo this patch should be backported to 9.1, 9.0 doesn't use latches and does
not do explicit wakeup of the sender so its not applicable there.
I can prepare a patch for 9.1 if people agree, there has been some amount of
change that won't make it apply cleanly.
Greetings,
Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Wed, May 30, 2012 at 9:46 PM, Andres Freund <andres@2ndquadrant.com> wrote:
On Tuesday, May 29, 2012 08:42:43 PM Andres Freund wrote:
Patch attached.
Imo this patch should be backported to 9.1, 9.0 doesn't use latches and does
not do explicit wakeup of the sender so its not applicable there.I can prepare a patch for 9.1 if people agree, there has been some amount of
change that won't make it apply cleanly.
The patch wakes up walsender more frequently than now. Which leads
walsender to send smaller WAL data packet more frequently, and furthermore
which leads walreceiver to issue fsync more frequently. So I'm afraid that
the patch makes the disk more busy and slows down the read-only query
in the standby. I'm also afraid that frequent fsync calls degrade the
performance
of sync replication. So it's better to do benchmark to address the concerns.
Regards,
--
Fujii Masao
On Thursday, May 31, 2012 01:33:33 AM Fujii Masao wrote:
On Wed, May 30, 2012 at 9:46 PM, Andres Freund <andres@2ndquadrant.com>
wrote:
On Tuesday, May 29, 2012 08:42:43 PM Andres Freund wrote:
Patch attached.
Imo this patch should be backported to 9.1, 9.0 doesn't use latches and
does not do explicit wakeup of the sender so its not applicable there.I can prepare a patch for 9.1 if people agree, there has been some amount
of change that won't make it apply cleanly.The patch wakes up walsender more frequently than now. Which leads
walsender to send smaller WAL data packet more frequently, and furthermore
which leads walreceiver to issue fsync more frequently. So I'm afraid that
the patch makes the disk more busy and slows down the read-only query
in the standby. I'm also afraid that frequent fsync calls degrade the
performance
of sync replication. So it's better to do benchmark to address the
concerns.
I couldn't measure any significant difference in #fsyncs or replication speed
in a busy pgbench workload. If anything there were less, but the difference
was small. Both with synchronous_commit=on/off. I did *not* test sync repl.
Why would you expect a change? Walsender is only signalled if XLogWrite
actually flushed data to disk. Thats the point of the exercise. Thats normally
only done if the wal buffers are full or a commit record (+some other stuff)
requires the xlog to be flushed up to some point.
Greetings,
Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Tuesday, May 29, 2012 08:42:43 PM Andres Freund wrote:
Hi,
On Monday, May 28, 2012 07:11:53 PM Tom Lane wrote:
Andres Freund <andres@2ndquadrant.com> writes:
Does anybody have a better idea than to either call WalSndWakeup() at
essentially the wrong places or calling it inside a critical section?Tom, what danger do you see from calling it in a critical section?
My concern was basically that it might throw an error. Looking at the
current implementation of SetLatch, it seems that's not possible, but
I wonder whether we want to lock ourselves into that assumption.The assumption is already made at several other places I think.
XLogSetAsyncXactLSN does a SetLatch and is called from critical sections;
several signal handlers call it without any attention to the context.Requiring it to be called outside would make its usage considerably less
convenient and I don't really see what could change that would require to
throw non-panic errors.Still, if the alternatives are worse, maybe that's the best answer.
If we do that, though, let's add comments to WalSndWakeup and SetLatch
mentioning that they mustn't throw error.Patch attached.
I would like to invite some more review (+commit...) here ;). Imo this is an
annoying bug which should be fixed before next point release or beta/rc comes
out...
Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 6 June 2012 20:11, Andres Freund <andres@2ndquadrant.com> wrote:
On Tuesday, May 29, 2012 08:42:43 PM Andres Freund wrote:
Hi,
On Monday, May 28, 2012 07:11:53 PM Tom Lane wrote:
Andres Freund <andres@2ndquadrant.com> writes:
Does anybody have a better idea than to either call WalSndWakeup() at
essentially the wrong places or calling it inside a critical section?Tom, what danger do you see from calling it in a critical section?
My concern was basically that it might throw an error. Looking at the
current implementation of SetLatch, it seems that's not possible, but
I wonder whether we want to lock ourselves into that assumption.The assumption is already made at several other places I think.
XLogSetAsyncXactLSN does a SetLatch and is called from critical sections;
several signal handlers call it without any attention to the context.Requiring it to be called outside would make its usage considerably less
convenient and I don't really see what could change that would require to
throw non-panic errors.Still, if the alternatives are worse, maybe that's the best answer.
If we do that, though, let's add comments to WalSndWakeup and SetLatch
mentioning that they mustn't throw error.Patch attached.
I would like to invite some more review (+commit...) here ;). Imo this is an
annoying bug which should be fixed before next point release or beta/rc comes
out...
Moved the wakeup to a logical place outside a critical section.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services