Latch for the WAL writer - further reducing idle wake-ups.

Started by Peter Geogheganalmost 14 years ago24 messageshackers
Jump to latest

Attached patch latches up the WAL Writer, reducing wake-ups and thus
saving electricity in a way that is more-or-less analogous to my work
on the BGWriter:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=6d90eaaa89a007e0d365f49d6436f35d2392cfeb

I am hoping this gets into 9.2 . I am concious of the fact that this
is quite late, but it the patch addresses an open item, the concluding
part of a much wider feature. In any case, it is a useful patch, that
ought to be committed at some point. I should point out:

1. This functionality was covered by the group commit patch that I
worked on back in January, which was submitted in advance of the
commitfest deadline. However, an alternative implementation was
ultimately committed that did not consider WAL Writer wake-ups.

2. The WAL writer is the most important auxiliary process to latch-up.
Though it is tied with the BGWriter at 5 wake-ups per second by
default, I consider the WAL Writer to be more important than the
BGWriter because I find it much more plausible that the WAL Writer
really won't need to be around for much of the time, as with a
read-mostly work load. "Cloud" type deployments often have read-mostly
workloads, so we can still save some power even if the DB is actually
servicing lots of read queries. That being the case, it would be a
shame if we didn't get this last one in, as it adds a lot more value
than any of the other patches.

3. This is a fairly simple patch; as I've said, it works in a way that
is quite analogous to the BGWriter patch, applying lessons learned
there.

With this patch, my instrumentation shows that wake-ups when Postgres
reaches a fully idle state are just 2.7 per second for the entire
postgres process group, quite an improvement on the 7.6 per second in
HEAD. This is exactly what you'd expect from a reduction of 5 wake-ups
per second to 0.1 per second on average for the WAL Writer.

I have determined this with PowerTOP 1.13 on my Fedora 16 laptop. Here
is an example session, began after the cluster reached a fully idle
state, with this patch applied (if, alternatively, I want to see
things at per-process granularity, I can get that from PowerTOP 1.98
beta 1, which is available from my system's package manager):

[peter@peterlaptop powertop-1.13]$ sudo ./powertop -d --time=300
[sudo] password for peter:
PowerTOP 1.13 (C) 2007 - 2010 Intel Corporation

Collecting data for 300 seconds

Cn Avg residency
C0 (cpu running) ( 2.8%)
polling 0.0ms ( 0.0%)
C1 mwait 0.5ms ( 1.0%)
C2 mwait 0.9ms ( 0.6%)
C3 mwait 1.4ms ( 0.1%)
C4 mwait 6.7ms (95.4%)
P-states (frequencies)
2.61 Ghz 5.7%
1.80 Ghz 0.1%
1200 Mhz 0.1%
1000 Mhz 0.2%
800 Mhz 93.5%
Wakeups-from-idle per second : 171.3 interval: 300.0s
no ACPI power usage estimate available
Top causes for wakeups:
23.0% (134.5) chrome
***SNIP***
0.5% ( 2.7) postgres
***SNIP***

This is a rather low number, that will make us really competitive with
other RDBMSs in this area. Recall that we started from 11.5 wake-ups
for an idle Postgres cluster with a default configuration.

To put the 2.7 number in context, I measured MySQL's wake-ups at 2.2
last year (mysql-server version 5.1.56, Fedora 14), though I
subsequently saw much higher numbers (over 20 per second) for version
5.5.19 on Fedora 16, so you should probably take that with a grain of
salt - I don't know anything about MySQL, and so cannot really be sure
that I'm making an objective comparison in comparing that number with
the number of wake-ups Postgres has with a stock postgresql.conf.

I've employed the same trick used when a buffer is dirtied for the
BGWriter - most of the time, the SetLatch() calls will check a single
flag, and find it already set. We are careful to only "arm" the latch
with a call to ResetLatch() when it is really needed. Rather than
waiting for the clocksweep to be lapped, we wait for a set number of
iterations of consistent inactivity.

I've made the WAL Writer use its process latch, rather than the latch
that was previously within XLogCtl. This seems much more idiomatic, as
in doing so we reserve the right to register generic signal handlers.
With a non-process latch, we'd have to worry about signal invalidation
issues on an ongoing basis, since the handler wouldn't be calling
SetLatch() against the latch we waited on. I have also added a comment
in latch.h generally advising against ad-hoc shared latches where .

I took initial steps to quantify the performance hit from this patch.
A simple "insert.sql" pgbench-tools benchmark on my laptop, with a
generic configuration showed no problems, though I do not assume that
this conclusively proves the case. Results:

http://walwriterlatch.staticloud.com/

My choice of XLogInsert() as an additional site at which to call
SetLatch() was one that wasn't taken easily, and frankly I'm not
entirely confident that I couldn't have been just as effective while
placing the SetLatch() call in a less hot, perhaps higher-level
codepath. That said, MarkBufferDirty() is also a very hot code path,
and it's where one of the SetLatch() calls goes in the earlier
BGWriter patch, besides which I haven't been able to quantify any
performance hit as yet.

Thoughts?

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

Attachments:

walwriter_latch_v1_2012_05_02.patchapplication/octet-stream; name=walwriter_latch_v1_2012_05_02.patchDownload+200-117
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Geoghegan (#1)
Re: Latch for the WAL writer - further reducing idle wake-ups.

Peter Geoghegan <peter@2ndquadrant.com> writes:

Attached patch latches up the WAL Writer, reducing wake-ups and thus
saving electricity in a way that is more-or-less analogous to my work
on the BGWriter:
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=6d90eaaa89a007e0d365f49d6436f35d2392cfeb
I am hoping this gets into 9.2 . I am concious of the fact that this
is quite late, but it the patch addresses an open item, the concluding
part of a much wider feature.

It is getting a bit late to be considering such changes for 9.2, but
I'm willing to review and commit this if there's not anybody who feels
strongly that it's too late. Personally I think it's in the nature of
cleanup and so fair game as long as we haven't formally started beta.
However I will confess to some bias about wanting to get the server's
idle wake-up rate down, because Fedora people have been bugging me
about that for a long time now. So I'm probably not the best person to
objectively evaluate whether we should hold this for 9.3. Comments?

Schedule questions aside, I'm disturbed by this bit:

My choice of XLogInsert() as an additional site at which to call
SetLatch() was one that wasn't taken easily, and frankly I'm not
entirely confident that I couldn't have been just as effective while
placing the SetLatch() call in a less hot, perhaps higher-level
codepath.

Adding any contention at all to XLogInsert doesn't seem like a smart
idea, even if you failed to measure any problem in the specific tests
you made. I wonder whether we could not improve matters by adding
an additional bool "wal_writer_needs_wakening" in the state that's
considered to be protected by WALInsertLock. XLogInsert would check
this while still holding the lock, and only consider that it needs to do
a SetLatch if the flag was set, whereupon it would clear it before
releasing the lock. In the normal case this would add one uncontended
fetch followed by boolean-test-and-jump to the work done while holding
the lock, which should be pretty negligible. Then, the WAL writer would
need to take WALInsertLock to set that flag, but presumably it should
only be doing that when there is no contention for the lock. (In fact,
we could have it do a ConditionalLockAcquire on WALInsertLock for the
purpose, and consider that failure means it shouldn't go to sleep after
all.)

Now this might sound pretty much equivalent to testing the latch's
is_set flag; perhaps it is and I'm worrying over nothing. But I'm
thinking that the wal_writer_needs_wakening flag would be in a cache
line that an acquirer of WALInsertLock would have to get ownership of
anyway, if it is adjacent to variables that XLogInsert has to manipulate
anyway. On the other hand, the WAL writer's process latch would be in
some other cache line that would also need to get passed around a lot,
if it's touched during every XLogInsert.

regards, tom lane

#3Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#2)
Re: Latch for the WAL writer - further reducing idle wake-ups.

On Wed, May 2, 2012 at 7:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

It is getting a bit late to be considering such changes for 9.2, but
I'm willing to review and commit this if there's not anybody who feels
strongly that it's too late.  Personally I think it's in the nature of
cleanup and so fair game as long as we haven't formally started beta.
However I will confess to some bias about wanting to get the server's
idle wake-up rate down, because Fedora people have been bugging me
about that for a long time now.  So I'm probably not the best person to
objectively evaluate whether we should hold this for 9.3.  Comments?

Well, I feel that one of the weaknesses of our CommitFest process is
that changes like this (which are really pretty small) end up having
the same deadline as patches that are large (command triggers,
checksums, etc.); in fact, they sometimes end up having an earlier
deadline, because the people doing the big stuff end up continuing to
hack on it for another couple months while the door is shut to smaller
improvements. So I'm not going to object if you feel like slipping
this one in. I looked it over myself and I think it's broadly
reasonable, although I'm not too sure about the particular criteria
chosen for sending the WAL writer to sleep and waking it up again.
And like you I'd like to see some more improvement in this area.

Adding any contention at all to XLogInsert doesn't seem like a smart
idea, even if you failed to measure any problem in the specific tests
you made.  I wonder whether we could not improve matters by adding
an additional bool "wal_writer_needs_wakening" in the state that's
considered to be protected by WALInsertLock.

I am skeptical about this, although it could be right. It could also
be better the way Peter did it; a fetch of an uncontended cache line
is pretty cheap. Another approach - which I think might be better
still - is to not bother kicking the WAL writer and let it wake up
when it wakes up. Maybe have it hibernate for 3 seconds instead of
10, or something like that. It seems unlikely to cause any real
problem if WAL writer takes a couple seconds to get with the program
after a long period of inactivity; note that an async commit will kick
it anyway, and a sync commit will probably half to flush WAL whether
the WAL writer wakes up or not.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#3)
Re: Latch for the WAL writer - further reducing idle wake-ups.

Robert Haas <robertmhaas@gmail.com> writes:

... It seems unlikely to cause any real
problem if WAL writer takes a couple seconds to get with the program
after a long period of inactivity; note that an async commit will kick
it anyway, and a sync commit will probably half to flush WAL whether
the WAL writer wakes up or not.

That's a good point. What about only kicking the WAL writer in code
paths where a backend found itself having to write/flush WAL for itself?
The added overhead is very surely negligible in such a situation.

regards, tom lane

#5Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#4)
Re: Latch for the WAL writer - further reducing idle wake-ups.

On Wed, May 2, 2012 at 11:42 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

... It seems unlikely to cause any real
problem if WAL writer takes a couple seconds to get with the program
after a long period of inactivity; note that an async commit will kick
it anyway, and a sync commit will probably half to flush WAL whether
the WAL writer wakes up or not.

That's a good point.  What about only kicking the WAL writer in code
paths where a backend found itself having to write/flush WAL for itself?
The added overhead is very surely negligible in such a situation.

Yeah, I think that would make sense, though I'd probably still argue
for a hibernation period not quite so long as ten seconds. Actually,
what I'd really like is for this to be adaptive: if we find that
there's no WAL to write, increase the time until the next wakeup by 10
ms until we hit the maximum of, say, 3 seconds. If we find that there
is WAL to write, cut the time until the next wakeup in half until we
hit a minimum of, say, 20ms. And, if we're forced to write/flush WAL
ourselves, or we async commit, kick the WAL writer in the pants and
wake him up right away. That way we're willing to get
super-aggressive when needed, but we don't stay there very long once
the pounding ends. Also, we avoid having a hard "cut" between regular
sleeps and deep hibernation; instead, we kind of gradually drift off.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#6Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Robert Haas (#3)
Re: Latch for the WAL writer - further reducing idle wake-ups.

On 03.05.2012 03:41, Robert Haas wrote:

On Wed, May 2, 2012 at 7:21 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:

Adding any contention at all to XLogInsert doesn't seem like a smart
idea, even if you failed to measure any problem in the specific tests
you made. I wonder whether we could not improve matters by adding
an additional bool "wal_writer_needs_wakening" in the state that's
considered to be protected by WALInsertLock.

I am skeptical about this, although it could be right. It could also
be better the way Peter did it; a fetch of an uncontended cache line
is pretty cheap.

I'm very wary of adding any extra shared memory accesses to XLogInsert.
I spent a lot of time trying to eliminate them in my XLogInsert scaling
patch. It might be ok if the flag is usually not modified, and we don't
add any extra barrier instructions in there, but it would be better to
avoid it.

One simple idea would be to only try to set the latch every 100
XLogInsert calls in the backend. That would cut whatever contention it
might cause by a factor of 100, making it negligible.

Another approach - which I think might be better
still - is to not bother kicking the WAL writer and let it wake up
when it wakes up. Maybe have it hibernate for 3 seconds instead of
10, or something like that. It seems unlikely to cause any real
problem if WAL writer takes a couple seconds to get with the program
after a long period of inactivity; note that an async commit will kick
it anyway, and a sync commit will probably half to flush WAL whether
the WAL writer wakes up or not.

Yeah, that'd be even simpler.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#7Magnus Hagander
magnus@hagander.net
In reply to: Robert Haas (#3)
Re: Latch for the WAL writer - further reducing idle wake-ups.

On Thu, May 3, 2012 at 2:41 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, May 2, 2012 at 7:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

It is getting a bit late to be considering such changes for 9.2, but
I'm willing to review and commit this if there's not anybody who feels
strongly that it's too late.  Personally I think it's in the nature of
cleanup and so fair game as long as we haven't formally started beta.
However I will confess to some bias about wanting to get the server's
idle wake-up rate down, because Fedora people have been bugging me
about that for a long time now.  So I'm probably not the best person to
objectively evaluate whether we should hold this for 9.3.  Comments?

Well, I feel that one of the weaknesses of our CommitFest process is
that changes like this (which are really pretty small) end up having
the same deadline as patches that are large (command triggers,
checksums, etc.); in fact, they sometimes end up having an earlier
deadline, because the people doing the big stuff end up continuing to
hack on it for another couple months while the door is shut to smaller
improvements.  So I'm not going to object if you feel like slipping
this one in.  I looked it over myself and I think it's broadly
reasonable, although I'm not too sure about the particular criteria
chosen for sending the WAL writer to sleep and waking it up again.
And like you I'd like to see some more improvement in this area.

I agree that it's ok to slip it in given that it's "finishing off a
patch from earlier". I think it's reasonable to hold it to a little
bit higher review stadards since it's that late in the cycle though,
such as two people reviewing it before it goes in (or 1 reviewer + 1
committer - and of course, unless it's a truly trivial patch). Which
it seems you both are doing now, so that makes it ok ;)

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

In reply to: Magnus Hagander (#7)
Re: Latch for the WAL writer - further reducing idle wake-ups.

On 3 May 2012 10:56, Magnus Hagander <magnus@hagander.net> wrote:

I agree that it's ok to slip it in given that it's "finishing off a
patch from earlier". I think it's reasonable to hold it to a little
bit higher review stadards since it's that late in the cycle though,
such as two people reviewing it before it goes in (or 1 reviewer + 1
committer - and of course, unless it's a truly trivial patch). Which
it seems you both are doing now, so that makes it ok ;)

Right. It's a simple, largely mechanical patch, that doesn't have any
behavioural changes, and is of strategic importance, so I thought it
was worthy of special consideration, without actually expecting it.

Attached patch removes the questionable SetLatch() call, under the
assumption that it's okay if the WALWriter, having entered hibernation
due to sustained inactivity (10 seconds) subsequently takes up to 5
seconds (2.5 on average) to notice that it has work to do. These
values may need to be tweaked. I have not bothered with making the
sleep time adaptive, because it's probably too late to do that.

This latest revision also covers the checkpointer. The code for that
is far simpler than that for the WAL Writer, so it doesn't
particularly feel like I'm pushing my luck by slipping that into
something to be slipped in. I should not have excluded it before,
since it accounts for another 2 wake-ups per second. All told, this
new revision sees Postgres wake-ups stabilise at 0.9 per second. With
the checkpointer code included, we roundly beat MySQL in this area,
which will be a nice advocacy message for 9.2, though I probably
shouldn't be quoted on that until I get the opportunity to go back and
make absolutely sure that I've been fair.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

Attachments:

walwriter_latch_v2_2012_05_04.patchapplication/octet-stream; name=walwriter_latch_v2_2012_05_04.patchDownload+244-127
#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Geoghegan (#8)
Re: Latch for the WAL writer - further reducing idle wake-ups.

Peter Geoghegan <peter@2ndquadrant.com> writes:

This latest revision also covers the checkpointer. The code for that
is far simpler than that for the WAL Writer, so it doesn't
particularly feel like I'm pushing my luck by slipping that into
something to be slipped in.

Well ... maybe, or maybe not, or maybe you are just poking at a sore
spot that was already created by the patch to make a separate
checkpointer process. What bothers me in looking at this is that the
main loop of the checkpointer includes an AbsorbFsyncRequests() call,
which is now the only wakeup condition that isn't covered by latch
logic or a predictable time delay. A long sleep period could easily
result in overflow of the fsync request queue, which is not good for
performance. I'm inclined to think that we'd better add logic to
ForwardFsyncRequest() to set the latch once the queue is, say, more
than half full.

I also notice that the separate-checkpointer patch failed to rename
assorted things like BgWriterCommLock, BgWriterRequest,
BgWriterShmemStruct, which are all 100% inappropriately named now.
And it still contains various obsolete comments referring to itself
as the background writer. Will see about cleaning that up.

regards, tom lane

#10Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#9)
Re: Latch for the WAL writer - further reducing idle wake-ups.

On 7 May 2012 18:09, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Peter Geoghegan <peter@2ndquadrant.com> writes:

This latest revision also covers the checkpointer. The code for that
is far simpler than that for the WAL Writer, so it doesn't
particularly feel like I'm pushing my luck by slipping that into
something to be slipped in.

Well ... maybe, or maybe not, or maybe you are just poking at a sore
spot that was already created by the patch to make a separate
checkpointer process.  What bothers me in looking at this is that the
main loop of the checkpointer includes an AbsorbFsyncRequests() call,
which is now the only wakeup condition that isn't covered by latch
logic or a predictable time delay.  A long sleep period could easily
result in overflow of the fsync request queue, which is not good for
performance.  I'm inclined to think that we'd better add logic to
ForwardFsyncRequest() to set the latch once the queue is, say, more
than half full.

OK

I also notice that the separate-checkpointer patch failed to rename
assorted things like BgWriterCommLock, BgWriterRequest,
BgWriterShmemStruct, which are all 100% inappropriately named now.
And it still contains various obsolete comments referring to itself
as the background writer.  Will see about cleaning that up.

For want of a better name, keeping them the same seemed best.

If you have a suggested name change, I'd be happy to oblige.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#10)
Re: Latch for the WAL writer - further reducing idle wake-ups.

Simon Riggs <simon@2ndQuadrant.com> writes:

On 7 May 2012 18:09, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I also notice that the separate-checkpointer patch failed to rename
assorted things like BgWriterCommLock, BgWriterRequest,
BgWriterShmemStruct, which are all 100% inappropriately named now.
And it still contains various obsolete comments referring to itself
as the background writer. �Will see about cleaning that up.

For want of a better name, keeping them the same seemed best.

I was just thinking s/BgWriter/Checkpointer/, do you think that's too
long?

regards, tom lane

#12Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#11)
Re: Latch for the WAL writer - further reducing idle wake-ups.

On 7 May 2012 19:44, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Simon Riggs <simon@2ndQuadrant.com> writes:

On 7 May 2012 18:09, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I also notice that the separate-checkpointer patch failed to rename
assorted things like BgWriterCommLock, BgWriterRequest,
BgWriterShmemStruct, which are all 100% inappropriately named now.
And it still contains various obsolete comments referring to itself
as the background writer.  Will see about cleaning that up.

For want of a better name, keeping them the same seemed best.

I was just thinking s/BgWriter/Checkpointer/, do you think that's too
long?

CheckpointerCommLock
CheckpointerShmemStruct
work OK

CheckpointerRequest
sounds a little vague, but can be tweaked

It also leaves the situation that we have a catalog view called
pg_stat_bgwriter that would be accessing "checkpointer" things. That's
really the thorny one that I wasn't sure how to handle. Good example
of why we shouldn't expose internals too much.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#12)
Re: Latch for the WAL writer - further reducing idle wake-ups.

Simon Riggs <simon@2ndQuadrant.com> writes:

It also leaves the situation that we have a catalog view called
pg_stat_bgwriter that would be accessing "checkpointer" things. That's
really the thorny one that I wasn't sure how to handle. Good example
of why we shouldn't expose internals too much.

Yeah, that's a bit unfortunate but changing it doesn't seem like a good
idea. The names I intended to change are all internal.

regards, tom lane

#14Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#13)
Re: Latch for the WAL writer - further reducing idle wake-ups.

On 7 May 2012 20:06, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Simon Riggs <simon@2ndQuadrant.com> writes:

It also leaves the situation that we have a catalog view called
pg_stat_bgwriter that would be accessing "checkpointer" things. That's
really the thorny one that I wasn't sure how to handle. Good example
of why we shouldn't expose internals too much.

Yeah, that's a bit unfortunate but changing it doesn't seem like a good
idea.  The names I intended to change are all internal.

OK, I'll change just the internal names.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Geoghegan (#8)
Re: Latch for the WAL writer - further reducing idle wake-ups.

Peter Geoghegan <peter@2ndquadrant.com> writes:

Attached patch removes the questionable SetLatch() call, under the
assumption that it's okay if the WALWriter, having entered hibernation
due to sustained inactivity (10 seconds) subsequently takes up to 5
seconds (2.5 on average) to notice that it has work to do. These
values may need to be tweaked. I have not bothered with making the
sleep time adaptive, because it's probably too late to do that.

Now that I've actually read the patch, rather than just responding to
your description of it, I find myself entirely unhappy with the proposed
changes in the walwriter's sleep logic. You have introduced race
conditions (it is NOT okay to reset the latch somewhere below the top of
the loop) and degraded the walwriter's signal response time in normal
non-hibernation state, to solve a problem not in evidence; to wit that
backends spend too much time signaling the walwriter. Given the
location of the only existing SetLatch call for the purpose, I find that
proposition more than a bit doubtful. I see little or no value in
trying to keep the walwriter's procLatch set when it's not hibernating,
and I definitely don't think it is worth the risk of introducing bugs
when we're about to start beta. I'm intending to rip all that out and
go back to the plain "ResetLatch at the top of the loop, WaitLatch at
the bottom" design, with the hibernation logic just controlling the
timeout on the WaitLatch call.

regards, tom lane

In reply to: Tom Lane (#15)
Re: Latch for the WAL writer - further reducing idle wake-ups.

On 8 May 2012 22:35, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Now that I've actually read the patch, rather than just responding to
your description of it, I find myself entirely unhappy with the proposed
changes in the walwriter's sleep logic.  You have introduced race
conditions (it is NOT okay to reset the latch somewhere below the top of
the loop)

Yes, there is some checking of flags before the potential ResetLatch()
call, which may be acted on. The code there is almost identical to
that of the extant bgwriter code. I was under the impression that this
did not amount to a race, though it's rather late now, and I'm feeling
under the weather, so I have not taken steps to verify that I have it
right. Arguably, you'd want somebody's SetLatch call to be ignored if

and degraded the walwriter's signal response time in normal
non-hibernation state, to solve a problem not in evidence; to wit that
backends spend too much time signaling the walwriter.  Given the
location of the only existing SetLatch call for the purpose, I find that
proposition more than a bit doubtful.

I do too. The elaborate logic to reduce that overhead was nothing more
than a vestige of the first version. I apologise for making such a
basic error.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Geoghegan (#16)
Re: Latch for the WAL writer - further reducing idle wake-ups.

Peter Geoghegan <peter@2ndquadrant.com> writes:

On 8 May 2012 22:35, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Now that I've actually read the patch, rather than just responding to
your description of it, I find myself entirely unhappy with the proposed
changes in the walwriter's sleep logic. �You have introduced race
conditions (it is NOT okay to reset the latch somewhere below the top of
the loop)

Yes, there is some checking of flags before the potential ResetLatch()
call, which may be acted on. The code there is almost identical to
that of the extant bgwriter code.

Um, yes, I noticed that shortly after sending my previous message.
I'm pretty unhappy about the current state of the bgwriter loop, too.
I rather wonder whether that coding explains the "postmaster failed to
shut down" errors that we've been seeing lately in the buildfarm.
Not noticing a shutdown signal promptly would go a long way towards
explaining that.

regards, tom lane

In reply to: Peter Geoghegan (#16)
Re: Latch for the WAL writer - further reducing idle wake-ups.

On 9 May 2012 00:21, Peter Geoghegan <peter@2ndquadrant.com> wrote:

Yes, there is some checking of flags before the potential ResetLatch()
call, which may be acted on. The code there is almost identical to
that of the extant bgwriter code. I was under the impression that this
did not amount to a race, though it's rather late now, and I'm feeling
under the weather, so I have not taken steps to verify that I have it
right. Arguably, you'd want somebody's SetLatch call to be ignored if

Sent too early. That should be: Arguably, you'd want somebody's
SetLatch call to be ignored under the circumstances that that could
happen in both the bgwriter, and the WALWriter within my recent patch.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Geoghegan (#18)
Re: Latch for the WAL writer - further reducing idle wake-ups.

I've applied the walwriter/checkpointer patch, with the mentioned
re-simplification of the logic. While measuring that, I noticed that
the stats collector was now the biggest repeated-wakeup culprit, so
I took the time to latch-ify it as well. AFAICS we no longer have
anything that wakes up oftener than once every five seconds when the
system is idle, so life is looking pretty good in powertop land.

regards, tom lane

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#19)
Re: Latch for the WAL writer - further reducing idle wake-ups.

On further reflection I've realized that there's a really unpleasant
consequence of the walwriter change as-committed: it breaks the former
guarantee that async commits would reach disk within at most 3 times
the WalWriterDelay setting. They will still get written within at most
3 walwriter cycles, but a lone async commit occurring when the writer
is hibernating could see a delay much longer than before. This seems to
me to be unacceptable. Probably nobody cares that much about the exact
multiplier of 3, but if the delay could be an order of magnitude or two
more than that, that's going to make users of async commits unhappy.

So what we need here is for XLogSetAsyncXactLSN() to be able to boot the
walwriter out of hibernate mode. I still don't care in the least for
the original hack of using the state of the procLatch to indicate
whether the walwriter is hibernating, but we can add a separate flag
instead so as to avoid having every trip through XLogSetAsyncXactLSN
do a SetLatch call (which would be bad anyway since it would prevent
the walwriter from sleeping normally). Since XLogSetAsyncXactLSN
has to take the info_lck anyway, we might as well make this new flag
be protected by info_lck. The walwriter won't need to change the
flag's state very often, by definition, so that end of it isn't going
to cost anything noticeable.

regards, tom lane

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Geoghegan (#18)
#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#21)
#23Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#22)
#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#23)