Doc patch making firm recommendation for setting the value of commit_delay

Started by Peter Geoghegan about 13 years ago · 24 messages
#1 Peter Geoghegan
peter@2ndquadrant.com
1 attachment(s)

Some of you will be aware that I've tried to formulate a good general
recommendation for setting the value of commit_delay, in light of the
improvements made in 9.3. I previously asked for help finding a
methodology for optimising commit_delay [1]. The documentation related
to commit_delay still says that we don't know what might work well,
though I don't think that's true any more.

I found the time to do some benchmarks of my own - Greg Smith made
available a server that he frequently uses for benchmarks. It was
previously my observation that "half of raw single-page sync time as
reported by pg_test_fsync for your wal_sync_method" worked best for
commit_delay. I went so far as to modify pg_test_fsync to output
average time per op in microseconds for each operation with
commit_delay in mind, which is a patch that has already been committed
[2]. This general approach worked really well on my laptop, which has
a slowish fsync of about 8 milliseconds (8,000 microseconds), for
which a commit_delay setting of 4,000 (microseconds) seemed to clearly
work best, on both insert and tpc-b benchmarks [3].
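For anyone who wants to apply the rule of thumb mechanically, here is a rough sketch (my own illustration, not from the thread; the pg_test_fsync result-line shape assumed here matches recent versions, so check it against your own output):

```python
import re

def parse_usecs_per_op(pg_test_fsync_output, method):
    # Pull the "usecs/op" figure for a given wal_sync_method from
    # pg_test_fsync output; assumes a result line shaped like
    # "fdatasync    125.000 ops/sec    8000 usecs/op".
    for line in pg_test_fsync_output.splitlines():
        m = re.match(r'\s*' + re.escape(method) +
                     r'\s+[\d.]+\s+ops/sec\s+(\d+)\s+usecs/op', line)
        if m:
            return int(m.group(1))
    raise ValueError("no result line found for " + method)

def suggested_commit_delay(usecs_per_op):
    # Rule of thumb from this thread: half of the raw single-page
    # sync time for your wal_sync_method.
    return usecs_per_op // 2
```

So the slowish 8,000 microsecond laptop fsync mentioned above maps to commit_delay = 4000.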

This server has an Intel 320 SSD series. Greg has already written
quite a bit about this drive [4] for unrelated reasons. For the SSD,
results of an insert-based pgbench-tools run are shown, with
commit_delay at 0, 20 and 59 (determined by following my general
advice above):

http://commit-delay-results-ssd-insert.staticloud.com

I also looked at a 3-disk RAID0 of 7200RPM drives connected through a
512MB battery-backed write cache (Areca controller), again using
insert.sql:

http://commit-delay-stripe-insert.staticloud.com

In addition to a tpc-b benchmark with the same data directory on the
same 3-disk stripe:

http://commit-delay-results-stripe-tpcb.staticloud.com

I used the same postgresql.conf in all cases, which you can see for
each report, and did the usual median-of-three thing (though each run
lasted 120 seconds rather than the default 60). Settings used for one
particular pgbench run can be viewed here (though they're all exactly
the same anyway):

http://commit-delay-results-ssd-insert.staticloud.com/19/pg_settings.txt

Attached is a doc-patch that makes recommendations that are consistent
with my observations about what works best here. I'd like to see us
making *some* recommendation - for sympathetic cases, setting
commit_delay appropriately can make a very large difference to
transaction throughput. Such sympathetic cases - many small write
transactions - are something that tends to be seen relatively
frequently with web applications, that disproportionately use cloud
hosting. It isn't at all uncommon for these cases to be highly bound
by their commit rate, and so it is compelling to try to amortize the
cost of a flush as effectively as possible there. It would be
unfortunate if no one was even aware that commit_delay is now useful
for these cases, since the setting allows cloud hosting providers to
help these cases quite a bit, without having to do something like
compromise durability, which in general isn't acceptable.

The point of all these benchmarks isn't to show how effective
commit_delay now is, or can be - we already had that discussion months
ago, before the alteration to its behaviour was committed. The point
is to put my proposed doc changes in the appropriate context, so that
I can build confidence that they're balanced and helpful, by showing
cases that are not so sympathetic.

Thoughts?

[1]: http://archives.postgresql.org/pgsql-performance/2012-08/msg00003.php

[2]: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=82e429794b348cd80c1d1b011e21ffac98bc6e88

[3]: http://pgeoghegan.blogspot.com/2012/06/towards-14000-write-transactions-on-my.html

[4]: http://blog.2ndquadrant.com/intel_ssds_lifetime_and_the_32/

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

Attachments:

commit_delay_doc.2012_11_14.patch (application/octet-stream)
diff doc/src/sgml/wal.sgml
index fc5c3b2..92619dd
*** a/doc/src/sgml/wal.sgml
--- b/doc/src/sgml/wal.sgml
***************
*** 375,382 ****
     just before a synchronous commit attempts to flush
     <acronym>WAL</acronym> to disk, in the hope that a single flush
     executed by one such transaction can also serve other transactions
!    committing at about the same time.  Setting <varname>commit_delay</varname>
!    can only help when there are many concurrently committing transactions.
    </para>
  
   </sect1>
--- 375,397 ----
     just before a synchronous commit attempts to flush
     <acronym>WAL</acronym> to disk, in the hope that a single flush
     executed by one such transaction can also serve other transactions
!    committing at about the same time.  The setting can be thought of
!    as a way of increasing the time window in which other transactions
!    can join a backlog participating in a single flush, to amortize the
!    cost of the flush among multiple transactions.  When
!    <varname>commit_delay</varname> is set to zero (microseconds), the
!    default, the number of sessions that join this group commit backlog
!    is dictated entirely by the number of sessions that happen to reach
!    the stage where they need to flush, during the window in which the
!    previous, currently executing flush operation (if any) takes to
!    complete, which tends to only be effective at higher client counts
!    where the window is naturally wider.  Setting
!    <varname>commit_delay</varname> can only help when there are some
!    concurrently committing transactions, and throughput is limited to
!    some degree by commit rate, but in the event of a high rotational
!    latency the setting can be effective in increasing transaction
!    throughput with as few as two clients (that is, a single committing
!    client with one sibling transaction).
    </para>
  
   </sect1>
***************
*** 560,570 ****
     is not enabled, or if fewer than <xref linkend="guc-commit-siblings">
     other sessions are currently in active transactions; this avoids
     sleeping when it's unlikely that any other session will commit soon.
!    Note that on most platforms, the resolution of a sleep request is
     ten milliseconds, so that any nonzero <varname>commit_delay</varname>
     setting between 1 and 10000 microseconds would have the same effect.
!    Good values for these parameters are not yet clear; experimentation
!    is encouraged.
    </para>
  
    <para>
--- 575,607 ----
     is not enabled, or if fewer than <xref linkend="guc-commit-siblings">
     other sessions are currently in active transactions; this avoids
     sleeping when it's unlikely that any other session will commit soon.
!    Note that on some platforms, the resolution of a sleep request is
     ten milliseconds, so that any nonzero <varname>commit_delay</varname>
     setting between 1 and 10000 microseconds would have the same effect.
!   </para>
! 
!   <para>
!    Since the purpose of <varname>commit_delay</varname> is to allow
!    the cost of each flush operation to be more effectively amortized
!    across concurrently committing transactions (potentially at the
!    expense of transaction latency), it is necessary to quantify that
!    cost when altering the setting.  The higher that cost is, the more
!    effective <varname>commit_delay</varname> is expected to be in
!    increasing transaction throughput.  The
!    <xref linkend="pgtestfsync"> module can be used to measure the
!    average time in microseconds that a single WAL flush operation
!    takes.  A value of half of the average time the module reports it
!    takes to flush after a single 8kB write operation is often the most
!    effective setting for <varname>commit_delay</varname>.  The benefit
!    of tuning <varname>commit_delay</varname> can even be pronounced on
!    storage media with very fast sync times, such as solid-state drives
!    or RAID arrays with a battery-backed write cache.  However, higher
!    values of <varname>commit_siblings</varname> should be used is such
!    cases, whereas smaller <varname>commit_siblings</varname> values
!    can be helpful on higher latency media.  Note that it is quite
!    possible that a setting of <varname>commit_delay</varname> that is
!    too high can increase transaction latency by so much that total
!    transaction throughput suffers.
    </para>
  
    <para>
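As a toy illustration of the amortization argument in the new paragraph above (this model is mine, not the patch's; it crudely caps the group at the client count and ignores the latency cost to waiting clients):

```python
def group_commit_tps(clients, flush_usecs, delay_usecs, arrivals_per_usec):
    # Crude model: the leader sleeps for commit_delay, backends that
    # reach their commit during the sleep join its group (capped at
    # clients - 1 joiners), and one flush then serves the whole group.
    batch = min(clients, 1 + arrivals_per_usec * delay_usecs)
    cycle = delay_usecs + flush_usecs
    return batch / cycle * 1_000_000  # transactions per second
```

With an 8ms flush and roughly one arriving commit per millisecond, a moderate delay beats both no delay and an excessive one, which is the shape of the trade-off the patch text describes.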
#2 David Rowley
dgrowleyml@gmail.com
In reply to: Peter Geoghegan (#1)
Re: Doc patch making firm recommendation for setting the value of commit_delay

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-
owner@postgresql.org] On Behalf Of Peter Geoghegan
Sent: 15 November 2012 09:44
To: PG Hackers
Subject: [HACKERS] Doc patch making firm recommendation for setting the
value of commit_delay

Some of you will be aware that I've tried to formulate a good general
recommendation for setting the value of commit_delay, in light of the
improvements made in 9.3. I previously asked for help finding a
methodology for optimising commit_delay [1]. The documentation related
to commit_delay still says that we don't know what might work well,
though I don't think that's true any more.

I found the time to do some benchmarks of my own - Greg Smith made
available a server that he frequently uses for benchmarks. It was
previously my observation that "half of raw single-page sync time as
reported by pg_test_fsync for your wal_sync_method" worked best for
commit_delay. I went so far as to modify pg_test_fsync to output
average time per op in microseconds for each operation with
commit_delay in mind, which is a patch that has already been committed
[2]. This general approach worked really well on my laptop, which has
a slowish fsync of about 8 milliseconds (8,000 microseconds), for
which a commit_delay setting of 4,000 (microseconds) seemed to clearly
work best, on both insert and tpc-b benchmarks [3].

<snip>

Thoughts?

I think for sure, since the GUC maintained its original name, that the docs
need to be updated to let people know the background behaviour has changed
and may now be far more useful.
I've read through the patch. Only thing I see out of place is a small typo:

! values of <varname>commit_siblings</varname> should be used is such
cases,

Should probably read

! values of <varname>commit_siblings</varname> should be used *in* such
cases,

Thanks for doing all the benchmarks too. Good to see such a range of
different hardware tested.

Regards

David Rowley

#3 Albe Laurenz
laurenz.albe@wien.gv.at
In reply to: Peter Geoghegan (#1)
Re: Doc patch making firm recommendation for setting the value of commit_delay

Peter Geoghegan wrote:

Some of you will be aware that I've tried to formulate a good general
recommendation for setting the value of commit_delay, in light of the
improvements made in 9.3. I previously asked for help finding a
methodology for optimising commit_delay [1]. The documentation related
to commit_delay still says that we don't know what might work well,
though I don't think that's true any more.

I found the time to do some benchmarks of my own - Greg Smith made
available a server that he frequently uses for benchmarks. It was
previously my observation that "half of raw single-page sync time as
reported by pg_test_fsync for your wal_sync_method" worked best for
commit_delay. I went so far as to modify pg_test_fsync to output
average time per op in microseconds for each operation with
commit_delay in mind, which is a patch that has already been committed
[2].

[...]

Attached is a doc-patch that makes recommendations that are consistent
with my observations about what works best here. I'd like to see us
making *some* recommendation - for sympathetic cases, setting
commit_delay appropriately can make a very large difference to
transaction throughput. Such sympathetic cases - many small write
transactions - are something that tends to be seen relatively
frequently with web applications, that disproportionately use cloud
hosting. It isn't at all uncommon for these cases to be highly bound
by their commit rate, and so it is compelling to try to amortize the
cost of a flush as effectively as possible there. It would be
unfortunate if no one was even aware that commit_delay is now useful
for these cases, since the setting allows cloud hosting providers to
help these cases quite a bit, without having to do something like
compromise durability, which in general isn't acceptable.

The point of all these benchmarks isn't to show how effective
commit_delay now is, or can be - we already had that discussion months
ago, before the alteration to its behaviour was committed. The point
is to put my proposed doc changes in the appropriate context, so that
I can build confidence that they're balanced and helpful, by showing
cases that are not so sympathetic.

Thoughts?

If there is an agreement that half the sync time as reported by
pg_test_fsync is a good value, would it make sense to have initdb test
sync time and preset commit_delay?

Yours,
Laurenz Albe

#4 Greg Smith
greg@2ndQuadrant.com
In reply to: Albe Laurenz (#3)
Re: Doc patch making firm recommendation for setting the value of commit_delay

On 11/15/12 12:19 AM, Albe Laurenz wrote:

If there is an agreement that half the sync time as reported by
pg_test_fsync is a good value, would it make sense to have initdb test
sync time and preset commit_delay?

Peter has validated this against a good range of systems, but it would
be optimistic to say it's a universal idea.

My main concern with this would be the relatively common practice of
moving the pg_xlog directory after initdb time. Sometimes people don't
know about the option to have initdb move it. Sometimes the drive to
hold pg_xlog isn't available when the database is originally created
yet. And the camp I fall into (which admittedly is the group who can
take care of this on their own) will move pg_xlog manually and symlink
it on their own, because that's what we're used to.

I would rather see this just turn into one of the things a more general
tuning tool knew how to do, executing against a fully setup system.
Having a useful implementation of commit_delay and useful docs on it
seems like enough of a jump forward for one release. Moving fully into
auto-tuning before getting more field feedback on how that works out is
pretty aggressive.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com

#5 Craig Ringer
craig@2ndQuadrant.com
In reply to: Greg Smith (#4)
Re: Doc patch making firm recommendation for setting the value of commit_delay

On 11/15/2012 04:56 PM, Greg Smith wrote:

I would rather see this just turn into one of the things a more
general tuning tool knew how to do, executing against a fully setup
system. Having a useful implementation of commit_delay and useful docs
on it seems like enough of a jump forward for one release. Moving
fully into auto-tuning before getting more field feedback on how that
works out is pretty aggressive.

It'll also potentially make it harder to get reproducible results in
benchmarking and testing across repeated runs, cause confusion when
someone relocates a DB or changes hardware, and slow down initdb (and
thus testing).

I'd be all for making it part of a "test my hardware and tune my DB"
tool, but not such a fan of doing it at initdb time. Making initdb less
predictable and more complicated sounds like asking for trouble.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#6 Magnus Hagander
magnus@hagander.net
In reply to: Greg Smith (#4)
Re: Doc patch making firm recommendation for setting the value of commit_delay

On Thu, Nov 15, 2012 at 9:56 AM, Greg Smith <greg@2ndquadrant.com> wrote:

On 11/15/12 12:19 AM, Albe Laurenz wrote:

If there is an agreement that half the sync time as reported by
pg_test_fsync is a good value, would it make sense to have initdb test
sync time and preset commit_delay?

Peter has validated this against a good range of systems, but it would be
optimistic to say it's a universal idea.

My main concern with this would be the relatively common practice of moving
the pg_xlog directory after initdb time. Sometimes people don't know about
the option to have initdb move it. Sometimes the drive to hold pg_xlog
isn't available when the database is originally created yet. And the camp I
fall into (which admittedly is the group who can take care of this on their
own) will move pg_xlog manually and symlink it on their own, because that's
what we're used to.

An even more common use case for this, I think, is "I installed from a
package that ran initdb for me". I still think manual moving of
pg_xlog is a lot more common than using the initdb option in general.

I would rather see this just turn into one of the things a more general
tuning tool knew how to do, executing against a fully setup system. Having a
useful implementation of commit_delay and useful docs on it seems like
enough of a jump forward for one release. Moving fully into auto-tuning
before getting more field feedback on how that works out is pretty
aggressive.

+1.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#7 Peter Geoghegan
peter@2ndquadrant.com
In reply to: Magnus Hagander (#6)
Re: Doc patch making firm recommendation for setting the value of commit_delay

On 15 November 2012 10:04, Magnus Hagander <magnus@hagander.net> wrote:

I would rather see this just turn into one of the things a more general
tuning tool knew how to do, executing against a fully setup system. Having a
useful implementation of commit_delay and useful docs on it seems like
enough of a jump forward for one release. Moving fully into auto-tuning
before getting more field feedback on how that works out is pretty
aggressive.

+1.

I am inclined to agree. I did attempt to use the instrumentation
macros to have commit_delay set adaptively at runtime, which would
have at least addressed this concern, but that just didn't work as
well as this.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#8 Peter Geoghegan
peter@2ndquadrant.com
In reply to: Greg Smith (#4)
Re: Doc patch making firm recommendation for setting the value of commit_delay

On 15 November 2012 08:56, Greg Smith <greg@2ndquadrant.com> wrote:

My main concern with this would be the relatively common practice of moving
the pg_xlog directory after initdb time.

I probably should have increased the default number of seconds that
pg_test_fsync runs for, in light of the fact that that can make an
appreciable difference when following this method. I'd suggest that a
value of 5 be used there. I think we should just change that, since
everyone will just use the default anyway (or, the more cautious ones
will use a higher value than the default, if anything).

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#9 Stephen Frost
sfrost@snowman.net
In reply to: Peter Geoghegan (#7)
Re: Doc patch making firm recommendation for setting the value of commit_delay

* Peter Geoghegan (peter@2ndquadrant.com) wrote:

I am inclined to agree. I did attempt to use the instrumentation
macros to have commit_delay set adaptively at runtime, which would
have at least addressed this concern, but that just didn't work as
well as this.

Setting it at run-time was actually my first thought on this. I'm
disappointed to hear that it didn't work out well.

Thanks,

Stephen

#10 Peter Geoghegan
peter@2ndquadrant.com
In reply to: Stephen Frost (#9)
Re: Doc patch making firm recommendation for setting the value of commit_delay

On 16 November 2012 01:20, Stephen Frost <sfrost@snowman.net> wrote:

Setting it at run-time was actually my first thought on this. I'm
disappointed to hear that it didn't work out well.

I actually did quite a lot of research on it, which was interesting
but ultimately didn't lead to any breakthroughs. There were a couple
of research papers written on this subject in the late 80s and early 90s
that are cited in the transaction processing literature relatively
frequently, but the models involved aren't terribly applicable to
modern hardware.

This is the main paper:

http://www.hpl.hp.com/techreports/tandem/TR-88.1.pdf

This later 1991 paper reads kind of like a critique of the 1988 paper:

http://www.linux-mips.org/pub/linux/mips/people/macro/DEC/DTJ/DTJ107/DTJ107PF.PDF

There doesn't appear to be much of anything in the public domain after
this. I suppose that I could have spent more time pursuing this, but I
very much had the sense of pursuing diminishing returns. If the whole
idea of making the thing adaptive is to make it more accessible to
users, well, they'll still have to turn it on themselves (presumably
it will never be on by default), which is only slightly less effort
than following my advice here.

I suspect my tools - in particular, pgbench - just aren't sophisticated
enough to evaluate the effectiveness of any particular model. I might
have used something like Tsung, which models workloads stochastically,
instead. However, I just didn't have the time.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#11 Peter Geoghegan
peter@2ndquadrant.com
In reply to: Peter Geoghegan (#7)
Re: Doc patch making firm recommendation for setting the value of commit_delay

I've added this to the open commitfest. I think that a formal review
is probably required here.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#12 Noah Misch
noah@leadboat.com
In reply to: Peter Geoghegan (#1)
Re: Doc patch making firm recommendation for setting the value of commit_delay

On Wed, Nov 14, 2012 at 08:44:26PM +0000, Peter Geoghegan wrote:

http://commit-delay-results-ssd-insert.staticloud.com
http://commit-delay-stripe-insert.staticloud.com
http://commit-delay-results-stripe-tpcb.staticloud.com
http://commit-delay-results-ssd-insert.staticloud.com/19/pg_settings.txt

staticloud.com seems to be gone. Would you repost these?

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13 Peter Geoghegan
peter@2ndquadrant.com
In reply to: Noah Misch (#12)
Re: Doc patch making firm recommendation for setting the value of commit_delay

On 19 January 2013 20:38, Noah Misch <noah@leadboat.com> wrote:

staticloud.com seems to be gone. Would you repost these?

I've pushed these to a git repo, hosted on github.

https://github.com/petergeoghegan/commit_delay_benchmarks

I'm sorry that I didn't take the time to make the html benchmarks
easily viewable within a browser on this occasion.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


#14 Noah Misch
noah@leadboat.com
In reply to: Peter Geoghegan (#13)
Re: Doc patch making firm recommendation for setting the value of commit_delay

On Mon, Jan 21, 2013 at 12:23:21AM +0000, Peter Geoghegan wrote:

On 19 January 2013 20:38, Noah Misch <noah@leadboat.com> wrote:

staticloud.com seems to be gone. Would you repost these?

I've pushed these to a git repo, hosted on github.

https://github.com/petergeoghegan/commit_delay_benchmarks

I'm sorry that I didn't take the time to make the html benchmarks
easily viewable within a browser on this occasion.

That's plenty convenient; thanks.

What filesystem did you use for testing? Would you also provide /proc/cpuinfo
or a rough description of the system's CPUs?


#15 Peter Geoghegan
peter@2ndquadrant.com
In reply to: Noah Misch (#14)
Re: Doc patch making firm recommendation for setting the value of commit_delay

On 21 January 2013 13:10, Noah Misch <noah@leadboat.com> wrote:

What filesystem did you use for testing? Would you also provide /proc/cpuinfo
or a rough description of the system's CPUs?

Unfortunately, I don't have access to that server at the moment. It's
under Greg Smith's control. I believe you yourself had an account on
this server at one point. I do know that the CPU is an Intel Core
i7-870:

http://ark.intel.com/products/41315/Intel-Core-i7-870-Processor-8M-Cache-2_93-GHz

I am pretty sure that the filesystem that the various block devices
were mounted with was ext4 (without LVM), but it might have been XFS.
I don't recall. The operating system was Debian Lenny.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


#16 Noah Misch
noah@leadboat.com
In reply to: Peter Geoghegan (#1)
Re: Doc patch making firm recommendation for setting the value of commit_delay

Hi Peter,

I took a look at this patch and the benchmarks you've furnished:

https://github.com/petergeoghegan/commit_delay_benchmarks
http://pgeoghegan.blogspot.com/2012/06/towards-14000-write-transactions-on-my.html

On Wed, Nov 14, 2012 at 08:44:26PM +0000, Peter Geoghegan wrote:

Attached is a doc-patch that makes recommendations that are consistent
with my observations about what works best here. I'd like to see us
making *some* recommendation - for sympathetic cases, setting
commit_delay appropriately can make a very large difference to
transaction throughput. Such sympathetic cases - many small write
transactions - are something that tends to be seen relatively
frequently with web applications, that disproportionately use cloud
hosting. It isn't at all uncommon for these cases to be highly bound
by their commit rate, and so it is compelling to try to amortize the
cost of a flush as effectively as possible there. It would be
unfortunate if no one was even aware that commit_delay is now useful
for these cases, since the setting allows cloud hosting providers to
help these cases quite a bit, without having to do something like
compromise durability, which in general isn't acceptable.

Your fast-fsync (SSD, BBWC) benchmarks show a small loss up to 8 clients and a
10-20% improvement at 32 clients. That's on a 4-core/8-thread CPU, assuming
HT was left enabled. Your slow-fsync (laptop) benchmarks show a 40-100%
improvement in the 16-64 client range.

I did a few more benchmarks along the spectrum. First, I used a Mac, also
4-core/8-thread, with fsync_writethrough; half of fsync time gave commit_delay
= 35000. I used pgbench, scale factor 100, 4-minute runs, three trials each:

-clients-  -tps@commit_delay=0-  -tps@commit_delay=35000-
        8              51,55,63                  82,84,86
       16            98,100,107               130,134,143
       32           137,148,157               192,200,201
       64           199,201,214               249,256,258

So that's a nice 27-53% improvement, fairly similar to the pattern for your
laptop pgbench numbers. Next, based on your comment about the possible value
for cloud-hosted applications, I tried a cc2.8xlarge (16 core, 32 thread),
GNU/Linux EC2 instance with a data directory on a standard EBS volume, ext4
filesystem. Several 15s pg_test_fsync runs could not agree on an fsync time;
I saw results from 694us to 1904us. Ultimately I settled on trying
commit_delay=500, scale factor 300:

-clients-  -tps@commit_delay=0-  -tps@commit_delay=500-
       32        1224,1391,1584          1175,1229,1394
       64        1553,1647,1673          1544,1546,1632
      128        1717,1833,1900          1621,1720,1951
      256        1664,1717,1918          1734,1832,1918

The numbers are all over the place, but there's more loss than gain. Amit
Kapila also measured small losses in tps@-c16:

/messages/by-id/000701cd6ff0$013a6210$03af2630$@kapila@huawei.com

I was curious about the cost of the MinimumActiveBackends() call when relying
on commit_siblings to skip the delay. I ran a similar test with an extra 500
idle backends, clients=8, commit_siblings=20 (so the delay would never be
used), and either a zero or nonzero commit_delay. There was no appreciable
performance advantage from setting commit_delay=0 as opposed to relying on
commit_siblings to suppress the delay. That's good news.

On the GNU/Linux VM, pg_sleep() achieves precision on the order of 10us.
However, the sleep was consistently around 70us longer than requested. A
300us request yielded a 370us sleep, and a 3000us request gave a 3080us sleep.
Mac OS X was similarly precise for short sleeps, but it could oversleep a full
1000us on a 35000us sleep.
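This kind of oversleep measurement is easy to reproduce; here is a sketch of my own using Python's monotonic timer (not the tool used above), for anyone curious about their platform's sleep granularity:

```python
import time

def average_oversleep_usecs(request_usecs, trials=20):
    # Sleep for the requested number of microseconds repeatedly and
    # report how far beyond the request the OS actually slept, on
    # average.  CPython's time.sleep() never returns early, so the
    # result should be non-negative.
    total = 0.0
    for _ in range(trials):
        start = time.perf_counter()
        time.sleep(request_usecs / 1_000_000)
        elapsed_usecs = (time.perf_counter() - start) * 1_000_000
        total += elapsed_usecs - request_usecs
    return total / trials
```

On a platform with the coarse 10ms granularity discussed earlier, a 300us request would report an overshoot of several thousand microseconds rather than tens.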

diff doc/src/sgml/wal.sgml
index fc5c3b2..92619dd
*** a/doc/src/sgml/wal.sgml
--- b/doc/src/sgml/wal.sgml
***************
*** 375,382 ****
just before a synchronous commit attempts to flush
<acronym>WAL</acronym> to disk, in the hope that a single flush
executed by one such transaction can also serve other transactions
!    committing at about the same time.  Setting <varname>commit_delay</varname>
!    can only help when there are many concurrently committing transactions.
</para>
</sect1>
--- 375,397 ----

The beginning of this paragraph still says "commit_delay causes a delay just
before a synchronous commit attempts to flush WAL to disk". Since it now
applies to every WAL flush, that should be updated.

***************
*** 560,570 ****
is not enabled, or if fewer than <xref linkend="guc-commit-siblings">
other sessions are currently in active transactions; this avoids
sleeping when it's unlikely that any other session will commit soon.
! Note that on most platforms, the resolution of a sleep request is
ten milliseconds, so that any nonzero <varname>commit_delay</varname>
setting between 1 and 10000 microseconds would have the same effect.
! Good values for these parameters are not yet clear; experimentation
! is encouraged.
</para>

<para>
--- 575,607 ----

There's a similar problem at the beginning of this paragraph; it says
specifically, "The commit_delay parameter defines for how many microseconds
the server process will sleep after writing a commit record to the log with
LogInsert but before performing a LogFlush."

is not enabled, or if fewer than <xref linkend="guc-commit-siblings">
other sessions are currently in active transactions; this avoids
sleeping when it's unlikely that any other session will commit soon.
! Note that on some platforms, the resolution of a sleep request is
ten milliseconds, so that any nonzero <varname>commit_delay</varname>
setting between 1 and 10000 microseconds would have the same effect.
! </para>

As a side note, if we're ever going to recommend a fire-and-forget method for
setting commit_delay, it may be worth detecting whether the host sleep
granularity is limited like this. Setting commit_delay = 20 for your SSD and
silently getting commit_delay = 10000 would make for an unpleasant surprise.

!
! <para>
! Since the purpose of <varname>commit_delay</varname> is to allow
! the cost of each flush operation to be more effectively amortized
! across concurrently committing transactions (potentially at the
! expense of transaction latency), it is necessary to quantify that
! cost when altering the setting. The higher that cost is, the more
! effective <varname>commit_delay</varname> is expected to be in
! increasing transaction throughput. The

That's true for spinning disks, but I suspect it does not hold for storage
with internal parallelism, notably virtualized storage. Consider an iSCSI
configuration with high bandwidth and high latency. When network latency is
the limiting factor, will sending larger requests less often still help?

! <xref linkend="pgtestfsync"> module can be used to measure the
! average time in microseconds that a single WAL flush operation
! takes. A value of half of the average time the module reports it
! takes to flush after a single 8kB write operation is often the most
! effective setting for <varname>commit_delay</varname>. The benefit
! of tuning <varname>commit_delay</varname> can even be pronounced on
! storage media with very fast sync times, such as solid-state drives
! or RAID arrays with a battery-backed write cache. However, higher
! values of <varname>commit_siblings</varname> should be used is such

Typo: "is such" -> "in such"

! cases, whereas smaller <varname>commit_siblings</varname> values
! can be helpful on higher latency media. Note that it is quite
! possible that a setting of <varname>commit_delay</varname> that is
! too high can increase transaction latency by so much that total
! transaction throughput suffers.
</para>

One would be foolish to run a performance-sensitive workload like those in
question, including the choice to have synchronous_commit=on, on spinning
disks with no battery-backed write cache. A cloud environment is more
credible, but my benchmark showed no gain there. Overall, I still won't
personally recommend changing commit_delay without measuring the performance
change for one's particular workload and storage environment. commit_delay
can now bring some impressive gains in the right situations, but I doubt those
are common enough for a fire-and-forget setting to do more good than harm.

I suggest having the documentation recommend half of the fsync time as a
starting point for benchmarking different commit_delay settings against your
own workload. Indicate that it's more likely to help for direct use of
spinning disks than for BBWC/solid state/virtualized storage. Not sure what
else can be credibly given as general advice for PostgreSQL DBAs.

Thanks,
nm

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17 Peter Geoghegan
peter.geoghegan86@gmail.com
In reply to: Noah Misch (#16)
1 attachment(s)
Re: Re: Doc patch making firm recommendation for setting the value of commit_delay

Hi Noah,

On 27 January 2013 02:31, Noah Misch <noah@leadboat.com> wrote:

I did a few more benchmarks along the spectrum.

So that's a nice 27-53% improvement, fairly similar to the pattern for your
laptop pgbench numbers.

I presume that this applies to a tpc-b benchmark (the pgbench
default). Note that the really compelling numbers that I reported in
that blog post (where there is an increase of over 80% in transaction
throughput at lower client counts) occur with an insert-based
benchmark (i.e. a maximally commit-bound workload).

Next, based on your comment about the possible value
for cloud-hosted applications

-clients- -tps@commit_delay=0- -tps@commit_delay=500-
32 1224,1391,1584 1175,1229,1394
64 1553,1647,1673 1544,1546,1632
128 1717,1833,1900 1621,1720,1951
256 1664,1717,1918 1734,1832,1918

The numbers are all over the place, but there's more loss than gain.

I suspected that the latency of cloud storage might be relatively
poor. Since that is evidently not actually the case with Amazon EBS,
it makes sense that commit_delay isn't compelling there. I am not
disputing that Amazon EBS should be considered representative of
such systems in general - I'm sure that it should be.

There was no appreciable
performance advantage from setting commit_delay=0 as opposed to relying on
commit_siblings to suppress the delay. That's good news.

Thank you for doing that research; I had verified myself that the
fastpath in MinimumActiveBackends() works well, but it's useful to
have my findings confirmed independently.

On the GNU/Linux VM, pg_sleep() achieves precision on the order of 10us.
However, the sleep was consistently around 70us longer than requested. A
300us request yielded a 370us sleep, and a 3000us request gave a 3080us sleep.
Mac OS X was similarly precise for short sleeps, but it could oversleep a full
1000us on a 35000us sleep.

Ugh.

The beginning of this paragraph stills says "commit_delay causes a delay just
before a synchronous commit attempts to flush WAL to disk". Since it now
applies to every WAL flush, that should be updated.

Agreed.

There's a similar problem at the beginning of this paragraph; it says
specifically, "The commit_delay parameter defines for how many microseconds
the server process will sleep after writing a commit record to the log with
LogInsert but before performing a LogFlush."

Right.

As a side note, if we're ever going to recommend a fire-and-forget method for
setting commit_delay, it may be worth detecting whether the host sleep
granularity is limited like this. Setting commit_delay = 20 for your SSD and
silently getting commit_delay = 10000 would make for an unpleasant surprise.

Yes, it would. Note on possible oversleeping added.

! <para>
! Since the purpose of <varname>commit_delay</varname> is to allow
! the cost of each flush operation to be more effectively amortized
! across concurrently committing transactions (potentially at the
! expense of transaction latency), it is necessary to quantify that
! cost when altering the setting. The higher that cost is, the more
! effective <varname>commit_delay</varname> is expected to be in
! increasing transaction throughput. The

That's true for spinning disks, but I suspect it does not hold for storage
with internal parallelism, notably virtualized storage. Consider an iSCSI
configuration with high bandwidth and high latency. When network latency is
the limiting factor, will sending larger requests less often still help?

Well, I don't like to speculate about things like that, because it's
just too easy to be wrong. That said, it doesn't immediately occur to
me why the statement that you've highlighted wouldn't be true of
virtualised storage that has the characteristics you describe. Any
kind of latency at flush time means that clients sit idle, which
means that the CPU is left less than fully busy for a longer stretch
of wall time than it otherwise would be.

One would be foolish to run a performance-sensitive workload like those in
question, including the choice to have synchronous_commit=on, on spinning
disks with no battery-backed write cache. A cloud environment is more
credible, but my benchmark showed no gain there.

In an everyday sense you are correct. It would typically be fairly
senseless to run an application that was severely limited by
transaction throughput like this, when a battery-backed cache could be
used at the cost of a couple of hundred dollars. However, it's quite
possible to imagine a scenario in which the economics favoured using
commit_delay instead. For example, I am aware that at Facebook, a
similar Facebook-flavoured-MySQL setting (sync_binlog_timeout_usecs)
is used. Furthermore, it might not be obvious that fsync speed is an
issue in practice. Setting commit_delay to 4,000 has seemingly no
downside on my laptop - it *positively* affects both average and
worst-case transaction latency - so with spinning disks, it probably
would actually be sensible to set it and forget it, regardless of
workload.

When Robert committed this feature, he added an additional check when
WALWriteLock is acquired, that could see the lock acquired in a way
that turned out to be needless, but also prevented a flush that was
technically needless from the group commit leader/lock holder
backend's own selfish perspective. I never got around to satisfying
myself that that change helped more than it hurt, if in fact it had
any measurable impact either way. Perhaps I should. The benchmark that
appears on my blog was actually produced with the slightly different,
original version.

Overall, I still won't
personally recommend changing commit_delay without measuring the performance
change for one's particular workload and storage environment. commit_delay
can now bring some impressive gains in the right situations, but I doubt those
are common enough for a fire-and-forget setting to do more good than harm.

I agree.

I suggest having the documentation recommend half of the fsync time as a
starting point for benchmarking different commit_delay settings against your
own workload. Indicate that it's more likely to help for direct use of
spinning disks than for BBWC/solid state/virtualized storage. Not sure what
else can be credibly given as general advice for PostgreSQL DBAs.

That all seems reasonable. The really important thing is that we don't
state that we don't have a clue what helps - that inspires no
confidence, could turn someone off what would be a really useful
feature for them, and just isn't accurate. I also think it's important
that we don't say "Setting commit_delay can only help when there are
many concurrently committing transactions", because roughly the
opposite is true. With many connections, there are already enough
committing transactions to effectively amortize the cost of a flush,
and commit_delay is then only very slightly helpful. Lower client
counts are where commit_delay actually helps (at least with slow fsync
times, which are the compelling case).
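To make the arithmetic concrete: on a spinning-disk machine where pg_test_fsync reports roughly 8,000 microseconds per flush (like the laptop mentioned above), the rule of thumb works out to settings along these lines (values illustrative only):

```
# postgresql.conf -- illustrative starting point, assuming pg_test_fsync
# reports ~8000 us for a single 8kB write + flush
commit_delay = 4000        # half the measured flush time, in microseconds
commit_siblings = 5        # default; consider lowering on high-latency media
```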

I attach a revision that I think addresses your concerns. I've
polished it a bit further too - in particular, my elaborations about
commit_delay have been concentrated at the end of wal.sgml, where they
belong. I've also removed the reference to XLogInsert, because, since
all XLogFlush call sites are now covered by commit_delay, XLogInsert
isn't particularly relevant.

I have also increased the default time that pg_test_fsync runs - I
think that the kind of variability commonly seen in its output, which
you yourself have reported, justifies doing so in passing.

--
Regards,
Peter Geoghegan

Attachments:

commit_delay_doc.2013_01_28.patch (application/octet-stream)
diff contrib/pg_test_fsync/pg_test_fsync.c
index ec4b90c..5ee0398
*** a/contrib/pg_test_fsync/pg_test_fsync.c
--- b/contrib/pg_test_fsync/pg_test_fsync.c
*************** do { \
*** 60,66 ****
  
  static const char *progname;
  
! static int	secs_per_test = 2;
  static int	needs_unlink = 0;
  static char full_buf[XLOG_SEG_SIZE],
  		   *buf,
--- 60,66 ----
  
  static const char *progname;
  
! static int	secs_per_test = 5;
  static int	needs_unlink = 0;
  static char full_buf[XLOG_SEG_SIZE],
  		   *buf,
diff doc/src/sgml/pgtestfsync.sgml
index 00ef209..ef4cd3d
*** a/doc/src/sgml/pgtestfsync.sgml
--- b/doc/src/sgml/pgtestfsync.sgml
***************
*** 72,79 ****
         <para>
          Specifies the number of seconds for each test.  The more time
          per test, the greater the test's accuracy, but the longer it takes
!         to run.  The default is 2 seconds, which allows the program to
!         complete in about 30 seconds.
         </para>
        </listitem>
       </varlistentry>
--- 72,79 ----
         <para>
          Specifies the number of seconds for each test.  The more time
          per test, the greater the test's accuracy, but the longer it takes
!         to run.  The default is 5 seconds, which allows the program to
!         complete in under 1 minute 30 seconds.
         </para>
        </listitem>
       </varlistentry>
diff doc/src/sgml/wal.sgml
index fc5c3b2..a3e5a4c
*** a/doc/src/sgml/wal.sgml
--- b/doc/src/sgml/wal.sgml
***************
*** 372,382 ****
     asynchronous commit, but it is actually a synchronous commit method
     (in fact, <varname>commit_delay</varname> is ignored during an
     asynchronous commit).  <varname>commit_delay</varname> causes a delay
!    just before a synchronous commit attempts to flush
!    <acronym>WAL</acronym> to disk, in the hope that a single flush
!    executed by one such transaction can also serve other transactions
!    committing at about the same time.  Setting <varname>commit_delay</varname>
!    can only help when there are many concurrently committing transactions.
    </para>
  
   </sect1>
--- 372,384 ----
     asynchronous commit, but it is actually a synchronous commit method
     (in fact, <varname>commit_delay</varname> is ignored during an
     asynchronous commit).  <varname>commit_delay</varname> causes a delay
!    just before a <acronym>WAL</acronym> flush to disk, in the hope
!    that a single flush executed by one such transaction can also serve
!    other transactions committing at about the same time.  The setting
!    can be thought of as a way of increasing the time window in which
!    other transactions can join a backlog about to participate in a
!    single flush, to amortize the cost of the flush among multiple
!    transactions.
    </para>
  
   </sect1>
***************
*** 550,570 ****
    </para>
  
    <para>
!    The <xref linkend="guc-commit-delay"> parameter defines for how many
!    microseconds the server process will sleep after writing a commit
!    record to the log with <function>LogInsert</function> but before
!    performing a <function>LogFlush</function>. This delay allows other
!    server processes to add their commit records to the log so as to have all
!    of them flushed with a single log sync. No sleep will occur if
!    <xref linkend="guc-fsync">
!    is not enabled, or if fewer than <xref linkend="guc-commit-siblings">
!    other sessions are currently in active transactions; this avoids
!    sleeping when it's unlikely that any other session will commit soon.
!    Note that on most platforms, the resolution of a sleep request is
!    ten milliseconds, so that any nonzero <varname>commit_delay</varname>
!    setting between 1 and 10000 microseconds would have the same effect.
!    Good values for these parameters are not yet clear; experimentation
!    is encouraged.
    </para>
  
    <para>
--- 552,618 ----
    </para>
  
    <para>
!    The <xref linkend="guc-commit-delay"> parameter defines for how
!    many microseconds the group commit leader server process will sleep
!    after acquiring a lock within <function>LogFlush</function>, as
!    group commit followers queue up behind the leader.  This delay
!    allows other server processes to join the backlog and add their
!    commit records to the log so as to have all of them flushed with a
!    single log sync. No sleep will occur if <xref linkend="guc-fsync">
!    is not enabled, or if fewer than
!    <xref linkend="guc-commit-siblings"> other sessions are currently
!    in active transactions; this avoids sleeping when it's unlikely
!    that any other session will commit soon.  Note that on some
!    platforms, the resolution of a sleep request is ten milliseconds,
!    so that any nonzero <varname>commit_delay</varname> setting between
!    1 and 10000 microseconds would have the same effect.  Note also
!    that on some platforms, sleep operations may take slightly longer
!    than requested by the implementation.
!   </para>
! 
!   <para>
!    Since the purpose of <varname>commit_delay</varname> is to allow
!    the cost of each flush operation to be more effectively amortized
!    across concurrently committing transactions (potentially at the
!    expense of transaction latency), it is necessary to quantify that
!    cost when altering the setting.  The higher that cost is, the more
!    effective <varname>commit_delay</varname> is expected to be in
!    increasing transaction throughput, up to a point.  The
!    <xref linkend="pgtestfsync"> module can be used to measure the
!    average time in microseconds that a single WAL flush operation
!    takes.  A value of half of the average time the module reports it
!    takes to flush after a single 8kB write operation is often the most
!    effective setting for <varname>commit_delay</varname>, particularly
!    for high latency rotating disks, and this should be used as a
!    starting point when optimizing for a particular workload.  The
!    benefit of tuning <varname>commit_delay</varname> can even be
!    pronounced on storage media with very fast sync times, such as
!    solid-state drives or RAID arrays with a battery-backed write
!    cache, but this should definitely be tested against a
!    representative workload.  Higher values of
!    <varname>commit_siblings</varname> should be used in such cases,
!    whereas smaller <varname>commit_siblings</varname> values are often
!    helpful on higher latency media.  Note that it is quite possible
!    that a setting of <varname>commit_delay</varname> that is too high
!    can increase transaction latency by so much that total transaction
!    throughput suffers.
!   </para>
! 
!   <para>
!    When <varname>commit_delay</varname> is set to zero (microseconds),
!    the default, the number of sessions that join this group commit
!    backlog is dictated entirely by the number of sessions that happen
!    to reach the stage where they need to flush, during the window in
!    which the previous, currently executing flush operation (if any)
!    takes to complete.  At higher client counts where the window is
!    naturally wider, a "gangway effect" tends to occur, and
!    <varname>commit_delay</varname> tends to help less.  Setting
!    <varname>commit_delay</varname> can only help when there are some
!    concurrently committing transactions, and throughput is limited to
!    some degree by commit rate, but with high rotational latency the
!    setting can be effective in increasing transaction throughput with
!    as few as two clients (that is, a single committing client with one
!    sibling transaction).
    </para>
  
    <para>
#18 Noah Misch
noah@leadboat.com
In reply to: Peter Geoghegan (#17)
Re: Re: Doc patch making firm recommendation for setting the value of commit_delay

On Mon, Jan 28, 2013 at 12:16:24AM +0000, Peter Geoghegan wrote:

On 27 January 2013 02:31, Noah Misch <noah@leadboat.com> wrote:

I did a few more benchmarks along the spectrum.

So that's a nice 27-53% improvement, fairly similar to the pattern for your
laptop pgbench numbers.

I presume that this applies to a tpc-b benchmark (the pgbench
default). Note that the really compelling numbers that I reported in
that blog post (where there is an increase of over 80% in transaction
throughput at lower client counts) occur with an insert-based
benchmark (i.e. a maximally commit-bound workload).

Correct. The pgbench default workload is already rather friendly toward
commit_delay, so I wanted to stay away from even-friendlier tests.

Would you commit to the same git repository the pgbench-tools data for the
graphs appearing in that blog post? I couldn't readily tell what was
happening below 16 clients due to the graphed data points blending together.

! <para>
! Since the purpose of <varname>commit_delay</varname> is to allow
! the cost of each flush operation to be more effectively amortized
! across concurrently committing transactions (potentially at the
! expense of transaction latency), it is necessary to quantify that
! cost when altering the setting. The higher that cost is, the more
! effective <varname>commit_delay</varname> is expected to be in
! increasing transaction throughput. The

That's true for spinning disks, but I suspect it does not hold for storage
with internal parallelism, notably virtualized storage. Consider an iSCSI
configuration with high bandwidth and high latency. When network latency is
the limiting factor, will sending larger requests less often still help?

Well, I don't like to speculate about things like that, because it's
just too easy to be wrong. That said, it doesn't immediately occur to
me why the statement that you've highlighted wouldn't be true of
virtualised storage that has the characteristics you describe. Any
kind of latency at flush time means that clients sit idle, which
means that the CPU is left less than fully busy for a longer stretch
of wall time than it otherwise would be.

On further reflection, I retract the comment. Regardless of internal
parallelism of the storage, PostgreSQL issues WAL fsyncs serially.

One would be foolish to run a performance-sensitive workload like those in
question, including the choice to have synchronous_commit=on, on spinning
disks with no battery-backed write cache. A cloud environment is more
credible, but my benchmark showed no gain there.

In an everyday sense you are correct. It would typically be fairly
senseless to run an application that was severely limited by
transaction throughput like this, when a battery-backed cache could be
used at the cost of a couple of hundred dollars. However, it's quite
possible to imagine a scenario in which the economics favoured using
commit_delay instead. For example, I am aware that at Facebook, a
similar Facebook-flavoured-MySQL setting (sync_binlog_timeout_usecs)
is used. Furthermore, it might not be obvious that fsync speed is an
issue in practice. Setting commit_delay to 4,000 has seemingly no
downside on my laptop - it *positively* affects both average and
worst-case transaction latency - so with spinning disks, it probably
would actually be sensible to set it and forget it, regardless of
workload.

I agree that commit_delay is looking like a safe bet for spinning disks.

I attach a revision that I think addresses your concerns. I've
polished it a bit further too - in particular, my elaborations about
commit_delay have been concentrated at the end of wal.sgml, where they
belong. I've also removed the reference to XLogInsert, because, since
all XLogFlush call sites are now covered by commit_delay, XLogInsert
isn't particularly relevant.

I'm happy with this formulation.

I have also increased the default time that pg_test_fsync runs - I
think that the kind of variability commonly seen in its output, which
you yourself have reported, justifies doing so in passing.

On the EBS configuration with volatile fsync timings, the variability didn't
go away with 15s runs. On systems with stable fsync times, 15s was no better
than 2s. Absent some particular reason to believe 5s is better than 2s, I
would leave it alone.

I'm marking this patch Ready for Committer, qualified with a recommendation to
adopt only the wal.sgml changes.

Thanks,
nm


#19 Peter Geoghegan
peter.geoghegan86@gmail.com
In reply to: Noah Misch (#18)
Re: Re: Doc patch making firm recommendation for setting the value of commit_delay

On 28 January 2013 03:34, Noah Misch <noah@leadboat.com> wrote:

On the EBS configuration with volatile fsync timings, the variability didn't
go away with 15s runs. On systems with stable fsync times, 15s was no better
than 2s. Absent some particular reason to believe 5s is better than 2s, I
would leave it alone.

I'm not recommending doing so because I thought you'd be likely to get
better numbers on EBS; obviously the variability you saw there likely
had a lot to do with the fact that the underlying physical machines
have multiple tenants. It has just been my observation that more
consistent figures can be obtained (on my laptop) by using a
pg_test_fsync --secs-per-test of about 5. That being the case, why
take the chance with 2 seconds? It isn't as if people run
pg_test_fsync everyday, or that they cannot set --secs-per-test to
whatever they like themselves. On the other hand, the cost of setting
it too low could be quite high now, because the absolute values (and
not just how different wal_sync_methods compare) are now important.

--
Regards,
Peter Geoghegan


#20 Peter Geoghegan
peter.geoghegan86@gmail.com
In reply to: Noah Misch (#18)
Re: Re: Doc patch making firm recommendation for setting the value of commit_delay

On 28 January 2013 03:34, Noah Misch <noah@leadboat.com> wrote:

Would you commit to the same git repository the pgbench-tools data for the
graphs appearing in that blog post? I couldn't readily tell what was
happening below 16 clients due to the graphed data points blending together.

I'm afraid that I no longer have that data. Of course, I could fairly
easily recreate it, but I don't think I'll have time tomorrow. Is it
important? Are you interested in both the insert and tpc-b cases?

--
Regards,
Peter Geoghegan


#21 Noah Misch
noah@leadboat.com
In reply to: Peter Geoghegan (#20)
Re: Re: Doc patch making firm recommendation for setting the value of commit_delay

On Mon, Jan 28, 2013 at 04:48:56AM +0000, Peter Geoghegan wrote:

On 28 January 2013 03:34, Noah Misch <noah@leadboat.com> wrote:

Would you commit to the same git repository the pgbench-tools data for the
graphs appearing in that blog post? I couldn't readily tell what was
happening below 16 clients due to the graphed data points blending together.

I'm afraid that I no longer have that data. Of course, I could fairly
easily recreate it, but I don't think I'll have time tomorrow. Is it
important? Are you interested in both the insert and tpc-b cases?

No need, then.


#22 Noah Misch
noah@leadboat.com
In reply to: Peter Geoghegan (#19)
Re: Re: Doc patch making firm recommendation for setting the value of commit_delay

On Mon, Jan 28, 2013 at 04:29:12AM +0000, Peter Geoghegan wrote:

On 28 January 2013 03:34, Noah Misch <noah@leadboat.com> wrote:

On the EBS configuration with volatile fsync timings, the variability didn't
go away with 15s runs. On systems with stable fsync times, 15s was no better
than 2s. Absent some particular reason to believe 5s is better than 2s, I
would leave it alone.

I'm not recommending doing so because I thought you'd be likely to get
better numbers on EBS; obviously the variability you saw there likely
had a lot to do with the fact that the underlying physical machines
have multiple tenants. It has just been my observation that more
consistent figures can be obtained (on my laptop) by using a
pg_test_fsync --secs-per-test of about 5. That being the case, why
take the chance with 2 seconds?

I can't get too excited about it either way.

It isn't as if people run
pg_test_fsync everyday, or that they cannot set --secs-per-test to
whatever they like themselves. On the other hand, the cost of setting
it too low could be quite high now, because the absolute values (and
not just how different wal_sync_methods compare) are now important.

True. You'd actually want to run the tool with a short interval to select a
wal_sync_method, then test the chosen method for a longer period to get an
accurate reading for commit_delay.


#23 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noah Misch (#18)
Re: Re: Doc patch making firm recommendation for setting the value of commit_delay

Noah Misch <noah@leadboat.com> writes:

I'm marking this patch Ready for Committer, qualified with a recommendation to
adopt only the wal.sgml changes.

I've committed this along with some further wordsmithing. I kept
Peter's change to pg_test_fsync's default -s value; I've always felt
that 2 seconds was laughably small. It might be all right for very
quick-and-dirty tests, but as a default value, it seems like a poor
choice, because it's at the very bottom of the credible range of
choices.

regards, tom lane


#24 Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#23)
Re: Re: Doc patch making firm recommendation for setting the value of commit_delay

On Fri, Mar 15, 2013 at 05:47:30PM -0400, Tom Lane wrote:

Noah Misch <noah@leadboat.com> writes:

I'm marking this patch Ready for Committer, qualified with a recommendation to
adopt only the wal.sgml changes.

I've committed this along with some further wordsmithing. I kept
Peter's change to pg_test_fsync's default -s value; I've always felt
that 2 seconds was laughably small. It might be all right for very
quick-and-dirty tests, but as a default value, it seems like a poor
choice, because it's at the very bottom of the credible range of
choices.

Agreed, 2 seconds was at the bottom. The old behavior was very slow so
I went low. Now that we are using it, 5 secs makes sense.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +
