pgsql: Optimize commit_siblings in two ways to improve group commit.

Started by Simon Riggsabout 15 years ago8 messages
#1Simon Riggs
simon@2ndQuadrant.com

Optimize commit_siblings in two ways to improve group commit.
First, avoid scanning the whole ProcArray once we know there
are at least commit_siblings active; second, skip the check
altogether if commit_siblings = 0.

Greg Smith

Branch
------
master

Details
-------
http://git.postgresql.org/gitweb?p=postgresql.git;a=commitdiff;h=e620ee35b249b0af255ef788003d1c9edb815a35

Modified Files
--------------
doc/src/sgml/config.sgml | 17 ++++++++++++-----
src/backend/access/transam/xact.c | 2 +-
src/backend/storage/ipc/procarray.c | 17 ++++++++++++-----
src/backend/utils/misc/guc.c | 2 +-
src/include/storage/procarray.h | 2 +-
5 files changed, 27 insertions(+), 13 deletions(-)

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#1)
Re: [COMMITTERS] pgsql: Optimize commit_siblings in two ways to improve group commit.

Simon Riggs <simon@2ndQuadrant.com> writes:

Optimize commit_siblings in two ways to improve group commit.
First, avoid scanning the whole ProcArray once we know there
are at least commit_siblings active; second, skip the check
altogether if commit_siblings = 0.

Greg Smith

I wonder whether we shouldn't change commit_siblings' default value to
zero while we're at it.

regards, tom lane

#3Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#2)
Re: [COMMITTERS] pgsql: Optimize commit_siblings in two ways to improve group commit.

On Wed, Dec 8, 2010 at 1:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Simon Riggs <simon@2ndQuadrant.com> writes:

Optimize commit_siblings in two ways to improve group commit.
First, avoid scanning the whole ProcArray once we know there
are at least commit_siblings active; second, skip the check
altogether if commit_siblings = 0.

Greg Smith

I wonder whether we shouldn't change commit_siblings' default value to
zero while we're at it.

Not that I see anything to disagree with in this patch, but what
happened to posting patches in advance of committing them? Or did I
just miss that part?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#3)
Re: [COMMITTERS] pgsql: Optimize commit_siblings in two ways to improve group commit.

Robert Haas <robertmhaas@gmail.com> writes:

Not that I see anything to disagree with in this patch, but what
happened to posting patches in advance of committing them? Or did I
just miss that part?

http://archives.postgresql.org/pgsql-performance/2010-12/msg00073.php

Possibly it should have been posted to -hackers instead, but surely you
read -performance?

regards, tom lane

#5Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#4)
Re: [COMMITTERS] pgsql: Optimize commit_siblings in two ways to improve group commit.

On Wed, Dec 8, 2010 at 2:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

Not that I see anything to disagree with in this patch, but what
happened to posting patches in advance of committing them?  Or did I
just miss that part?

http://archives.postgresql.org/pgsql-performance/2010-12/msg00073.php

Possibly it should have been posted to -hackers instead, but surely you
read -performance?

Oh, yeah I see it now. I do read -performance, but with two orders of
magnitude more latency than -hackers.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#6Greg Smith
greg@2ndquadrant.com
In reply to: Tom Lane (#4)
Re: [COMMITTERS] pgsql: Optimize commit_siblings in two ways to improve group commit.

Tom Lane wrote:

http://archives.postgresql.org/pgsql-performance/2010-12/msg00073.php

Possibly it should have been posted to -hackers instead, but surely you
read -performance?

Trying to figure out what exactly commit_delay and commit_siblings did
under the hood was actually the motivation behind my first foray into
reading the PostgreSQL source code. Ever since, I've been annoyed that
the behavior didn't really help the way it's intended, but was not sure
what would be better. The additional input from Jignesh this week on
the performance list suddenly made it crystal clear what would preserve
the good behavior he had seen, even improving things for his case, while
also helping the majority who won't benefit from the commit_delay
behavior at all a little. I immediately wrote the patch and breathed a
sign of relief that it was finally going to get better.

I then posted the patch and added it to the January CF. Unbeknownst to
me until today, Simon had the same multi-year "this itches and I can't
make it stop" feel toward these parameters, and that's how it jumped the
standard process.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services and Support www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Smith (#6)
Re: [COMMITTERS] pgsql: Optimize commit_siblings in two ways to improve group commit.

Greg Smith <greg@2ndquadrant.com> writes:

I then posted the patch and added it to the January CF. Unbeknownst to
me until today, Simon had the same multi-year "this itches and I can't
make it stop" feel toward these parameters, and that's how it jumped the
standard process.

I think pretty much everybody who's looked at that code had the same
feeling. If Simon hadn't taken it, I might have.

Jignesh's explanation of what the actual usefulness of the code is
finally made sense to me: the sleep calls effectively synchronize
multiple nearby commits to happen at the next scheduler clock tick,
and then whichever one grabs the WALWriteLock first does the work.
If you've got a high enough commit volume that this is likely to
be a win, then it's unclear that taking ProcArrayLock (even shared)
to check for guys who might commit shortly is a net win. Moreover,
it's likely that that heuristic will exclude the last-to-arrive
process who otherwise could have participated in a group flush.

I'm not entirely convinced that zero commit_siblings is a better
default than small positive values, but it's certainly plausible.

regards, tom lane

#8Greg Smith
greg@2ndquadrant.com
In reply to: Tom Lane (#7)
Re: [COMMITTERS] pgsql: Optimize commit_siblings in two ways to improve group commit.

Tom Lane wrote:

I'm not entirely convinced that zero commit_siblings is a better
default than small positive values, but it's certainly plausible.

Not being allowed to set it to zero was certainly a limitation worth
abolishing though; that has been the case before now, for those who
didn't see the thread on the performance list. I think that on the sort
of high throughput system likely to benefit from this behavior, whether
commit_siblings is zero or five doesn't matter very much--those people
should cross the siblings threshold very quickly regardless. The main
arguments in favor of making the default lower aren't as exciting now
that it jumps out of the loop early once finding the requisite number.

I like keeping the default at 5 though. It keeps the person who
experiments with increasing commit_delay from suffering when there are
in reality not a lot of active connections. There are essentially two
foot-guns you have to aim before you run into the worst case here, which
is making every single commit wait for the delay when there's really
only one active process committing.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services and Support www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books