Overhead cost of Serializable Snapshot Isolation

Started by Greg Sabino Mullane, October 2011 · 48 messages · pgsql-hackers
#1Greg Sabino Mullane
greg@turnstep.com

I'm looking into upgrading a fairly busy system to 9.1. They use
serializable mode for a few certain things, and suffer through some
serialization errors as a result. While looking over the new
serializable/SSI documentation, one thing that stood out is:

http://www.postgresql.org/docs/current/interactive/transaction-iso.html

"The monitoring of read/write dependencies has a cost, as does the restart of
transactions which are terminated with a serialization failure, but balanced
against the cost and blocking involved in use of explicit locks and SELECT
FOR UPDATE or SELECT FOR SHARE, Serializable transactions are the best
performance choice for some environments."

I agree it is better versus SELECT FOR, but what about repeatable read versus
the new serializable? How much overhead is there in the 'monitoring of
read/write dependencies'? This is my only concern at the moment. Are we
talking insignificant overhead? Minor? Is it measurable? Hard to say without
knowing the number of txns, number of locks, etc.?
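The serialization errors Greg mentions suffering through are normally handled by restarting the failed transaction on the client side, as the quoted documentation notes. A minimal sketch of that retry pattern, using a stand-in exception class rather than any real driver API (in practice one would catch the driver's error for SQLSTATE 40001):

```python
class SerializationFailure(Exception):
    """Stand-in for a driver error carrying SQLSTATE 40001."""
    sqlstate = "40001"

def run_with_retry(txn_fn, max_attempts=5):
    """Run txn_fn, restarting it if it aborts with a serialization failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_fn()
        except SerializationFailure:
            if attempt == max_attempts:
                raise  # give up after repeated conflicts

# Simulated transaction: conflicts twice, then commits on the third try.
attempts = {"n": 0}
def transfer():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise SerializationFailure()
    return "committed"

result = run_with_retry(transfer)
```

The retry cost this models is exactly the second half of the overhead the documentation describes: each restart repeats the whole transaction's work.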

--
Greg Sabino Mullane greg@endpoint.com
End Point Corporation
PGP Key: 0x14964AC8

#2Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Greg Sabino Mullane (#1)
Re: Overhead cost of Serializable Snapshot Isolation

On 10.10.2011 21:25, Greg Sabino Mullane wrote:

I agree it is better versus SELECT FOR, but what about repeatable read versus
the new serializable? How much overhead is there in the 'monitoring of
read/write dependencies'? This is my only concern at the moment. Are we
talking insignificant overhead? Minor? Is it measurable? Hard to say without
knowing the number of txns, number of locks, etc.?

I'm sure it does depend heavily on all of those things, but IIRC Kevin
ran some tests earlier in the spring and saw a 5% slowdown. That feels
like a reasonable initial guess to me. If you can run some tests and
measure the overhead in your application, it would be nice to hear about it.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#3Dan Ports
drkp@csail.mit.edu
In reply to: Greg Sabino Mullane (#1)
Re: Overhead cost of Serializable Snapshot Isolation

On Mon, Oct 10, 2011 at 02:25:59PM -0400, Greg Sabino Mullane wrote:

I agree it is better versus SELECT FOR, but what about repeatable read versus
the new serializable? How much overhead is there in the 'monitoring of
read/write dependencies'? This is my only concern at the moment. Are we
talking insignificant overhead? Minor? Is it measurable? Hard to say without
knowing the number of txns, number of locks, etc.?

I'd expect that in most cases the main cost is not going to be overhead
from the lock manager but rather the cost of having transactions
aborted due to conflicts. (But the rollback rate is extremely
workload-dependent.)

We've seen CPU overhead from the lock manager to be a few percent on a
CPU-bound workload (in-memory pgbench). Also, if you're using a system
with many cores and a similar workload, SerializableXactHashLock might
become a scalability bottleneck.

Dan

--
Dan R. K. Ports MIT CSAIL http://drkp.net/

#4Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Heikki Linnakangas (#2)
Re: Overhead cost of Serializable Snapshot Isolation

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:

On 10.10.2011 21:25, Greg Sabino Mullane wrote:

I agree it is better versus SELECT FOR, but what about repeatable
read versus the new serializable? How much overhead is there in
the 'monitoring of read/write dependencies'? This is my only
concern at the moment. Are we talking insignificant overhead?
Minor? Is it measurable? Hard to say without knowing the number
of txns, number of locks, etc.?

I'm sure it does depend heavily on all of those things, but IIRC
Kevin ran some tests earlier in the spring and saw a 5% slowdown.
That feels like a reasonable initial guess to me. If you can run
some tests and measure the overhead in your application, it would
be nice to hear about it.

Right: the only real answer is "it depends". At various times I ran
different benchmarks where the overhead ranged from "lost in the
noise" to about 5% for one variety of "worst case". Dan ran DBT-2,
following the instructions on how to measure performance quite
rigorously, and came up with a 2% hit versus repeatable read for
that workload. I rarely found a benchmark where the hit exceeded
2%, but I have a report of a workload where the hit was 20% -- for
constantly overlapping long-running transactions contending for the
same table.

I do have some concern about whether the performance improvements
from reduced LW locking contention elsewhere in the code may (in
whack-a-mole fashion) cause the percentages to go higher in SSI.
The biggest performance issues in some of the SSI benchmarks were on
LW lock contention, so those may become more noticeable as other
contention is reduced. I've been trying to follow along on the
threads regarding Robert's work in that area, with hopes of applying
some of the same techniques to SSI, but it's not clear whether I'll
have time to work on that for the 9.2 release. (It's actually
looking improbable at this point.)

If you give it a try, please optimize using the performance
considerations for SSI in the manual. They can make a big
difference.
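The performance tips Kevin points to center on declaring read-only transactions as such. A sketch of the two relevant forms (the `accounts` table is hypothetical; the syntax is from the 9.1 manual):

```sql
-- Declare read-only work as such so SSI can reduce predicate locking:
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY;
SELECT count(*) FROM accounts;
COMMIT;

-- For long-running reports, DEFERRABLE waits for a safe snapshot first,
-- then runs with no SSI overhead and no risk of serialization failure:
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY DEFERRABLE;
SELECT sum(balance) FROM accounts;
COMMIT;
```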

-Kevin

#5Dan Ports
drkp@csail.mit.edu
In reply to: Kevin Grittner (#4)
Re: Overhead cost of Serializable Snapshot Isolation

On Mon, Oct 10, 2011 at 02:59:04PM -0500, Kevin Grittner wrote:

I do have some concern about whether the performance improvements
from reduced LW locking contention elsewhere in the code may (in
whack-a-mole fashion) cause the percentages to go higher in SSI.
The biggest performance issues in some of the SSI benchmarks were on
LW lock contention, so those may become more noticeable as other
contention is reduced. I've been trying to follow along on the
threads regarding Robert's work in that area, with hopes of applying
some of the same techniques to SSI, but it's not clear whether I'll
have time to work on that for the 9.2 release. (It's actually
looking improbable at this point.)

I spent some time thinking about this a while back, but didn't have
time to get very far. The problem isn't contention in the predicate
lock manager (which is partitioned) but the single lock protecting the
active SerializableXact state.

It would probably help things a great deal if we could make that lock
more fine-grained. However, it's tricky to do this without deadlocking
because the serialization failure checks need to examine a node's
neighbors in the dependency graph.
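The contrast Dan draws, a partitioned lock manager versus one lock guarding shared state, can be sketched abstractly. This is an illustrative model only, not PostgreSQL code: hashing each lock target to one of a fixed number of partitions lets unrelated targets contend on different mutexes, while the single-lock scheme serializes everything:

```python
import threading

NUM_PARTITIONS = 16  # a fixed partition count, as in the predicate lock manager

# Partitioned scheme: unrelated targets usually map to different partitions,
# so they take different mutexes and rarely contend.
partition_locks = [threading.Lock() for _ in range(NUM_PARTITIONS)]

def partition_for(target):
    return hash(target) % NUM_PARTITIONS

def with_partition_lock(target, fn):
    with partition_locks[partition_for(target)]:
        return fn()

# Single-lock scheme (the SerializableXact analogue): every access,
# regardless of target, funnels through one mutex.
global_lock = threading.Lock()

def with_global_lock(fn):
    with global_lock:
        return fn()
```

Making the SerializableXact lock fine-grained is harder than this sketch suggests for exactly the reason Dan gives: conflict checks must walk a transaction's neighbors in the dependency graph, so per-node locks can deadlock.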

Dan

--
Dan R. K. Ports MIT CSAIL http://drkp.net/

#6Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Dan Ports (#5)
Re: Overhead cost of Serializable Snapshot Isolation

Dan Ports <drkp@csail.mit.edu> wrote:

I spent some time thinking about this a while back, but didn't
have time to get very far. The problem isn't contention in the
predicate lock manager (which is partitioned) but the single lock
protecting the active SerializableXact state.

It would probably help things a great deal if we could make that
lock more fine-grained. However, it's tricky to do this without
deadlocking because the serialization failure checks need to
examine a node's neighbors in the dependency graph.

Did you ever see much contention on
SerializablePredicateLockListLock, or was it just
SerializableXactHashLock? I think the former might be able to use
the non-blocking techniques, but I fear the main issue is with the
latter, which seems like a harder problem.

-Kevin

#7Dan Ports
drkp@csail.mit.edu
In reply to: Kevin Grittner (#6)
Re: Overhead cost of Serializable Snapshot Isolation

On Mon, Oct 10, 2011 at 04:10:18PM -0500, Kevin Grittner wrote:

Did you ever see much contention on
SerializablePredicateLockListLock, or was it just
SerializableXactHashLock? I think the former might be able to use
the non-blocking techniques, but I fear the main issue is with the
latter, which seems like a harder problem.

No, not that I recall -- if SerializablePredicateLockListLock was on
the list of contended locks, it was pretty far down.

SerializableXactHashLock was the main bottleneck, and
SerializableFinishedListLock was a lesser but still significant
one.

Dan

--
Dan R. K. Ports MIT CSAIL http://drkp.net/

#8Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#2)
Re: Overhead cost of Serializable Snapshot Isolation

On Mon, Oct 10, 2011 at 8:30 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

On 10.10.2011 21:25, Greg Sabino Mullane wrote:

I agree it is better versus SELECT FOR, but what about repeatable read
versus
the new serializable? How much overhead is there in the 'monitoring of
read/write dependencies'? This is my only concern at the moment. Are we
talking insignificant overhead? Minor? Is it measurable? Hard to say
without
knowing the number of txns, number of locks, etc.?

I'm sure it does depend heavily on all of those things, but IIRC Kevin ran
some tests earlier in the spring and saw a 5% slowdown. That feels like
a reasonable initial guess to me. If you can run some tests and measure the
overhead in your application, it would be nice to hear about it.

How do we turn it on/off to allow the overhead to be measured?

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

#9Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Simon Riggs (#8)
Re: Overhead cost of Serializable Snapshot Isolation

Simon Riggs <simon@2ndQuadrant.com> wrote:

How do we turn it on/off to allow the overhead to be measured?

Use REPEATABLE READ transactions or SERIALIZABLE transactions. The
easiest way, if you're doing it for all transactions (which I
recommend) is to set default_transaction_isolation.
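Concretely, the comparison boils down to one setting, changed per session or cluster-wide:

```sql
-- Session level:
SET default_transaction_isolation = 'serializable';    -- SSI on
SET default_transaction_isolation = 'repeatable read'; -- plain snapshot isolation

-- Or for the whole cluster, in postgresql.conf:
--   default_transaction_isolation = 'serializable'
```

This only measures the overhead cleanly if the application does not also request isolation levels explicitly, which is the caveat Simon raises downthread.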

-Kevin

#10Robert Haas
robertmhaas@gmail.com
In reply to: Kevin Grittner (#4)
Re: Overhead cost of Serializable Snapshot Isolation

On Mon, Oct 10, 2011 at 3:59 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:

I do have some concern about whether the performance improvements
from reduced LW locking contention elsewhere in the code may (in
whack-a-mole fashion) cause the percentages to go higher in SSI.
The biggest performance issues in some of the SSI benchmarks were on
LW lock contention, so those may become more noticeable as other
contention is reduced.  I've been trying to follow along on the
threads regarding Robert's work in that area, with hopes of applying
some of the same techniques to SSI, but it's not clear whether I'll
have time to work on that for the 9.2 release.  (It's actually
looking improbable at this point.)

I ran my good old pgbench -S, scale factor 100, shared_buffers = 8GB
test on Nate Boley's box. I ran it on both 9.1 and 9.2dev, and at all
three isolation levels. As usual, I took the median of three 5-minute
runs, which I've generally found adequate to eliminate the noise. On
both 9.1 and 9.2dev, read committed and repeatable read have basically
identical performance; if anything, repeatable read may be slightly
better - which would make sense, if it cuts down the number of
snapshots taken.

Serializable mode is much slower on this test, though. On
REL9_1_STABLE, it's about 8% slower with a single client. At 8
clients, the difference rises to 43%, and at 32 clients, it's 51%
slower. On 9.2devel, raw performance is somewhat higher (e.g. +51% at
8 clients) but the performance when not using SSI has improved so much
that the performance gap between serializable and the other two
isolation levels is now huge: with 32 clients, in serializable mode,
the median result was 21114.577645 tps; in read committed,
218748.929692 tps - that is, read committed is running more than ten
times faster than serializable. Data are attached, in text form and
as a plot. I excluded the repeatable read results from the plot as
they just clutter it up - they're basically on top of the read
committed results.
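The size of the gap Robert reports can be checked directly from the two 32-client medians he gives:

```python
# Robert's 32-client medians from this message.
serializable_tps = 21114.577645
read_committed_tps = 218748.929692

# Read committed comes out a bit over 10x faster, matching the
# "more than ten times" claim.
ratio = read_committed_tps / serializable_tps
```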

I haven't run this with LWLOCK_STATS, but my seat-of-the-pants guess
is that there's a single lightweight lock that everything is
bottlenecking on. One possible difference between this test case and
the ones you may have used is that this case involves lots and lots of
really short transactions that don't do much. The effect of anything
that only happens once or a few times per transaction is really
magnified in this type of workload (which is why the locking changes
make so much of a difference here - in a longer or heavier-weight
transaction that stuff would be lost in the noise).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

isolation-scaling.txt (text/plain)
isolation-scaling.png (image/png)
#11Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Robert Haas (#10)
Re: Overhead cost of Serializable Snapshot Isolation

Robert Haas <robertmhaas@gmail.com> wrote:

I ran my good old pgbench -S, scale factor 100, shared_buffers =
8GB test on Nate Boley's box. I ran it on both 9.1 and 9.2dev,
and at all three isolation levels. As usual, I took the median of
three 5-minute runs, which I've generally found adequate to
eliminate the noise. On both 9.1 and 9.2dev, read committed and
repeatable read have basically identical performance; if anything,
repeatable read may be slightly better - which would make sense,
if it cuts down the number of snapshots taken.

Right. Thanks for running this. Could you give enough details to
allow reproducing on this end (or point to a previous post with the
details)?

Serializable mode is much slower on this test, though. On
REL9_1_STABLE, it's about 8% slower with a single client. At 8
clients, the difference rises to 43%, and at 32 clients, it's 51%
slower. On 9.2devel, raw performance is somewhat higher (e.g.
+51% at 8 clients) but the performance when not using SSI has
improved so much that the performance gap between serializable and
the other two isolation levels is now huge: with 32 clients, in
serializable mode, the median result was 21114.577645 tps; in read
committed, 218748.929692 tps - that is, read committed is running
more than ten times faster than serializable.

Yeah. I was very excited to see your numbers as you worked on that,
but I've been concerned that with the "Performance Whack A Mole"
nature of things (to borrow a term from Josh Berkus), SSI
lightweight locks might be popping their heads up.

Data are attached, in text form and as a plot. I excluded the
repeatable read results from the plot as they just clutter it up -
they're basically on top of the read committed results.

That was kind, but really the REPEATABLE READ results are probably
the more meaningful comparison, even if they are more embarrassing.
:-(

I haven't run this with LWLOCK_STATS, but my seat-of-the-pants
guess is that there's a single lightweight lock that everything is
bottlenecking on.

The lock in question is SerializableXactHashLock. A secondary
problem is SerializableFinishedListLock, which is used for
protecting cleanup of old transactions. This is per Dan's reports,
who had a better look at it on a 16-core machine, but it is consistent
with what I saw on fewer cores.

Early in development we had a bigger problem with
SerializablePredicateLockListLock, but Dan added a local map to
eliminate contention during the lock promotion decision, and I
reworked that lock from a SHARED-for-read / EXCLUSIVE-for-write
approach to SHARED for accessing your own data and EXCLUSIVE for
accessing another process's data. Combined, that made the problems
with that lock negligible.

One possible difference between this test case and the ones you
may have used is that this case involves lots and lots of really
short transactions that don't do much.

I did some tests like that, but not on a box with that many
processors, and I probably didn't try using a thread count more than
double the core count, so I probably never ran into the level of
contention you're seeing. The differences at the low connection
counts are surprising to me. Maybe it will make more sense when I
see the test case. There's also some chance that late elimination
of some race conditions found in testing affected this, and I didn't
re-run those tests late enough to see that. Not sure.

The effect of anything that only happens once or a few times per
transaction is really magnified in this type of workload (which is
why the locking changes make so much of a difference here - in a
longer or heavier-weight transaction that stuff would be lost in
the noise).

Did these transactions write anything? If not, were they declared
to be READ ONLY? If they were, in fact, only reading, it would be
interesting to see what the performance looks like if the
recommendation to use the READ ONLY attribute is followed. That's
at the top of the list of performance tips for SSI at:

http://www.postgresql.org/docs/9.1/interactive/transaction-iso.html#XACT-SERIALIZABLE

Anyway, this isolates a real issue, even if the tests exaggerate it
beyond what anyone is likely to see in production. Once this CF is
over, I'll put a review of this at the top of my PG list.

-Kevin

#12Simon Riggs
simon@2ndQuadrant.com
In reply to: Kevin Grittner (#9)
Re: Overhead cost of Serializable Snapshot Isolation

On Mon, Oct 10, 2011 at 11:31 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:

Simon Riggs <simon@2ndQuadrant.com> wrote:

How do we turn it on/off to allow the overhead to be measured?

Use REPEATABLE READ transactions or SERIALIZABLE transactions.  The
easiest way, if you're doing it for all transactions (which I
recommend) is to set default_transaction_isolation.

Most apps use mixed mode serializable/repeatable read and therefore
can't be changed by a simple parameter. Rewriting the application isn't
a sensible solution.

I think it's clear that SSI should have had and still needs an "off
switch" for cases that cause performance problems.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

#13Robert Haas
robertmhaas@gmail.com
In reply to: Kevin Grittner (#11)
Re: Overhead cost of Serializable Snapshot Isolation

On Tue, Oct 11, 2011 at 12:46 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:

Robert Haas <robertmhaas@gmail.com> wrote:

I ran my good old pgbench -S, scale factor 100, shared_buffers =
8GB test on Nate Boley's box.  I ran it on both 9.1 and 9.2dev,
and at all three isolation levels.  As usual, I took the median of
three 5-minute runs, which I've generally found adequate to
eliminate the noise. On both 9.1 and 9.2dev, read committed and
repeatable read have basically identical performance; if anything,
repeatable read may be slightly better - which would make sense,
if it cuts down the number of snapshots taken.

Right.  Thanks for running this.  Could you give enough details to
allow reproducing on this end (or point to a previous post with the
details)?

Sure, it's pretty much just a vanilla pgbench -S run, but the scripts
I used are attached here. I build the head of each branch using the
"test-build" script and then used the "runtestiso" script to drive the
test runs. These scripts are throwaway so they're not really
documented, but hopefully it's clear enough what it's doing. The
server itself is a 32-core AMD 6128.

Data are attached, in text form and as a plot.  I excluded the
repeatable read results from the plot as they just clutter it up -
they're basically on top of the read committed results.

That was kind, but really the REPEATABLE READ results are probably
the more meaningful comparison, even if they are more embarrassing.
:-(

They're neither more nor less embarrassing - they're pretty much not
different at all. I just didn't see any point in making a graph with
6 lines on it when you could only actually see 4 of them.

Did these transactions write anything?  If not, were they declared
to be READ ONLY?  If they were, in fact, only reading, it would be
interesting to see what the performance looks like if the
recommendation to use the READ ONLY attribute is followed.

pgbench -S doesn't do any writes, or issue any transaction control
statements. It just fires off SELECT statements against a single
table as fast as it can, retrieving values from rows chosen at random.
Each SELECT implicitly begins and ends a transaction. Possibly the
system could gaze upon the SELECT statement and infer that the
one-statement transaction induced thereby can't possibly write any
tuples, and mark it read-only automatically, but I'm actually not that
excited about that approach - trying to fix the lwlock contention
that's causing the headache in the first place seems like a better use
of time, assuming it's possible to make some headway there.

My general observation is that, on this machine, a lightweight lock
that is taken in exclusive mode by a series of lockers in quick
succession seems to max out around 16-20 clients, and the curve starts
to bend well before that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

test-build (application/octet-stream)
runtestiso (application/octet-stream)
#14Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#12)
Re: Overhead cost of Serializable Snapshot Isolation

On Tue, Oct 11, 2011 at 1:11 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

On Mon, Oct 10, 2011 at 11:31 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:

Simon Riggs <simon@2ndQuadrant.com> wrote:

How do we turn it on/off to allow the overhead to be measured?

Use REPEATABLE READ transactions or SERIALIZABLE transactions.  The
easiest way, if you're doing it for all transactions (which I
recommend) is to set default_transaction_isolation.

Most apps use mixed mode serializable/repeatable read and therefore
can't be changed by simple parameter. Rewriting the application isn't
a sensible solution.

I think it's clear that SSI should have had and still needs an "off
switch" for cases that cause performance problems.

Is it possible that you are confusing the default level, which is READ
COMMITTED, with REPEATABLE READ? I can't see why anyone would code up
their application to use REPEATABLE READ for some things and
SERIALIZABLE for other things unless they were explicitly trying to
turn SSI off for a subset of their transactions. In all releases
prior to 9.1, REPEATABLE READ and SERIALIZABLE behaved identically, so
there wouldn't be any reason for a legacy app to mix-and-match between
the two.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#15Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#14)
Re: Overhead cost of Serializable Snapshot Isolation

On Tue, Oct 11, 2011 at 6:14 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Oct 11, 2011 at 1:11 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

On Mon, Oct 10, 2011 at 11:31 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:

Simon Riggs <simon@2ndQuadrant.com> wrote:

How do we turn it on/off to allow the overhead to be measured?

Use REPEATABLE READ transactions or SERIALIZABLE transactions.  The
easiest way, if you're doing it for all transactions (which I
recommend) is to set default_transaction_isolation.

Most apps use mixed mode serializable/repeatable read and therefore
can't be changed by simple parameter. Rewriting the application isn't
a sensible solution.

I think it's clear that SSI should have had and still needs an "off
switch" for cases that cause performance problems.

Is it possible that you are confusing the default level, which is READ
COMMITTED, with REPEATABLE READ?  I can't see why anyone would code up
their application to use REPEATABLE READ for some things and
SERIALIZABLE for other things unless they were explicitly trying to
turn SSI off for a subset of their transactions.  In all releases
prior to 9.1, REPEATABLE READ and SERIALIZABLE behaved identically, so
there wouldn't be any reason for a legacy app to mix-and-match between
the two.

Yes, I mistyped "read" when I meant "committed". You are right to
point out there is no problem if people were using repeatable read and
serializable.

Let me retype, so there is no confusion:

It's common to find applications that have some transactions
explicitly coded to use SERIALIZABLE mode, while the rest are in the
default mode READ COMMITTED. So common that TPC-E benchmark has been
written as a representation of such workloads. The reason this is
common is that some transactions require SERIALIZABLE as a "fix" for
transaction problems.

If you alter the default_transaction_isolation then you will break
applications like this, so it is not a valid way to turn off SSI.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

#16Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Robert Haas (#13)
Re: Overhead cost of Serializable Snapshot Isolation

Robert Haas <robertmhaas@gmail.com> wrote:

Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:

Did these transactions write anything? If not, were they
declared to be READ ONLY? If they were, in fact, only reading,
it would be interesting to see what the performance looks like if
the recommendation to use the READ ONLY attribute is followed.

pgbench -S doesn't do any writes, or issue any transaction control
statements. It just fires off SELECT statements against a single
table as fast as it can, retrieving values from rows chosen at
random. Each SELECT implicitly begins and ends a transaction.

So that test could be accomplished by setting
default_transaction_read_only to on. That's actually what we're
doing, because we have a lot more of them than of read-write
transactions. But, with the scripts I can confirm the performance
of that on this end. It should be indistinguishable from the
repeatable read line; if not, there's something to look at there.
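The read-only variant Kevin describes needs no application changes beyond one setting, since explicit writers can opt out per transaction:

```sql
-- Make every transaction READ ONLY by default; this is the setup
-- Kevin describes using on his own systems:
SET default_transaction_read_only = on;

BEGIN;                -- read-only serializable: cheap under SSI
SELECT 1;
COMMIT;

BEGIN READ WRITE;     -- explicit opt-out for the read-write minority
-- ... updates ...
COMMIT;
```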

Possibly the system could gaze upon the SELECT statement and infer
that the one-statement transaction induced thereby can't possibly
write any tuples, and mark it read-only automatically, but I'm
actually not that excited about that approach

I wasn't intending to suggest that. In fact I hadn't really thought
of it. It might be a fun optimization, although it would be well
down my list, and it wouldn't be trivial because you couldn't use it
for any statements with volatile functions -- so the statement would
need to be planned far enough to know whether that was the case
before making this decision. In fact, I'm not sure the community
would want to generate an error if a user marked a function other
than volatile and ran it in this way. Definitely not something to
even look at any time soon.

trying to fix the lwlock contention that's causing the headache in
the first place seems like a better use of time, assuming it's
possible to make some headway there.

Absolutely. I just thought the timings with READ ONLY would make
for an interesting data point. For one thing, it might reassure
people that even this artificial use cases doesn't perform that
badly if the advice in the documentation is heeded. For another, a
result slower than repeatable read would be a surprise that might
point more directly to the problem.

My general observation is that, on this machine, a lightweight
lock that is taken in exclusive mode by a series of lockers in
quick succession seems to max out around 16-20 clients, and the
curve starts to bend well before that.

OK, I will keep that in mind.

Thanks,

-Kevin

#17Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Simon Riggs (#15)
Re: Overhead cost of Serializable Snapshot Isolation

Simon Riggs <simon@2ndQuadrant.com> wrote:

It's common to find applications that have some transactions
explicitly coded to use SERIALIZABLE mode, while the rest are in
the default mode READ COMMITTED. So common that TPC-E benchmark
has been written as a representation of such workloads.

I would be willing to bet that any such implementations assume S2PL,
and would not prevent anomalies as expected unless all transactions
are serializable.

The reason this is common is that some transactions require
SERIALIZABLE as a "fix" for transaction problems.

That is a mode of thinking which doesn't work if you only assume
serializable provides the guarantees required by the standard. Many
people assume otherwise. It does *not* guarantee blocking on
conflicts, and it does not require that transactions appear to have
executed in the order of successful commit. It requires only that
the result of concurrently running any mix of serializable
transactions produce a result consistent with some one-at-a-time
execution of those transactions. Rollback of transactions to
prevent violations of that guarantee is allowed. I don't see any
guarantees about how serializable transactions interact with
non-serializable transactions beyond each transaction not seeing any
of the phenomena prohibited for its isolation level.

If you alter the default_transaction_isolation then you will break
applications like this, so it is not a valid way to turn off SSI.

I don't follow you here. What would break? In what fashion? Since
the standard allows any isolation level to provide more strict
transaction isolation than required, it would be conforming to
*only* support serializable transactions, regardless of the level
requested. Not a good idea for some workloads from a performance
perspective, but it would be conforming, and any application which
doesn't work correctly with that is not written to the standard.

-Kevin

#18Simon Riggs
simon@2ndQuadrant.com
In reply to: Kevin Grittner (#17)
Re: Overhead cost of Serializable Snapshot Isolation

On Tue, Oct 11, 2011 at 6:44 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:

If you alter the default_transaction_isolation then you will break
applications like this, so it is not a valid way to turn off SSI.

I don't follow you here.  What would break?  In what fashion?  Since
the standard allows any isolation level to provide more strict
transaction isolation than required, it would be conforming to
*only* support serializable transactions, regardless of the level
requested.  Not a good idea for some workloads from a performance
perspective, but it would be conforming, and any application which
doesn't work correctly with that is not written to the standard.

If the normal default_transaction_isolation = read committed and all
transactions that require serializable are explicitly marked in the
application then there is no way to turn off SSI without altering the
application. That is not acceptable, since it causes changes in
application behaviour and possibly also performance issues.

We should provide a mechanism to allow people to upgrade to 9.1+
without needing to change the meaning and/or performance of their
apps.

I strongly support the development of SSI, but I don't support
application breakage. We can have SSI without breaking anything for
people that can't or don't want to use it.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

#19Greg Sabino Mullane
greg@turnstep.com
In reply to: Robert Haas (#13)
Re: Overhead cost of Serializable Snapshot Isolation

Robert Haas:

Serializable mode is much slower on this test, though. On
REL9_1_STABLE, it's about 8% slower with a single client. At 8
clients, the difference rises to 43%, and at 32 clients, it's 51%
slower.

Bummer. Thanks for putting some numbers out there; glad I was able
to jump-start a deeper look at this. Based on this thread so far,
I am probably going to avoid serializable in this particular case,
and stick to repeatable read. Once things are in place, perhaps I'll
be able to try switching to serializable and get some measurements,
but I wanted to see if the impact was minor enough to safely start
with serializable. Seems not. :) Keep in mind this is not even a
formal proposal yet for our client, so any benchmarks from me may
be quite a while.

Kevin Grittner:

Did these transactions write anything? If not, were they declared
to be READ ONLY? If they were, in fact, only reading, it would be
interesting to see what the performance looks like if the
recommendation to use the READ ONLY attribute is followed.

Yes, I'll definitely look into that, but the great majority of the
things done in this case are read/write.

Simon Riggs:

Most apps use mixed mode serializable/repeatable read and therefore
can't be changed by simple parameter. Rewriting the application isn't
a sensible solution.

I think it's clear that SSI should have had and still needs an "off
switch" for cases that cause performance problems.

Eh? It has an off switch: repeatable read.

Thanks for all replying to this thread, it's been very helpful.

--
Greg Sabino Mullane greg@endpoint.com
End Point Corporation
PGP Key: 0x14964AC8

#20Greg Sabino Mullane
greg@turnstep.com
In reply to: Simon Riggs (#18)
Re: Overhead cost of Serializable Snapshot Isolation

If the normal default_transaction_isolation = read committed and all
transactions that require serializable are explicitly marked in the
application then there is no way to turn off SSI without altering the
application. That is not acceptable, since it causes changes in
application behaviour and possibly also performance issues.

Performance, perhaps. What application behavior changes? Fewer
serialization conflicts?

We should provide a mechanism to allow people to upgrade to 9.1+
without needing to change the meaning and/or performance of their
apps.

That ship has sailed.

--
Greg Sabino Mullane greg@endpoint.com
End Point Corporation
PGP Key: 0x14964AC8

#21Simon Riggs
simon@2ndQuadrant.com
In reply to: Greg Sabino Mullane (#19)
#22Bruce Momjian
bruce@momjian.us
In reply to: Greg Sabino Mullane (#20)
#23Bruce Momjian
bruce@momjian.us
In reply to: Simon Riggs (#18)
#24Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Simon Riggs (#21)
#25Greg Sabino Mullane
greg@turnstep.com
In reply to: Bruce Momjian (#22)
#26Simon Riggs
simon@2ndQuadrant.com
In reply to: Bruce Momjian (#22)
#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#21)
#28Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Greg Sabino Mullane (#19)
#29Bruce Momjian
bruce@momjian.us
In reply to: Simon Riggs (#26)
#30Simon Riggs
simon@2ndQuadrant.com
In reply to: Kevin Grittner (#24)
#31Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Simon Riggs (#30)
#32Simon Riggs
simon@2ndQuadrant.com
In reply to: Bruce Momjian (#29)
#33Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#32)
#34Simon Riggs
simon@2ndQuadrant.com
In reply to: Kevin Grittner (#31)
#35Florian Pflug
fgp@phlo.org
In reply to: Simon Riggs (#30)
#36Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#33)
#37Simon Riggs
simon@2ndQuadrant.com
In reply to: Florian Pflug (#35)
#38Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Simon Riggs (#34)
#39Bruce Momjian
bruce@momjian.us
In reply to: Simon Riggs (#36)
#40Bruce Momjian
bruce@momjian.us
In reply to: Simon Riggs (#37)
#41Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Simon Riggs (#18)
#42Peter Eisentraut
peter_e@gmx.net
In reply to: Simon Riggs (#26)
#43Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#41)
#44Simon Riggs
simon@2ndQuadrant.com
In reply to: Peter Eisentraut (#42)
#45Florian Pflug
fgp@phlo.org
In reply to: Simon Riggs (#37)
#46Robert Haas
robertmhaas@gmail.com
In reply to: Florian Pflug (#45)
#47Simon Riggs
simon@2ndQuadrant.com
In reply to: Florian Pflug (#45)
#48Greg Sabino Mullane
greg@turnstep.com
In reply to: Peter Eisentraut (#42)