LWLock Queue Jumping
WALInsertLock is heavily contended and likely always will be even if we
apply some of the planned fixes.
Some callers of WALInsertLock are more important than others
* Writing new Clog or Multixact pages (serialized by ClogControlLock)
* For Hot Standby, writing SnapshotData (serialized by ProcArrayLock)
In these cases it seems like we can skip straight to the front of the
WALInsertLock queue without problem.
Most other callers cannot be safely reordered; possibly no others can.
We already re-order the lock queues when we hold shared locks, so we
know in principle it is OK to do so. This is an extension of that
thought.
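To make that concrete, here is a minimal sketch of the difference between
normal tail queuing and queue jumping, using a simplified stand-in for the
wait queue rather than the real LWLock internals (the struct and function
names below are hypothetical):

#include <stddef.h>

/* Hypothetical, simplified wait queue -- not the actual LWLock structs. */
typedef struct Waiter
{
    struct Waiter *next;
    /* ... per-backend wait state would live here ... */
} Waiter;

typedef struct SimpleLock
{
    Waiter     *head;           /* next waiter to be granted the lock */
    Waiter     *tail;           /* where ordinary waiters are appended */
} SimpleLock;

/* Ordinary acquisition: append at the tail, preserving FIFO order. */
static void
queue_at_tail(SimpleLock *lock, Waiter *w)
{
    w->next = NULL;
    if (lock->tail)
        lock->tail->next = w;
    else
        lock->head = w;
    lock->tail = w;
}

/*
 * Queue jumping: a caller that must not wait behind ordinary exclusive
 * waiters (e.g. one already holding ClogControlLock or ProcArrayLock)
 * is pushed onto the head instead.
 */
static void
queue_at_head(SimpleLock *lock, Waiter *w)
{
    w->next = lock->head;
    lock->head = w;
    if (lock->tail == NULL)
        lock->tail = w;
}
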
Implementing this would go a long way towards removing my objection, on
performance grounds, to simplifying the Hot Standby patch as Heikki
recently suggested.
Possible? If so, we can discuss implementation. No worries if not, but
just a side thought that may be fruitful.
--
Simon Riggs www.2ndQuadrant.com
On Fri, Aug 28, 2009 at 8:07 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
WALInsertLock is heavily contended and likely always will be even if we
apply some of the planned fixes.
I've lost any earlier messages, could you resend the raw data on which
this is based?
Some callers of WALInsertLock are more important than others
* Writing new Clog or Multixact pages (serialized by ClogControlLock)
* For Hot Standby, writing SnapshotData (serialized by ProcArrayLock)
In these cases it seems like we can skip straight to the front of the
WALInsertLock queue without problem.
How does re-ordering reduce the contention? We reorder shared lockers
ahead of exclusive lockers because they can all hold the lock at the
same time so we can reduce the amount of time the lock is held.
Reordering some exclusive lockers ahead of other exclusive lockers
won't reduce the amount of time the lock is held at all. Are you
saying the reason to do it is to reduce time spent waiting on this
lock while holding other critical locks? Do we have tools to measure
how long is being spent waiting on one lock while holding another lock
so we can see if there's a problem and whether this helps?
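As a rough illustration of the kind of measurement I mean -- this is not an
existing facility, the names are made up -- something like:

#include <sys/time.h>

/*
 * Illustrative only: time how long a backend blocks acquiring one lock and
 * bucket the result by whether it already holds some other lock of
 * interest.  "acquire_fn" stands in for the real acquisition call.
 */
static long wait_usec[2];   /* [0] = nothing else held, [1] = other lock held */

static void
timed_acquire(void (*acquire_fn) (void), int holding_other_lock)
{
    struct timeval start, end;

    gettimeofday(&start, NULL);
    acquire_fn();               /* blocks here if the lock is contended */
    gettimeofday(&end, NULL);

    wait_usec[holding_other_lock ? 1 : 0] +=
        (end.tv_sec - start.tv_sec) * 1000000L +
        (end.tv_usec - start.tv_usec);
}
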
Greg Stark wrote:
On Fri, Aug 28, 2009 at 8:07 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
WALInsertLock is heavily contended and likely always will be even if we
apply some of the planned fixes.
I've lost any earlier messages, could you resend the raw data on which
this is based?
I don't have any pointers right now, but WALInsertLock does often show
up as a bottleneck in write-intensive benchmarks.
Some callers of WALInsertLock are more important than others
* Writing new Clog or Multixact pages (serialized by ClogControlLock)
* For Hot Standby, writing SnapshotData (serialized by ProcArrayLock)
In these cases it seems like we can skip straight to the front of the
WALInsertLock queue without problem.
Reordering some exclusive lockers ahead of other exclusive lockers
won't reduce the amount of time the lock is held at all. Are you
saying the reason to do it is to reduce time spent waiting on this
lock while holding other critical locks?
That's what I thought. I don't know about the clog/multixact issue, it
doesn't seem like it would be too bad, given how seldom new clog or
multixact pages are written.
The Hot Standby thing has been discussed, but no-one has actually posted
a patch which does the locking correctly, where the ProcArrayLock is
held while the SnapshotData WAL record is inserted. So there is no
evidence that it's actually a problem; we might be making a mountain out
of a molehill. It will have practically no effect on throughput, given
how seldom SnapshotData records are written (once per checkpoint), but
if it causes a significant bump to response times, that might be a problem.
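For concreteness, the pattern being debated looks roughly like the sketch
below. The record layout, info code and helper are hypothetical -- again, no
such patch has actually been posted -- but the point is the XLogInsert()
call sitting inside the ProcArrayLock hold:

#include "postgres.h"
#include "access/xlog.h"
#include "storage/buf.h"
#include "storage/lwlock.h"

#define XLOG_RUNNING_XACTS_INFO 0x90    /* hypothetical info code */

typedef struct RunningXactsData         /* hypothetical record payload */
{
    int            xcnt;
    TransactionId  xids[64];            /* fixed-size only for this sketch */
} RunningXactsData;

static void
LogSnapshotForStandby(void)             /* hypothetical function */
{
    RunningXactsData snap;
    XLogRecData rdata;

    /* Shared mode here for illustration; the exact mode is an open question. */
    LWLockAcquire(ProcArrayLock, LW_SHARED);

    snap.xcnt = 0;                      /* would be filled from the proc array */

    rdata.data = (char *) &snap;
    rdata.len = sizeof(snap);
    rdata.buffer = InvalidBuffer;
    rdata.next = NULL;

    /*
     * While we wait for WALInsertLock here, other backends that need
     * ProcArrayLock queue up behind us -- the response-time worry above,
     * and the motivation for jumping the WALInsertLock queue.
     */
    (void) XLogInsert(RM_XLOG_ID,       /* a real patch would add its own rmgr */
                      XLOG_RUNNING_XACTS_INFO, &rdata);

    LWLockRelease(ProcArrayLock);
}
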
This is a good idea to keep in mind, but right now it feels like a
solution in search of a problem.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas wrote:
Greg Stark wrote:
On Fri, Aug 28, 2009 at 8:07 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
WALInsertLock is heavily contended and likely always will be even if we
apply some of the planned fixes.
I've lost any earlier messages, could you resend the raw data on which
this is based?
I don't have any pointers right now, but WALInsertLock does often show
up as a bottleneck in write-intensive benchmarks.
yeah, I recently ran across that issue while testing concurrent COPY
performance:
http://www.kaltenbrunner.cc/blog/index.php?/archives/27-Benchmarking-8.4-Chapter-2bulk-loading.html
discussed here:
http://archives.postgresql.org/pgsql-hackers/2009-06/msg01019.php
and (iirc) also here:
http://archives.postgresql.org/pgsql-hackers/2009-06/msg01133.php
however the general issue is easily visible in almost any write-intensive
concurrent workload on a fast IO subsystem (i.e. pgbench, sysbench, ...).
Stefan
On Sun, 2009-08-30 at 09:03 +0300, Heikki Linnakangas wrote:
The Hot Standby thing has been discussed, but no-one has actually posted
a patch which does the locking correctly, where the ProcArrayLock is
held while the SnapshotData WAL record is inserted. So there is no
evidence that it's actually a problem; we might be making a mountain out
of a molehill. It will have practically no effect on throughput, given
how seldom SnapshotData records are written (once per checkpoint), but
if it causes a significant bump to response times, that might be a problem.
This is a good idea to keep in mind, but right now it feels like a
solution in search of a problem.
The most important thing is to get HS committed and to do that I think
it is important that I show you I am willing to respond to review
comments. So I will implement it the way you propose and defer any
further discussion about lock contention. The idea here is a simple fix,
and it's easy enough to return to later if we need it.
--
Simon Riggs www.2ndQuadrant.com
---------- Forwarded message ----------
From: Stefan Kaltenbrunner <stefan@kaltenbrunner.cc>
To: Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>
Date: Sun, 30 Aug 2009 11:48:47 +0200
Subject: Re: LWLock Queue Jumping
Heikki Linnakangas wrote:
Greg Stark wrote:
On Fri, Aug 28, 2009 at 8:07 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
WALInsertLock is heavily contended and likely always will be even if we
apply some of the planned fixes.
I've lost any earlier messages, could you resend the raw data on which
this is based?
I don't have any pointers right now, but WALInsertLock does often show
up as a bottleneck in write-intensive benchmarks.
yeah, I recently ran across that issue while testing concurrent COPY
performance:
http://www.kaltenbrunner.cc/blog/index.php?/archives/27-Benchmarking-8.4-Chapter-2bulk-loading.html
discussed here:
http://archives.postgresql.org/pgsql-hackers/2009-06/msg01019.php
It looks like this is the bulk loading of data into unindexed tables. How
good is that as a target for optimization? I can see several (quite
difficult to code and maintain) ways to make bulk loading into unindexed
tables faster, but they would not speed up the more general cases.
and (iirc) also here:
http://archives.postgresql.org/pgsql-hackers/2009-06/msg01133.php
I played around a little with this, parallel bulk loads into an unindexed,
very skinny table. If I hacked XLogInsert so that it did nothing but take
the WALInsertLock, release it, then return a fake RecPtr, it scaled better
but still not very well. So giant leaps in throughput would need to involve
calling XLogInsert less often (or at least taking the WALInsertLock less
often). You could nibble around the edges by tweaking what happens under
the WALInsertLock, but I don't think that will get you big wins for this
case. But again, how important is this case? Are bulk
loads into skinny unindexed tables the best test-bed for improving
XLogInsert?
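Roughly speaking, the hacked-up XLogInsert was essentially this (a paraphrase
of the idea, not the exact diff; it lives in xlog.c, so the usual includes
are already there):

/*
 * Paraphrase of the experiment: XLogInsert() reduced to just taking and
 * releasing WALInsertLock and handing back a fake, advancing record
 * pointer.  The arguments are simply ignored.
 */
XLogRecPtr
XLogInsert(RmgrId rmid, uint8 info, XLogRecData *rdata)
{
    static XLogRecPtr fake = {0, 0};

    (void) rmid;
    (void) info;
    (void) rdata;

    LWLockAcquire(WALInsertLock, LW_EXCLUSIVE);
    /* no CRC computation, no copy into WAL buffers, no page bookkeeping */
    LWLockRelease(WALInsertLock);

    fake.xrecoff += 64;
    return fake;
}
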
(Sorry, I think I forgot to change the subject on my previous message. Digests
are great if you only read, but for contributing I guess I have to change to
receiving each message)
Jeff
Jeff Janes wrote:
---------- Forwarded message ----------
From: Stefan Kaltenbrunner <stefan@kaltenbrunner.cc>
To: Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>
Date: Sun, 30 Aug 2009 11:48:47 +0200
Subject: Re: LWLock Queue Jumping
Heikki Linnakangas wrote:
Greg Stark wrote:
On Fri, Aug 28, 2009 at 8:07 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
WALInsertLock is heavily contended and likely always will be even if we
apply some of the planned fixes.
I've lost any earlier messages, could you resend the raw data on which
this is based?
I don't have any pointers right now, but WALInsertLock does often show
up as a bottleneck in write-intensive benchmarks.
yeah, I recently ran across that issue while testing concurrent COPY
performance:
http://www.kaltenbrunner.cc/blog/index.php?/archives/27-Benchmarking-8.4-Chapter-2bulk-loading.html
discussed here:
http://archives.postgresql.org/pgsql-hackers/2009-06/msg01019.php
It looks like this is the bulk loading of data into unindexed tables.
How good is that as a target for optimization? I can see several (quite
difficult to code and maintain) ways to make bulk loading into unindexed
tables faster, but they would not speed up the more general cases.
well bulk loading into unindexed tables is quite a common workload -
apart from dump/restore cycles (which we can now do in parallel) a lot
of analytic workloads are that way.
Import tons of data from various sources every night/week/month, index,
analyze & aggregate, drop again.
and (iirc) also here:
http://archives.postgresql.org/pgsql-hackers/2009-06/msg01133.php
I played around a little with this, parallel bulk loads into an
unindexed, very skinny table. If I hacked XLogInsert so that it did
nothing but take the WALInsertLock, release it, then return a fake
RecPtr, it scaled better but still not very well. So giant leaps in
throughput would need to involve calling XLogInsert less often (or at
least taking the WALInsertLock less often). You could nibble around the
edges by tweaking what happens under the WALInsertLock, but I don't
think that that will get you big wins by doing that for this case. But
again, how important is this case? Are bulk loads into skinny unindexed
tables the best test-bed for improving XLogInsert?
well you can get similar-looking profiles from other workloads (say
pgbench) as well. Pretty sure the archives have examples for those as well.
Stefan
On Sun, Aug 30, 2009 at 11:01 AM, Stefan Kaltenbrunner
<stefan@kaltenbrunner.cc> wrote:
Jeff Janes wrote:
---------- Forwarded message ----------
From: Stefan Kaltenbrunner <stefan@kaltenbrunner.cc>
To: Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>
Date: Sun, 30 Aug 2009 11:48:47 +0200
Subject: Re: LWLock Queue Jumping
Heikki Linnakangas wrote:
I don't have any pointers right now, but WALInsertLock does often show
up as a bottleneck in write-intensive benchmarks.
yeah, I recently ran across that issue while testing concurrent COPY
performance:
http://www.kaltenbrunner.cc/blog/index.php?/archives/27-Benchmarking-8.4-Chapter-2bulk-loading.html
discussed here:
http://archives.postgresql.org/pgsql-hackers/2009-06/msg01019.php
It looks like this is the bulk loading of data into unindexed tables. How
good is that as a target for optimization? I can see several (quite
difficult to code and maintain) ways to make bulk loading into unindexed
tables faster, but they would not speed up the more general cases.
well bulk loading into unindexed tables is quite a common workload - apart
from dump/restore cycles (which we can now do in parallel) a lot of analytic
workloads are that way.
Import tons of data from various sources every night/week/month, index,
analyze & aggregate, drop again.
In those cases where you end by dropping the tables, we should be willing to
bypass WAL altogether, right? Is the problem that we can bypass WAL (by
doing the COPY in the same transaction that created or truncated the table),
or that we can COPY in parallel, but we can't do both simultaneously?
Jeff
Jeff Janes wrote:
On Sun, Aug 30, 2009 at 11:01 AM, Stefan Kaltenbrunner
<stefan@kaltenbrunner.cc> wrote:
Jeff Janes wrote:
---------- Forwarded message ----------
From: Stefan Kaltenbrunner <stefan@kaltenbrunner.cc>
To: Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>
Date: Sun, 30 Aug 2009 11:48:47 +0200
Subject: Re: LWLock Queue Jumping
Heikki Linnakangas wrote:
I don't have any pointers right now, but WALInsertLock does often show
up as a bottleneck in write-intensive benchmarks.
yeah, I recently ran across that issue while testing concurrent COPY
performance:
http://www.kaltenbrunner.cc/blog/index.php?/archives/27-Benchmarking-8.4-Chapter-2bulk-loading.html
discussed here:
http://archives.postgresql.org/pgsql-hackers/2009-06/msg01019.php
It looks like this is the bulk loading of data into unindexed tables.
How good is that as a target for optimization? I can see several (quite
difficult to code and maintain) ways to make bulk loading into unindexed
tables faster, but they would not speed up the more general cases.
well bulk loading into unindexed tables is quite a common workload -
apart from dump/restore cycles (which we can now do in parallel) a
lot of analytic workloads are that way.
Import tons of data from various sources every night/week/month,
index, analyze & aggregate, drop again.
In those cases where you end by dropping the tables, we should be
willing to bypass WAL altogether, right? Is the problem that we can bypass
WAL (by doing the COPY in the same transaction that created or truncated
the table), or that we can COPY in parallel, but we can't do both simultaneously?
well yes, that is part of the problem - if you bulk load into one or a few
tables concurrently, you can only sometimes make use of the WAL bypass
optimization. This is especially interesting if you consider that COPY
alone is more or less CPU-bottlenecked these days, so using multiple
cores makes sense to get higher load rates.
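To spell out the bypass Jeff is referring to: COPY can skip WAL only when the
target table got new storage in the current transaction (CREATE or TRUNCATE)
and WAL archiving is off; durability then comes from fsyncing the heap at the
end of the COPY. Roughly paraphrased (not the exact copy.c source):

#include "postgres.h"
#include "access/xlog.h"
#include "utils/rel.h"

/*
 * Rough paraphrase of the COPY-time decision, not the exact copy.c code.
 */
static bool
copy_can_skip_wal(Relation rel)
{
    bool        new_storage;

    /* table created, or given a fresh relfilenode by TRUNCATE, in this xact */
    new_storage = (rel->rd_createSubid != InvalidSubTransactionId ||
                   rel->rd_newRelfilenodeSubid != InvalidSubTransactionId);

    /* archiving still needs the WAL records, so it disables the bypass */
    return new_storage && !XLogArchivingActive();
}
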
Stefan