WAL Rate Limiting

Started by Simon Riggs · almost 12 years ago · 45 messages
#1 Simon Riggs
simon@2ndQuadrant.com
1 attachment(s)

We've discussed previously the negative impact of large bulk
operations, especially wrt WAL writes. Patch here allows maintenance
operations to have their WAL generation slowed down as a replication
lag prevention feature.

I believe there was originally intended to be some work on I/O rate
limiting, but that hasn't happened and is in some ways orthogonal to
this patch and we will likely eventually want both.

Single new parameter works very similarly to vacuum_cost_delay

wal_rate_limit_delay = Xms

slows down CLUSTER, VACUUM FULL, ALTER TABLE (rewrite & set
tablespace), CREATE INDEX
so basically the same things we optimise WAL for and the same places where
we honour maintenance_work_mem
(discuss: should we add COPY, CTAS etc also?)
(discuss: do we need another parameter to specify "cost"? Currently
patch uses "sleep every 64kB of WAL")

VACUUM is not included, since we already have controls for that -
honouring two controls would be complex and weird.

Uses GetCurrentTransactionWALVolume() patch, which is included within
the patch to make it easier to review as a whole.

Technically, we can't simply wait before/after WAL inserts because
these typically occur while holding buffer locks. So we need to put
the waits at a higher level, notably in safe places that currently do
CHECK_FOR_INTERRUPTS(). Doing that during query execution might make
locking of blocks for nested loops joins much worse.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

wal_rate_limiting.v1.patch (application/octet-stream)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 0f2f2bf..d11a093 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1354,6 +1354,67 @@ include 'filename'
      </variablelist>
     </sect2>
 
+    <sect2 id="runtime-config-resource-wal-rate-limit">
+     <title>WAL Rate Limiting</title>
+
+     <para>
+      During the execution of <command>VACUUM FULL</command>,
+      <command>CLUSTER</command>, <command>ALTER TABLE</command>,
+      <command>CREATE INDEX</command> commands, the system maintains an
+      internal counter that keeps track of the volume of WAL written.
+      When the accumulated WAL volume reaches a multiple of 64kB
+      the process performing
+      the operation will sleep for a short period of time, as specified by
+      <varname>wal_rate_limit_delay</varname>. Then it will reset the
+      counter and continue execution.
+     </para>
+
+     <para>
+      The intent of this feature is to allow administrators to reduce
+      the impact of these commands on concurrent database activity
+      and replication.  There are many situations where it is not
+      important that maintenance commands finish quickly;
+      however, it is usually very important that these
+      commands do not significantly interfere with the ability of the
+      system to perform other database operations. WAL Rate Limiting
+      provides a way for administrators to achieve this.
+     </para>
+
+     <para>
+      This feature is disabled by default.  To enable it, set the
+      <varname>wal_rate_limit_delay</varname> variable to a nonzero
+      value, when <varname>wal_level</varname> is set to
+      <literal>archive</literal> or higher.
+     </para>
+
+     <variablelist>
+      <varlistentry id="guc-wal-rate-limit-delay" xreflabel="wal_rate_limit_delay">
+       <term><varname>wal_rate_limit_delay</varname> (<type>integer</type>)</term>
+       <indexterm>
+        <primary><varname>wal_rate_limit_delay</> configuration parameter</primary>
+       </indexterm>
+       <listitem>
+        <para>
+         The length of time, in milliseconds, that the process will sleep
+         when the cost limit has been exceeded.
+         The default value is zero, which disables the WAL Rate Limiting
+         feature.  Positive values enable WAL Rate Limiting.
+         Note that on many systems, the effective resolution
+         of sleep delays is 10 milliseconds; setting
+         <varname>wal_rate_limit_delay</varname> to a value that is
+         not a multiple of 10 might have the same results as setting it
+         to the next higher multiple of 10.
+        </para>
+
+        <para>
+         When using WAL Rate Limiting, appropriate values for
+         <varname>wal_rate_limit_delay</> are usually quite small, perhaps
+         10 or 20 milliseconds.
+        </para>
+       </listitem>
+      </varlistentry>
+     </variablelist>
+    </sect2>
+
     <sect2 id="runtime-config-resource-vacuum-cost">
      <title>Cost-based Vacuum Delay</title>
 
diff --git a/src/backend/access/gin/gininsert.c b/src/backend/access/gin/gininsert.c
index 5e6f627..61e02db 100644
--- a/src/backend/access/gin/gininsert.c
+++ b/src/backend/access/gin/gininsert.c
@@ -279,6 +279,8 @@ ginBuildCallback(Relation index, HeapTuple htup, Datum *values,
 		{
 			/* there could be many entries, so be willing to abort here */
 			CHECK_FOR_INTERRUPTS();
+			if (RelationNeedsWAL(index))
+				wal_rate_limit_delay_point();
 			ginEntryInsert(&buildstate->ginstate, attnum, key, category,
 						   list, nlist, &buildstate->buildStats);
 		}
diff --git a/src/backend/access/gist/gistbuild.c b/src/backend/access/gist/gistbuild.c
index 1832687..a1e4454 100644
--- a/src/backend/access/gist/gistbuild.c
+++ b/src/backend/access/gist/gistbuild.c
@@ -563,6 +563,9 @@ gistProcessItup(GISTBuildState *buildstate, IndexTuple itup,
 
 	CHECK_FOR_INTERRUPTS();
 
+	if (RelationNeedsWAL(indexrel))
+		wal_rate_limit_delay_point();
+
 	/*
 	 * Loop until we reach a leaf page (level == 0) or a level with buffers
 	 * (not including the level we start at, because we would otherwise make
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index 504eb71..47a425b 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -460,6 +460,9 @@ _bt_buildadd(BTWriteState *wstate, BTPageState *state, IndexTuple itup)
 	 */
 	CHECK_FOR_INTERRUPTS();
 
+	if (wstate->btws_use_wal)
+		wal_rate_limit_delay_point();
+
 	npage = state->btps_page;
 	nblkno = state->btps_blkno;
 	last_off = state->btps_lastoff;
diff --git a/src/backend/access/spgist/spgdoinsert.c b/src/backend/access/spgist/spgdoinsert.c
index 2f6a878..0e71782 100644
--- a/src/backend/access/spgist/spgdoinsert.c
+++ b/src/backend/access/spgist/spgdoinsert.c
@@ -1916,6 +1916,9 @@ spgdoinsert(Relation index, SpGistState *state,
 		 */
 		CHECK_FOR_INTERRUPTS();
 
+		if (RelationNeedsWAL(index))
+			wal_rate_limit_delay_point();
+
 		if (current.blkno == InvalidBlockNumber)
 		{
 			/*
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 0487be1..22adedd 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -183,6 +183,12 @@ static TransactionStateData TopTransactionStateData = {
 };
 
 /*
+ * Track total volume of WAL written by current top-level transaction
+ * to allow tracking, reporting and control of writing WAL.
+ */
+static uint64 currentTransactionWALVolume;
+
+/*
  * unreportedXids holds XIDs of all subtransactions that have not yet been
  * reported in a XLOG_XACT_ASSIGNMENT record.
  */
@@ -397,17 +403,26 @@ GetCurrentTransactionIdIfAny(void)
 }
 
 /*
- *	MarkCurrentTransactionIdLoggedIfAny
+ * ReportTransactionInsertedWAL
  *
- * Remember that the current xid - if it is assigned - now has been wal logged.
+ * Remember that the current xid - if it is assigned - has now inserted WAL
  */
 void
-MarkCurrentTransactionIdLoggedIfAny(void)
+ReportTransactionInsertedWAL(uint32 insertedWALVolume)
 {
+	currentTransactionWALVolume += insertedWALVolume;
 	if (TransactionIdIsValid(CurrentTransactionState->transactionId))
 		CurrentTransactionState->didLogXid = true;
 }
 
+/*
+ * GetCurrentTransactionWALVolume
+ */
+uint64
+GetCurrentTransactionWALVolume(void)
+{
+	return currentTransactionWALVolume;
+}
 
 /*
  *	GetStableLatestTransactionId
@@ -1772,6 +1787,7 @@ StartTransaction(void)
 	/*
 	 * initialize reported xid accounting
 	 */
+	currentTransactionWALVolume = 0;
 	nUnreportedXids = 0;
 	s->didLogXid = false;
 
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index b53ae87..7255aa0 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1192,6 +1192,7 @@ begin:;
 		 * already exactly at the beginning of a segment, so there was no need
 		 * to do anything.
 		 */
+		write_len = 0;
 	}
 
 	/*
@@ -1199,7 +1200,7 @@ begin:;
 	 */
 	WALInsertSlotRelease();
 
-	MarkCurrentTransactionIdLoggedIfAny();
+	ReportTransactionInsertedWAL(write_len);
 
 	END_CRIT_SECTION();
 
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 8eae43d..be1f143 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -2163,6 +2163,8 @@ IndexBuildHeapScan(Relation heapRelation,
 		OldestXmin = GetOldestXmin(heapRelation->rd_rel->relisshared, true);
 	}
 
+	init_wal_rate_limit();
+
 	scan = heap_beginscan_strat(heapRelation,	/* relation */
 								snapshot,		/* snapshot */
 								0,		/* number of keys */
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index e15f042..1098f3b 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -926,6 +926,8 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 						get_namespace_name(RelationGetNamespace(OldHeap)),
 						RelationGetRelationName(OldHeap))));
 
+	init_wal_rate_limit();
+
 	/*
 	 * Scan through the OldHeap, either in OldIndex order or sequentially;
 	 * copy each tuple into the NewHeap, or transiently to the tuplesort
@@ -940,6 +942,9 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 
 		CHECK_FOR_INTERRUPTS();
 
+		if (use_wal)
+			wal_rate_limit_delay_point();
+
 		if (indexScan != NULL)
 		{
 			tuple = index_getnext(indexScan, ForwardScanDirection);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 466d757..7cb101e 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -3821,6 +3821,8 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap, LOCKMODE lockmode)
 				dropped_attrs = lappend_int(dropped_attrs, i);
 		}
 
+		init_wal_rate_limit();
+
 		/*
 		 * Scan through the rows, generating a new row if needed and then
 		 * checking all the constraints.
@@ -3929,6 +3931,9 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap, LOCKMODE lockmode)
 			ResetExprContext(econtext);
 
 			CHECK_FOR_INTERRUPTS();
+
+			if (!(hi_options & HEAP_INSERT_SKIP_WAL))
+				wal_rate_limit_delay_point();
 		}
 
 		MemoryContextSwitchTo(oldCxt);
@@ -9120,11 +9125,16 @@ copy_relation_data(SMgrRelation src, SMgrRelation dst,
 
 	nblocks = smgrnblocks(src, forkNum);
 
+	init_wal_rate_limit();
+
 	for (blkno = 0; blkno < nblocks; blkno++)
 	{
 		/* If we got a cancel signal during the copy of the data, quit */
 		CHECK_FOR_INTERRUPTS();
 
+		if (use_wal)
+			wal_rate_limit_delay_point();
+
 		smgrread(src, forkNum, blkno, buf);
 
 		if (!PageIsVerified(page, blkno))
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 3455a0b..97ded4d 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -60,7 +60,7 @@ int			vacuum_freeze_table_age;
 /* A few variables that don't seem worth passing around as parameters */
 static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
-
+static uint64 prevWALchunks = 0;
 
 /* non-export function prototypes */
 static List *get_rel_oids(Oid relid, const RangeVar *vacrel);
@@ -1293,3 +1293,51 @@ vacuum_delay_point(void)
 		CHECK_FOR_INTERRUPTS();
 	}
 }
+
+/*
+ * WAL Rate Limiting will wait after each "chunk" of WAL.
+ */
+#define WALRateLimitBytesPerChunk	65536
+#define GetCurrentTransactionChunks() \
+	(GetCurrentTransactionWALVolume() / WALRateLimitBytesPerChunk)
+/*
+ * init_wal_rate_limit - initialize for WAL rate limiting
+ */
+void
+init_wal_rate_limit(void)
+{
+	prevWALchunks = GetCurrentTransactionChunks();
+}
+
+/*
+ * wal_rate_limit_delay_point --- check for interrupts and cost-based delay.
+ *
+ * This should be called in each major loop of maintenance command processing,
+ * typically once per page processed when generating large amounts of WAL.
+ *
+ * Listed in miscadmin.h for general use.
+ */
+void
+wal_rate_limit_delay_point(void)
+{
+	uint64 currWALchunks;
+
+	if (WALRateLimitDelay == 0)
+		return;
+
+	/*
+	 * Find out how many chunks of WAL have been written
+	 */
+	currWALchunks = GetCurrentTransactionChunks();
+
+	/* Nap if appropriate */
+	if (currWALchunks > prevWALchunks)
+	{
+		prevWALchunks = currWALchunks;
+
+		pg_usleep(WALRateLimitDelay * 1000L);
+
+		/* Might have gotten an interrupt while sleeping */
+		CHECK_FOR_INTERRUPTS();
+	}
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 63c951e..cf3159e 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -115,6 +115,7 @@ int			VacuumCostPageMiss = 10;
 int			VacuumCostPageDirty = 20;
 int			VacuumCostLimit = 200;
 int			VacuumCostDelay = 0;
+int			WALRateLimitDelay = 0;
 
 int			VacuumPageHit = 0;
 int			VacuumPageMiss = 0;
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 1217098..1c11745 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -544,6 +544,8 @@ const char *const config_group_names[] =
 	gettext_noop("Resource Usage / Disk"),
 	/* RESOURCES_KERNEL */
 	gettext_noop("Resource Usage / Kernel Resources"),
+	/* RESOURCES_WAL_RATE_LIMIT */
+	gettext_noop("Resource Usage / WAL Rate Limiting"),
 	/* RESOURCES_VACUUM_DELAY */
 	gettext_noop("Resource Usage / Cost-Based Vacuum Delay"),
 	/* RESOURCES_BGWRITER */
@@ -1842,6 +1844,16 @@ static struct config_int ConfigureNamesInt[] =
 	},
 
 	{
+		{"wal_rate_limit_delay", PGC_USERSET, RESOURCES_WAL_RATE_LIMIT,
+			gettext_noop("WAL rate limit delay in milliseconds."),
+			NULL,
+			GUC_UNIT_MS
+		},
+		&WALRateLimitDelay,
+		0, 0, 100,
+		NULL, NULL, NULL
+	},
+	{
 		{"autovacuum_vacuum_cost_delay", PGC_SIGHUP, AUTOVACUUM,
 			gettext_noop("Vacuum cost delay in milliseconds, for autovacuum."),
 			NULL,
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 27791cc..9b953e4 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -145,6 +145,10 @@
 					# (change requires restart)
 #shared_preload_libraries = ''		# (change requires restart)
 
+# - WAL Rate Limiting -
+
+#wal_rate_limit_delay = 0		# 0-100 milliseconds
+
 # - Cost-Based Vacuum Delay -
 
 #vacuum_cost_delay = 0			# 0-100 milliseconds
diff --git a/src/include/access/xact.h b/src/include/access/xact.h
index 634f5b2..31ca1ba 100644
--- a/src/include/access/xact.h
+++ b/src/include/access/xact.h
@@ -215,7 +215,8 @@ extern TransactionId GetCurrentTransactionId(void);
 extern TransactionId GetCurrentTransactionIdIfAny(void);
 extern TransactionId GetStableLatestTransactionId(void);
 extern SubTransactionId GetCurrentSubTransactionId(void);
-extern void MarkCurrentTransactionIdLoggedIfAny(void);
+extern void ReportTransactionInsertedWAL(uint32 insertedWALVolume);
+extern uint64 GetCurrentTransactionWALVolume(void);
 extern bool SubTransactionIsActive(SubTransactionId subxid);
 extern CommandId GetCurrentCommandId(bool used);
 extern TimestampTz GetCurrentTransactionStartTimestamp(void);
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index b145a19..444eea8 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -226,6 +226,10 @@ extern bool allowSystemTableMods;
 extern PGDLLIMPORT int work_mem;
 extern PGDLLIMPORT int maintenance_work_mem;
 
+extern void init_wal_rate_limit(void);
+extern void wal_rate_limit_delay_point(void);
+extern int	WALRateLimitDelay;
+
 extern int	VacuumCostPageHit;
 extern int	VacuumCostPageMiss;
 extern int	VacuumCostPageDirty;
diff --git a/src/include/utils/guc_tables.h b/src/include/utils/guc_tables.h
index 47ff880..ded38c9 100644
--- a/src/include/utils/guc_tables.h
+++ b/src/include/utils/guc_tables.h
@@ -61,6 +61,7 @@ enum config_group
 	RESOURCES_MEM,
 	RESOURCES_DISK,
 	RESOURCES_KERNEL,
+	RESOURCES_WAL_RATE_LIMIT,
 	RESOURCES_VACUUM_DELAY,
 	RESOURCES_BGWRITER,
 	RESOURCES_ASYNCHRONOUS,
#2 Craig Ringer
craig@2ndquadrant.com
In reply to: Simon Riggs (#1)
Re: WAL Rate Limiting

On 01/15/2014 10:20 AM, Simon Riggs wrote:

(discuss: do we need another parameter to specify "cost"? Currently
patch uses "sleep every 64kB of WAL")

It might be nicer to express this as "up to n MB of WAL per second", but
otherwise seems reasonable, and it's easy enough to convert WAL MB/s
into a delay by hand.

Initial tests of this show that it does limit index creation rates as
expected, while continuing to allow WAL generation at normal rates for
other concurrent work.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3 Jim Nasby
jim@nasby.net
In reply to: Simon Riggs (#1)
Re: WAL Rate Limiting

On 1/14/14, 8:20 PM, Simon Riggs wrote:

We've discussed previously the negative impact of large bulk
operations, especially wrt WAL writes. Patch here allows maintenance
operations to have their WAL generation slowed down as a replication
lag prevention feature.

I believe there was originally intended to be some work on I/O rate
limiting, but that hasn't happened and is in some ways orthogonal to
this patch and we will likely eventually want both.

Single new parameter works very similarly to vacuum_cost_delay

wal_rate_limit_delay = Xms

From a replication standpoint, I think it would be a lot nicer if this also (or instead) considered the replication delay we were currently experiencing.

Related to that... it would be extremely useful to us if we could similarly rate limit certain backend operations, for the purpose of not overloading the database or replication. That's fairly unrelated except that sometimes you might want to WAL rate limit a backend operation, besides just COPY.
--
Jim C. Nasby, Data Architect jim@nasby.net
512.569.9461 (cell) http://jim.nasby.net


#4 Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: Simon Riggs (#1)
Re: WAL Rate Limiting

On Wed, Jan 15, 2014 at 7:50 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

We've discussed previously the negative impact of large bulk
operations, especially wrt WAL writes. Patch here allows maintenance
operations to have their WAL generation slowed down as a replication
lag prevention feature.

I believe there was originally intended to be some work on I/O rate
limiting, but that hasn't happened and is in some ways orthogonal to
this patch and we will likely eventually want both.

Single new parameter works very similarly to vacuum_cost_delay

wal_rate_limit_delay = Xms

slows down CLUSTER, VACUUM FULL, ALTER TABLE (rewrite & set
tablespace), CREATE INDEX
so basically same things we optimise WAL for and the same places where
we honour maintenance_work_mem
(discuss: should we add COPY, CTAS etc also?)
(discuss: do we need another parameter to specify "cost"? Currently
patch uses "sleep every 64kB of WAL")

VACUUM is not included, since we already have controls for that -
honouring two controls would be complex and weird.

I wonder if we should replace vacuum_cost/delay with, say,
maintenance_cost/delay and use it in all maintenance activities including
what you just listed out above. While the exact semantics of vacuum and
other maintenance activities may differ, ISTM the final goal is just the
same, i.e. reduce load on the master.

Thanks,
Pavan

--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee

#5 Magnus Hagander
magnus@hagander.net
In reply to: Simon Riggs (#1)
Re: WAL Rate Limiting

On Wed, Jan 15, 2014 at 3:20 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

We've discussed previously the negative impact of large bulk
operations, especially wrt WAL writes. Patch here allows maintenance
operations to have their WAL generation slowed down as a replication
lag prevention feature.

I believe there was originally intended to be some work on I/O rate
limiting, but that hasn't happened and is in some ways orthogonal to
this patch and we will likely eventually want both.

Single new parameter works very similarly to vacuum_cost_delay

wal_rate_limit_delay = Xms

Seems like a really bad name if we are only slowing down some commands -
that seems to indicate we're slowing down all of them. I think it should be
something that indicates that it only affects the maintenance commands.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#6 Robert Haas
robertmhaas@gmail.com
In reply to: Magnus Hagander (#5)
Re: WAL Rate Limiting

On Wed, Jan 15, 2014 at 7:54 AM, Magnus Hagander <magnus@hagander.net> wrote:

On Wed, Jan 15, 2014 at 3:20 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

We've discussed previously the negative impact of large bulk
operations, especially wrt WAL writes. Patch here allows maintenance
operations to have their WAL generation slowed down as a replication
lag prevention feature.

I believe there was originally intended to be some work on I/O rate
limiting, but that hasn't happened and is in some ways orthogonal to
this patch and we will likely eventually want both.

Single new parameter works very similarly to vacuum_cost_delay

wal_rate_limit_delay = Xms

Seems like a really bad name if we are only slowing down some commands -
that seems to indicate we're slowing down all of them. I think it should be
something that indicates that it only affects the maintenance commands.

And why should it only affect the maintenance commands anyway, and who
decides what's a maintenance command?

I thought Heroku suggested something like this previously, and their
use case was something along the lines of "we need to slow the system
down enough to do a backup so we can delete some stuff before the disk
fills". For that, it seems likely to me that you would just want to
slow everything down.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#7 Andres Freund
andres@2ndquadrant.com
In reply to: Robert Haas (#6)
Re: WAL Rate Limiting

On 2014-01-16 09:06:30 -0500, Robert Haas wrote:

Seems like a really bad name if we are only slowing down some commands -
that seems to indicate we're slowing down all of them. I think it should be
something that indicates that it only affects the maintenance commands.

And why should it only affect the maintenance commands anyway, and who
decides what's a maintenance command?

I think implementing it for everything might have some use, but it's a
much, much more complex task. You can't simply do rate limiting in
XLogInsert() or somesuch - we're holding page locks, buffer pins, other
locks... I don't see why that needs to be done in the same feature.

I don't really see much difficulty in determining what's a utility
command and what not for the purpose of this? All utility commands which
create WAL in O(table_size) or worse.

I thought Heroku suggested something like this previously, and their
use case was something along the lines of "we need to slow the system
down enough to do a backup so we can delete some stuff before the disk
fills". For that, it seems likely to me that you would just want to
slow everything down.

I think the usecase for this is more along the lines of not slowing normal
operations or causing replication delays to standbys unduly, while
performing maintenance operations on relations.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#8 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#7)
Re: WAL Rate Limiting

Andres Freund <andres@2ndquadrant.com> writes:

I don't really see much difficulty in determining what's a utility
command and what not for the purpose of this? All utility commands which
create WAL in O(table_size) or worse.

By that definition, INSERT, UPDATE, and DELETE can all be "utility
commands". So would a full-table-scan SELECT, if it's unfortunate
enough to run into a lot of hint-setting or HOT-pruning work.

I think possibly a more productive approach to this would be to treat
it as a session-level GUC setting, rather than hard-wiring it to affect
certain commands and not others.

regards, tom lane


#9 Andres Freund
andres@2ndquadrant.com
In reply to: Tom Lane (#8)
Re: WAL Rate Limiting

On 2014-01-16 10:35:20 -0500, Tom Lane wrote:

Andres Freund <andres@2ndquadrant.com> writes:

I don't really see much difficulty in determining what's a utility
command and what not for the purpose of this? All utility commands which
create WAL in O(table_size) or worse.

By that definition, INSERT, UPDATE, and DELETE can all be "utility
commands". So would a full-table-scan SELECT, if it's unfortunate
enough to run into a lot of hint-setting or HOT-pruning work.

Well, I said *utility* command. So everything processed by
ProcessUtility() generating WAL like that.

I think possibly a more productive approach to this would be to treat
it as a session-level GUC setting, rather than hard-wiring it to affect
certain commands and not others.

Do you see a reasonable way to implement this generically for all
commands? I don't.
We shouldn't let the need for generic resource control stop us from
providing some form of resource control for a consistent subset of
commands.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#10 Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Andres Freund (#9)
Re: WAL Rate Limiting

On 01/16/2014 05:39 PM, Andres Freund wrote:

On 2014-01-16 10:35:20 -0500, Tom Lane wrote:

I think possibly a more productive approach to this would be to treat
it as a session-level GUC setting, rather than hard-wiring it to affect
certain commands and not others.

Do you see a reasonable way to implement this generically for all
commands? I don't.

In suitable safe places, check if you've written too much WAL, and sleep
if so. Call it CHECK_FOR_WAL_BUDGET(), along the lines of
CHECK_FOR_INTERRUPTS(), but called less frequently. Or maybe
vacuum_delay_point() is a better analogy. Hopefully you don't need to
sprinkle them in too many places to be useful.

- Heikki


#11 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#10)
Re: WAL Rate Limiting

Heikki Linnakangas <hlinnakangas@vmware.com> writes:

On 01/16/2014 05:39 PM, Andres Freund wrote:

Do you see a reasonable way to implement this generically for all
commands? I don't.

In suitable safe places, check if you've written too much WAL, and sleep
if so. Call it CHECK_FOR_WAL_BUDGET(), along the lines of
CHECK_FOR_INTERRUPTS(), but called less frequently. Or maybe
vacuum_delay_point() is a better analogy. Hopefully you don't need to
sprinkle them in too many places to be useful.

We probably don't really need to implement it for "all" commands; I think
everyone can agree that, say, ALTER TABLE RENAME COLUMN isn't going to
emit enough WAL to really matter. My point was that we should try to hit
everything that potentially *could* generate lots of WAL, rather than
arbitrarily deciding that some are utility commands and some are not.

For INSERT/UPDATE/DELETE, likely you could do this with a single
CHECK_FOR_WAL_BUDGET() added at a strategic spot in nodeModifyTable.c.

regards, tom lane


#12 Andres Freund
andres@2ndquadrant.com
In reply to: Heikki Linnakangas (#10)
Re: WAL Rate Limiting

On 2014-01-16 17:47:51 +0200, Heikki Linnakangas wrote:

On 01/16/2014 05:39 PM, Andres Freund wrote:

On 2014-01-16 10:35:20 -0500, Tom Lane wrote:

I think possibly a more productive approach to this would be to treat
it as a session-level GUC setting, rather than hard-wiring it to affect
certain commands and not others.

Do you see a reasonable way to implement this generically for all
commands? I don't.

In suitable safe places, check if you've written too much WAL, and sleep if
so. Call it CHECK_FOR_WAL_BUDGET(), along the lines of
CHECK_FOR_INTERRUPTS(), but called less frequently. Or maybe
vacuum_delay_point() is a better analogy. Hopefully you don't need to
sprinkle them in too many places to be useful.

That'd be most places doing XLogInsert(), if you want to be
consistent. Each would need to be analyzed to see whether we're blocking
important resources, and to determine where in the callstack we can do the
CHECK_FOR_WAL_BUDGET().

I don't think the added power of allowing bulk DML to be metered is
worth the increased implementation cost.

I think the usecases that would want this for DML probably also want this
to work for unlogged, temp tables. That'd require a much more generic
resource control framework.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#13 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#12)
Re: WAL Rate Limiting

Andres Freund <andres@2ndquadrant.com> writes:

On 2014-01-16 17:47:51 +0200, Heikki Linnakangas wrote:

In suitable safe places, check if you've written too much WAL, and sleep if
so.

That'd be most places doing XLogInsert() if you want to be
consistent.

See my other response. There's no need for "consistency", only to be sure
that code paths that might generate a *lot* of WAL include such checks.
We've gotten along fine without any formal methodology for where to put
CHECK_FOR_INTERRUPTS() or vacuum_delay_point() calls, and this seems like
just more of the same.

I think the usecases that would want this for DML probably also want this
to work for unlogged, temp tables.

Huh? Unlogged tables generate *zero* WAL, by definition.

If your point is that WAL creation isn't a particularly good resource
consumption metric, that's an argument worth considering, but it seems
quite orthogonal to whether we can implement such throttling reasonably.
And in any case, you didn't provide such an argument.

regards, tom lane


#14 Simon Riggs
simon@2ndQuadrant.com
In reply to: Andres Freund (#12)
Re: WAL Rate Limiting

On 16 January 2014 16:56, Andres Freund <andres@2ndquadrant.com> wrote:

On 2014-01-16 17:47:51 +0200, Heikki Linnakangas wrote:

On 01/16/2014 05:39 PM, Andres Freund wrote:

On 2014-01-16 10:35:20 -0500, Tom Lane wrote:

I think possibly a more productive approach to this would be to treat
it as a session-level GUC setting, rather than hard-wiring it to affect
certain commands and not others.

Do you see a reasonable way to implement this generically for all
commands? I don't.

In suitable safe places, check if you've written too much WAL, and sleep if
so. Call it CHECK_FOR_WAL_BUDGET(), along the lines of
CHECK_FOR_INTERRUPTS(), but called less frequently. Or maybe
vacuum_delay_point() is a better analogy. Hopefully you don't need to
sprinkle them in too many places to be useful.

That'd be most places doing XLogInsert() if you want to be
consistent. Each would need to be analyzed for whether we're blocking
important resources, and for where in the call stack we can do the
CHECK_FOR_WAL_BUDGET().

I don't think the added power of allowing bulk DML to be metered is
worth the increased implementation cost.

I think the use cases that would want this for DML probably also want this
to work for unlogged and temp tables. That'd require a much more generic
resource control framework.

Thank you everyone for responding so positively to this proposal. It
is something many users have complained about.

In time, I think we might want both WAL Rate Limiting and I/O Rate
Limiting. IMHO I/O rate limiting is harder and so this proposal is
restricted solely to WAL rate limiting.

I'm comfortable with a session level parameter. I was mulling over a
wal_rate_limit_scope parameter but I think that is too much.
I will follow Craig's proposal to define this in terms of MB/s, adding
that I would calculate from the beginning of the statement to now, so
it is an averaged rate. A setting of zero means unlimited. The rate
would be per-session (in this release) rather than a globally
calculated setting.

I would like to call it CHECK_FOR_WAL_RATE_LIMIT()

That seems like a cheap enough call to allow it to be added in more
places, so my expanded list of commands touched are
CLUSTER/VACUUM FULL
ALTER TABLE (only in phase 3 table rewrite)
ALTER TABLE SET TABLESPACE
CREATE INDEX
which are already there, plus new ones suggested/implied
COPY
CREATE TABLE AS SELECT
INSERT/UPDATE/DELETE

Please let me know if I missed something (rather than debating what is
or is not a "maintenance" command).

Any other thoughts before I cut v2 ?
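[For illustration, the averaged-rate scheme described above - sleep only as long as needed to bring the statement's average WAL rate back under the limit, with zero meaning unlimited - could be sketched roughly as follows. The function and parameter names here are invented for the example and are not from the patch:]

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sketch of an averaged-rate calculation: given the WAL
 * bytes written since statement start, the elapsed time, and a limit in
 * bytes/sec, compute how long (in microseconds) the backend would need
 * to sleep so that the average rate from statement start does not
 * exceed the limit.  The actual v2 patch instead sleeps a fixed delay
 * per 64kB chunk of WAL.
 */
static int64_t
wal_rate_sleep_usecs(uint64_t wal_bytes, int64_t elapsed_usecs,
					 uint64_t limit_bytes_per_sec)
{
	int64_t		target_usecs;

	if (limit_bytes_per_sec == 0)	/* zero means unlimited */
		return 0;

	/* time the statement *should* have taken at the configured rate */
	target_usecs = (int64_t) (wal_bytes * 1000000 / limit_bytes_per_sec);

	if (target_usecs <= elapsed_usecs)
		return 0;				/* already at or under the limit */

	return target_usecs - elapsed_usecs;
}
```

[With a 2 MB/s limit, a statement that has written 4 MB in its first second would sleep about one more second before continuing.]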

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#15Andres Freund
andres@2ndquadrant.com
In reply to: Tom Lane (#13)
Re: WAL Rate Limiting

On 2014-01-16 11:19:28 -0500, Tom Lane wrote:

I think the use cases that would want this for DML probably also want this
to work for unlogged and temp tables.

Huh? Unlogged tables generate *zero* WAL, by definition.

Yes. That's my point. If we provide it as a generic resource control -
which, it sounds to me, is what's being discussed here - it should be generic.

If we provide it as a measure to prevent standbys from getting out of date
due to maintenance commands, then it only needs to cover those.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#16Simon Riggs
simon@2ndQuadrant.com
In reply to: Andres Freund (#15)
Re: WAL Rate Limiting

On 16 January 2014 17:22, Andres Freund <andres@2ndquadrant.com> wrote:

On 2014-01-16 11:19:28 -0500, Tom Lane wrote:

I think the use cases that would want this for DML probably also want this
to work for unlogged and temp tables.

Huh? Unlogged tables generate *zero* WAL, by definition.

Yes. That's my point. If we provide it as a generic resource control -
which, it sounds to me, is what's being discussed here - it should be generic.

If we provide it as a measure to prevent standbys from getting out of date
due to maintenance commands, then it only needs to cover those.

Agreed, but it won't happen in this release. I/O resource control to
follow in later releases.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#17Andres Freund
andres@2ndquadrant.com
In reply to: Simon Riggs (#14)
Re: WAL Rate Limiting

On 2014-01-16 17:20:19 +0100, Simon Riggs wrote:

I'm comfortable with a session level parameter. I was mulling over a
wal_rate_limit_scope parameter but I think that is too much.
I will follow Craig's proposal to define this in terms of MB/s, adding
that I would calculate from the beginning of the statement to now, so
it is an averaged rate. A setting of zero means unlimited. The rate
would be per-session (in this release) rather than a globally
calculated setting.

I would like to call it CHECK_FOR_WAL_RATE_LIMIT()

That seems like a cheap enough call to allow it to be added in more
places, so my expanded list of commands touched are
CLUSTER/VACUUM FULL
ALTER TABLE (only in phase 3 table rewrite)
ALTER TABLE SET TABLESPACE
CREATE INDEX
which are already there, plus new ones suggested/implied
COPY
CREATE TABLE AS SELECT
INSERT/UPDATE/DELETE

Please let me know if I missed something (rather than debating what is
or is not a "maintenance" command).

If we're going to do this for DML - which I am far from convinced of -
we should also do it for SELECT, since that can generate significant
amounts of WAL with checksums turned on.
Otherwise stuff like INSERT ... SELECT, UPDATE FROM et al. will behave
very confusingly, since suddenly they will be throttled not only on the
WAL generated by the INSERT but also on that generated by the SELECT.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#18Simon Riggs
simon@2ndQuadrant.com
In reply to: Andres Freund (#17)
Re: WAL Rate Limiting

On 16 January 2014 17:29, Andres Freund <andres@2ndquadrant.com> wrote:

Please let me know if I missed something (rather than debating what is
or is not a "maintenance" command).

If we're going to do this for DML - which I am far from convinced of -
we should also do it for SELECT, since that can generate significant
amounts of WAL with checksums turned on.
Otherwise stuff like INSERT ... SELECT, UPDATE FROM et al. will behave
very confusingly, since suddenly they will be throttled not only on the
WAL generated by the INSERT but also on that generated by the SELECT.

Good point, but rather happily I can say "thought of that" and refer
you to the other patch which limits SELECT's ability to dirty pages,
and thus, with checksums enabled, will limit the generation of WAL.
(And no, they aren't the same thing).

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#16)
Re: WAL Rate Limiting

Simon Riggs <simon@2ndQuadrant.com> writes:

Agreed, but it won't happen in this release. I/O resource control to
follow in later releases.

"This release"? TBH, I think this patch has arrived too late to be
considered for 9.4. It's a fairly major feature and clearly not
without controversy, so I don't think that posting it for the first
time the day before CF4 starts meets the guidelines we all agreed
to back in Ottawa.

regards, tom lane


#20Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#19)
Re: WAL Rate Limiting

On 16 January 2014 17:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Simon Riggs <simon@2ndQuadrant.com> writes:

Agreed, but it won't happen in this release. I/O resource control to
follow in later releases.

"This release"? TBH, I think this patch has arrived too late to be
considered for 9.4. It's a fairly major feature and clearly not
without controversy, so I don't think that posting it for the first
time the day before CF4 starts meets the guidelines we all agreed
to back in Ottawa.

I think it's majorly useful, but there really isn't much to the
implementation, and I expect it's beneath the level of being defined as
a major feature. (I think it may have taken less time to write than
this current conversation has lasted).

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#21Jeff Janes
jeff.janes@gmail.com
In reply to: Tom Lane (#13)
Re: WAL Rate Limiting

On Thu, Jan 16, 2014 at 8:19 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I think the use cases that would want this for DML probably also want this
to work for unlogged and temp tables.

Huh? Unlogged tables generate *zero* WAL, by definition.

Transactions that only change unlogged tables still generate commit records
to WAL.

I don't think that amount of WAL is particularly relevant to this
discussion, but I was recently surprised by it, so wanted to publicize it.
(It was causing a lot of churn in my WAL due to interaction with
archive_timeout)

Cheers,

Jeff

#22Peter Eisentraut
peter_e@gmx.net
In reply to: Simon Riggs (#1)
Re: WAL Rate Limiting

The documentation doesn't build:

openjade:config.sgml:1416:11:E: end tag for "VARIABLELIST" omitted, but OMITTAG NO was specified
openjade:config.sgml:1390:5: start tag was here


#23Craig Ringer
craig@2ndquadrant.com
In reply to: Tom Lane (#11)
Re: WAL Rate Limiting

On 01/16/2014 11:52 PM, Tom Lane wrote:

Heikki Linnakangas <hlinnakangas@vmware.com> writes:

On 01/16/2014 05:39 PM, Andres Freund wrote:

Do you see a reasonable way to implement this generically for all
commands? I don't.

In suitable safe places, check if you've written too much WAL, and sleep
if so. Call it CHECK_FOR_WAL_BUDGET(), along the lines of
CHECK_FOR_INTERRUPTS(), but called less frequently. Or maybe
vacuum_delay_point() is a better analogy. Hopefully you don't need to
sprinkle them in too many places to be useful.

We probably don't really need to implement it for "all" commands; I think
everyone can agree that, say, ALTER TABLE RENAME COLUMN isn't going to
emit enough WAL to really matter. My point was that we should try to hit
everything that potentially *could* generate lots of WAL, rather than
arbitrarily deciding that some are utility commands and some are not.

Then you land up needing a configuration mechanism to control *which*
commands get affected, too, to handle the original use case of "I want
to rate limit vacuum and index rebuilds, while everything else runs
full tilt".

Or do you propose to just do this per-session? If so, what about autovacuum?

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#24Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#12)
Re: WAL Rate Limiting

On Thu, Jan 16, 2014 at 10:56 AM, Andres Freund <andres@2ndquadrant.com> wrote:

That'd be most places doing XLogInsert() if you want to be
consistent. Each would need to be analyzed for whether we're blocking
important resources, and for where in the call stack we can do the
CHECK_FOR_WAL_BUDGET().

And doing that probably wouldn't be a problem except for the fact that
this patch is showing up at the absolute very last minute with
multiple unresolved design considerations. I'm with Tom: this should
be postponed to 9.5.

That having been said, I bet it could be done at the tail of
XLogInsert(). If there are cases where that's not desirable, then add
macros HOLDOFF_WAL_THROTTLING() and RESUME_WAL_THROTTLING() that bump
a counter up and down. When the counter is >0, XLogInsert() doesn't
sleep; when RESUME_WAL_THROTTLING() drops the counter to 0, it also
considers sleeping. I suspect only a few places would need to do
this, like where we're holding one of the SLRU locks. Some testing
would be needed to figure out exactly which places cause problems, but
that's why it's a good idea to start discussion of your proposed
feature before January 14th.
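[A rough sketch of that holdoff-counter scheme, modelled on the existing HOLD_INTERRUPTS()/RESUME_INTERRUPTS() pattern - the macro names come from the message above, but the check_wal_budget() hook and the counter layout here are invented for illustration:]

```c
#include <assert.h>

/*
 * Sketch of WAL-throttling holdoff counters.  While the counter is > 0,
 * the tail of a hypothetical XLogInsert() skips the budget check; when
 * RESUME_WAL_THROTTLING() drops it back to 0, the check runs then.
 */
static volatile int WalThrottleHoldoffCount = 0;
static int wal_budget_checks = 0;	/* stands in for the real sleep logic */

static void
check_wal_budget(void)
{
	wal_budget_checks++;			/* real code would sleep if over budget */
}

#define HOLDOFF_WAL_THROTTLING()  (WalThrottleHoldoffCount++)
#define RESUME_WAL_THROTTLING() \
	do { \
		if (--WalThrottleHoldoffCount == 0) \
			check_wal_budget(); \
	} while (0)

/* Called at the tail of a hypothetical XLogInsert() */
static void
xlog_insert_tail(void)
{
	if (WalThrottleHoldoffCount == 0)
		check_wal_budget();
}
```

[Code paths holding sensitive locks would wrap their WAL inserts in HOLDOFF_WAL_THROTTLING()/RESUME_WAL_THROTTLING(), deferring any sleep until the lock is released.]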

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#25Andres Freund
andres@2ndquadrant.com
In reply to: Robert Haas (#24)
Re: WAL Rate Limiting

On 2014-01-17 09:04:54 -0500, Robert Haas wrote:

That having been said, I bet it could be done at the tail of
XLogInsert(). If there are cases where that's not desirable, then add
macros HOLDOFF_WAL_THROTTLING() and RESUME_WAL_THROTTLING() that bump
a counter up and down. When the counter is >0, XLogInsert() doesn't
sleep; when RESUME_WAL_THROTTLING() drops the counter to 0, it also
considers sleeping. I suspect only a few places would need to do
this, like where we're holding one of the SLRU locks.

I don't think there are many locations where this would be ok. Sleeping
while holding exclusive buffer locks? Quite possibly inside a criticial
section?
Surely not.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#25)
Re: WAL Rate Limiting

Andres Freund <andres@2ndquadrant.com> writes:

On 2014-01-17 09:04:54 -0500, Robert Haas wrote:

That having been said, I bet it could be done at the tail of
XLogInsert().

I don't think there are many locations where this would be ok. Sleeping
while holding exclusive buffer locks? Quite possibly inside a critical
section?

More or less by definition, you're always doing both when you call
XLogInsert.

Surely not.

I agree. It's got to be somewhere further up the call stack.

I'm inclined to think that what we ought to do is reconceptualize
vacuum_delay_point() as something a bit more generic, and sprinkle
calls to it in a few more places than now.

It's also interesting to wonder about the relationship to
CHECK_FOR_INTERRUPTS --- although I think that currently, we assume
that that's *cheap* (1 test and branch) as long as nothing is pending.
I don't want to see a bunch of arithmetic added to it.
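[A generalized delay point of that sort could keep its hot path to plain integer arithmetic on counters the backend already maintains, sleeping only when a new chunk boundary is crossed - a hypothetical sketch, with all names invented (the v2 patch's CHECK_FOR_WAL_BUDGET() works along these lines):]

```c
#include <assert.h>
#include <stdint.h>

#define WAL_CHUNK_BYTES 65536

static uint64_t txn_wal_volume = 0;	/* bytes of WAL this transaction */
static uint64_t prev_wal_chunks = 0;
static int	sleeps_taken = 0;		/* stands in for pg_usleep() */

/*
 * Cheap delay point: nothing but a division and a comparison unless the
 * transaction has crossed into a new 64kB chunk of WAL since last time.
 */
static void
wal_delay_point(void)
{
	uint64_t	chunks = txn_wal_volume / WAL_CHUNK_BYTES;

	if (chunks > prev_wal_chunks)
	{
		prev_wal_chunks = chunks;
		sleeps_taken++;				/* real code would pg_usleep() here */
	}
}
```

[Callers sprinkle wal_delay_point() into the major loops of bulk operations, the same way vacuum_delay_point() is used today, keeping CHECK_FOR_INTERRUPTS() itself untouched.]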

regards, tom lane


#27Simon Riggs
simon@2ndQuadrant.com
In reply to: Simon Riggs (#14)
1 attachment(s)
Re: WAL Rate Limiting

On 16 January 2014 17:20, Simon Riggs <simon@2ndquadrant.com> wrote:

Thank you everyone for responding so positively to this proposal. It
is something many users have complained about.

In time, I think we might want both WAL Rate Limiting and I/O Rate
Limiting. IMHO I/O rate limiting is harder and so this proposal is
restricted solely to WAL rate limiting.

I'm comfortable with a session level parameter. I was mulling over a
wal_rate_limit_scope parameter but I think that is too much.
I will follow Craig's proposal to define this in terms of MB/s, adding
that I would calculate from the beginning of the statement to now, so
it is an averaged rate. A setting of zero means unlimited. The rate
would be per-session (in this release) rather than a globally
calculated setting.

Parameter renamed to wal_rate_limit.

Not yet modified to produce this in MB/s

I would like to call it CHECK_FOR_WAL_RATE_LIMIT()

Used CHECK_WAL_BUDGET() as proposed

That seems like a cheap enough call to allow it to be added in more
places, so my expanded list of commands touched are
CLUSTER/VACUUM FULL
ALTER TABLE (only in phase 3 table rewrite)
ALTER TABLE SET TABLESPACE
CREATE INDEX
which are already there, plus new ones suggested/implied
COPY
CREATE TABLE AS SELECT
INSERT/UPDATE/DELETE

Added, no problems or technical difficulties encountered.

Please let me know if I missed something (rather than debating what is
or is not a "maintenance" command).

Any other thoughts before I cut v2 ?

v2 attached

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

wal_rate_limiting.v2.patchapplication/octet-stream; name=wal_rate_limiting.v2.patchDownload
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 0f2f2bf..aa32cfa 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1354,6 +1354,68 @@ include 'filename'
      </variablelist>
     </sect2>
 
+    <sect2 id="runtime-config-resource-wal-rate-limit">
+     <title>WAL Rate Limiting</title>
+
+     <para>
+      During the execution of commands that may produce large volumes of WAL,
+      such as <command>INSERT SELECT</command>, <command>COPY</command>,
+      <command>CLUSTER</command>, <command>ALTER TABLE</command>,
+      and <command>CREATE INDEX</command>, the system maintains an
+      internal counter that keeps track of the volume of WAL written.
+      When the accumulated WAL volume reaches a multiple of 64kB
+      the process performing
+      the operation will sleep for a short period of time, as specified by
+      <varname>wal_rate_limit</varname>. Then it will reset the
+      counter and continue execution.
+     </para>
+
+     <para>
+      The intent of this feature is to allow administrators to reduce
+      the impact of these commands on concurrent database activity
+      and replication.  There are many situations where it is not
+      important that bulk write commands finish quickly;
+      however, it is usually very important that these
+      commands do not significantly interfere with the ability of the
+      system to perform other database operations. WAL Rate Limiting
+      provides a way for administrators to achieve this.
+     </para>
+
+     <para>
+      This feature is disabled by default.  To enable it, set the
+      <varname>wal_rate_limit</varname> variable to a nonzero
+      value when <varname>wal_level</varname> is set to
+      <literal>archive</literal> or higher.
+     </para>
+
+     <variablelist>
+      <varlistentry id="guc-wal-rate-limit" xreflabel="wal_rate_limit">
+       <term><varname>wal_rate_limit</varname> (<type>integer</type>)</term>
+       <indexterm>
+        <primary><varname>wal_rate_limit</> configuration parameter</primary>
+       </indexterm>
+       <listitem>
+        <para>
+         The length of time, in milliseconds, that the process will sleep
+         after each 64kB of WAL is written.
+         The default value is zero, which disables the WAL Rate Limiting
+         feature.  Positive values enable WAL Rate Limiting.
+         Note that on many systems, the effective resolution
+         of sleep delays is 10 milliseconds; setting
+         <varname>wal_rate_limit</varname> to a value that is
+         not a multiple of 10 might have the same results as setting it
+         to the next higher multiple of 10.
+        </para>
+
+        <para>
+         When using WAL Rate Limiting, appropriate values for
+         <varname>wal_rate_limit</> are usually quite small, perhaps
+         10 or 20 milliseconds.
+        </para>
+       </listitem>
+      </varlistentry>
+     </variablelist>
+    </sect2>
+
     <sect2 id="runtime-config-resource-vacuum-cost">
      <title>Cost-based Vacuum Delay</title>
 
diff --git a/src/backend/access/gin/gininsert.c b/src/backend/access/gin/gininsert.c
index 5e6f627..29dab6d 100644
--- a/src/backend/access/gin/gininsert.c
+++ b/src/backend/access/gin/gininsert.c
@@ -279,6 +279,8 @@ ginBuildCallback(Relation index, HeapTuple htup, Datum *values,
 		{
 			/* there could be many entries, so be willing to abort here */
 			CHECK_FOR_INTERRUPTS();
+			if (RelationNeedsWAL(index))
+				CHECK_FOR_WAL_BUDGET();
 			ginEntryInsert(&buildstate->ginstate, attnum, key, category,
 						   list, nlist, &buildstate->buildStats);
 		}
diff --git a/src/backend/access/gist/gistbuild.c b/src/backend/access/gist/gistbuild.c
index 1832687..0825c59 100644
--- a/src/backend/access/gist/gistbuild.c
+++ b/src/backend/access/gist/gistbuild.c
@@ -563,6 +563,9 @@ gistProcessItup(GISTBuildState *buildstate, IndexTuple itup,
 
 	CHECK_FOR_INTERRUPTS();
 
+	if (RelationNeedsWAL(indexrel))
+		CHECK_FOR_WAL_BUDGET();
+
 	/*
 	 * Loop until we reach a leaf page (level == 0) or a level with buffers
 	 * (not including the level we start at, because we would otherwise make
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index 504eb71..b5c265e 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -460,6 +460,9 @@ _bt_buildadd(BTWriteState *wstate, BTPageState *state, IndexTuple itup)
 	 */
 	CHECK_FOR_INTERRUPTS();
 
+	if (wstate->btws_use_wal)
+		CHECK_FOR_WAL_BUDGET();
+
 	npage = state->btps_page;
 	nblkno = state->btps_blkno;
 	last_off = state->btps_lastoff;
diff --git a/src/backend/access/spgist/spgdoinsert.c b/src/backend/access/spgist/spgdoinsert.c
index 2f6a878..a746329 100644
--- a/src/backend/access/spgist/spgdoinsert.c
+++ b/src/backend/access/spgist/spgdoinsert.c
@@ -1916,6 +1916,9 @@ spgdoinsert(Relation index, SpGistState *state,
 		 */
 		CHECK_FOR_INTERRUPTS();
 
+		if (RelationNeedsWAL(index))
+			CHECK_FOR_WAL_BUDGET();
+
 		if (current.blkno == InvalidBlockNumber)
 		{
 			/*
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 0487be1..22adedd 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -183,6 +183,12 @@ static TransactionStateData TopTransactionStateData = {
 };
 
 /*
+ * Track total volume of WAL written by current top-level transaction
+ * to allow tracking, reporting and control of writing WAL.
+ */
+static uint64 currentTransactionWALVolume;
+
+/*
  * unreportedXids holds XIDs of all subtransactions that have not yet been
  * reported in a XLOG_XACT_ASSIGNMENT record.
  */
@@ -397,17 +403,26 @@ GetCurrentTransactionIdIfAny(void)
 }
 
 /*
- *	MarkCurrentTransactionIdLoggedIfAny
+ * ReportTransactionInsertedWAL
  *
- * Remember that the current xid - if it is assigned - now has been wal logged.
+ * Remember that the current xid - if it is assigned - has now inserted WAL
  */
 void
-MarkCurrentTransactionIdLoggedIfAny(void)
+ReportTransactionInsertedWAL(uint32 insertedWALVolume)
 {
+	currentTransactionWALVolume += insertedWALVolume;
 	if (TransactionIdIsValid(CurrentTransactionState->transactionId))
 		CurrentTransactionState->didLogXid = true;
 }
 
+/*
+ * GetCurrentTransactionWALVolume
+ */
+uint64
+GetCurrentTransactionWALVolume(void)
+{
+	return currentTransactionWALVolume;
+}
 
 /*
  *	GetStableLatestTransactionId
@@ -1772,6 +1787,7 @@ StartTransaction(void)
 	/*
 	 * initialize reported xid accounting
 	 */
+	currentTransactionWALVolume = 0;
 	nUnreportedXids = 0;
 	s->didLogXid = false;
 
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 8679d0a..21eecfc 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1192,6 +1192,7 @@ begin:;
 		 * already exactly at the beginning of a segment, so there was no need
 		 * to do anything.
 		 */
+		write_len = 0;
 	}
 
 	/*
@@ -1199,7 +1200,7 @@ begin:;
 	 */
 	WALInsertSlotRelease();
 
-	MarkCurrentTransactionIdLoggedIfAny();
+	ReportTransactionInsertedWAL(write_len);
 
 	END_CRIT_SECTION();
 
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 8eae43d..be1f143 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -2163,6 +2163,8 @@ IndexBuildHeapScan(Relation heapRelation,
 		OldestXmin = GetOldestXmin(heapRelation->rd_rel->relisshared, true);
 	}
 
+	init_wal_rate_limit();
+
 	scan = heap_beginscan_strat(heapRelation,	/* relation */
 								snapshot,		/* snapshot */
 								0,		/* number of keys */
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index e15f042..9860633 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -926,6 +926,8 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 						get_namespace_name(RelationGetNamespace(OldHeap)),
 						RelationGetRelationName(OldHeap))));
 
+	init_wal_rate_limit();
+
 	/*
 	 * Scan through the OldHeap, either in OldIndex order or sequentially;
 	 * copy each tuple into the NewHeap, or transiently to the tuplesort
@@ -940,6 +942,9 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 
 		CHECK_FOR_INTERRUPTS();
 
+		if (use_wal)
+			CHECK_FOR_WAL_BUDGET();
+
 		if (indexScan != NULL)
 		{
 			tuple = index_getnext(indexScan, ForwardScanDirection);
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index f91f164..7ad2a71 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -2195,6 +2195,9 @@ CopyFrom(CopyState cstate)
 
 		CHECK_FOR_INTERRUPTS();
 
+		if (!(hi_options & HEAP_INSERT_SKIP_WAL))
+			CHECK_FOR_WAL_BUDGET();
+
 		if (nBufferedTuples == 0)
 		{
 			/*
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 73e6e20..9256649 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -460,6 +460,9 @@ intorel_receive(TupleTableSlot *slot, DestReceiver *self)
 				myState->hi_options,
 				myState->bistate);
 
+	if (!(myState->hi_options & HEAP_INSERT_SKIP_WAL))
+		CHECK_FOR_WAL_BUDGET();
+
 	/* We know this is a newly created relation, so there are no indexes */
 }
 
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 466d757..6917c69 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -3821,6 +3821,8 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap, LOCKMODE lockmode)
 				dropped_attrs = lappend_int(dropped_attrs, i);
 		}
 
+		init_wal_rate_limit();
+
 		/*
 		 * Scan through the rows, generating a new row if needed and then
 		 * checking all the constraints.
@@ -3929,6 +3931,9 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap, LOCKMODE lockmode)
 			ResetExprContext(econtext);
 
 			CHECK_FOR_INTERRUPTS();
+
+			if (!(hi_options & HEAP_INSERT_SKIP_WAL))
+				CHECK_FOR_WAL_BUDGET();
 		}
 
 		MemoryContextSwitchTo(oldCxt);
@@ -9120,11 +9125,16 @@ copy_relation_data(SMgrRelation src, SMgrRelation dst,
 
 	nblocks = smgrnblocks(src, forkNum);
 
+	init_wal_rate_limit();
+
 	for (blkno = 0; blkno < nblocks; blkno++)
 	{
 		/* If we got a cancel signal during the copy of the data, quit */
 		CHECK_FOR_INTERRUPTS();
 
+		if (use_wal)
+			CHECK_FOR_WAL_BUDGET();
+
 		smgrread(src, forkNum, blkno, buf);
 
 		if (!PageIsVerified(page, blkno))
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 3455a0b..90b4552 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -60,7 +60,7 @@ int			vacuum_freeze_table_age;
 /* A few variables that don't seem worth passing around as parameters */
 static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
-
+static uint64 prevWALchunks = 0;
 
 /* non-export function prototypes */
 static List *get_rel_oids(Oid relid, const RangeVar *vacrel);
@@ -1293,3 +1293,51 @@ vacuum_delay_point(void)
 		CHECK_FOR_INTERRUPTS();
 	}
 }
+
+/*
+ * WAL Rate Limiting will wait after each "chunk" of WAL.
+ */
+#define WALRateLimitBytesPerChunk	65536
+#define GetCurrentTransactionChunks() \
+	(GetCurrentTransactionWALVolume() / WALRateLimitBytesPerChunk)
+/*
+ * init_wal_rate_limit - initialize for WAL rate limiting
+ */
+void
+init_wal_rate_limit(void)
+{
+	prevWALchunks = GetCurrentTransactionChunks();
+}
+
+/*
+ * CHECK_FOR_WAL_BUDGET --- check for interrupts and cost-based delay.
+ *
+ * This should be called in each major loop of maintenance command processing,
+ * typically once per page processed when generating large amounts of WAL.
+ *
+ * Listed in miscadmin.h for general use.
+ */
+void
+CHECK_FOR_WAL_BUDGET(void)
+{
+	uint64 currWALchunks;
+
+	if (WALRateLimit == 0)
+		return;
+
+	/*
+	 * Find out how many chunks of WAL have been written
+	 */
+	currWALchunks = GetCurrentTransactionChunks();
+
+	/* Nap if appropriate */
+	if (currWALchunks > prevWALchunks)
+	{
+		prevWALchunks = currWALchunks;
+
+		pg_usleep(WALRateLimit * 1000L);
+
+		/* Might have gotten an interrupt while sleeping */
+		CHECK_FOR_INTERRUPTS();
+	}
+}
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 6f0f47e..697841e 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -1026,6 +1026,8 @@ ExecModifyTable(ModifyTableState *node)
 				break;
 		}
 
+		CHECK_FOR_WAL_BUDGET();
+
 		/*
 		 * If we got a RETURNING result, return it to caller.  We'll continue
 		 * the work on next call.
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 63c951e..1740e6e 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -115,6 +115,7 @@ int			VacuumCostPageMiss = 10;
 int			VacuumCostPageDirty = 20;
 int			VacuumCostLimit = 200;
 int			VacuumCostDelay = 0;
+int			WALRateLimit = 0;
 
 int			VacuumPageHit = 0;
 int			VacuumPageMiss = 0;
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 1217098..03c9d60 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -544,6 +544,8 @@ const char *const config_group_names[] =
 	gettext_noop("Resource Usage / Disk"),
 	/* RESOURCES_KERNEL */
 	gettext_noop("Resource Usage / Kernel Resources"),
+	/* RESOURCES_WAL_RATE_LIMIT */
+	gettext_noop("Resource Usage / WAL Rate Limiting"),
 	/* RESOURCES_VACUUM_DELAY */
 	gettext_noop("Resource Usage / Cost-Based Vacuum Delay"),
 	/* RESOURCES_BGWRITER */
@@ -1842,6 +1844,16 @@ static struct config_int ConfigureNamesInt[] =
 	},
 
 	{
+		{"wal_rate_limit", PGC_USERSET, RESOURCES_WAL_RATE_LIMIT,
+			gettext_noop("WAL rate limit delay in milliseconds."),
+			NULL,
+			GUC_UNIT_MS
+		},
+		&WALRateLimit,
+		0, 0, 100,
+		NULL, NULL, NULL
+	},
+	{
 		{"autovacuum_vacuum_cost_delay", PGC_SIGHUP, AUTOVACUUM,
 			gettext_noop("Vacuum cost delay in milliseconds, for autovacuum."),
 			NULL,
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 27791cc..0ee285e 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -145,6 +145,10 @@
 					# (change requires restart)
 #shared_preload_libraries = ''		# (change requires restart)
 
+# - WAL Rate Limiting -
+
+#wal_rate_limit = 0			# 0-100 milliseconds
+
 # - Cost-Based Vacuum Delay -
 
 #vacuum_cost_delay = 0			# 0-100 milliseconds
diff --git a/src/include/access/xact.h b/src/include/access/xact.h
index 634f5b2..31ca1ba 100644
--- a/src/include/access/xact.h
+++ b/src/include/access/xact.h
@@ -215,7 +215,8 @@ extern TransactionId GetCurrentTransactionId(void);
 extern TransactionId GetCurrentTransactionIdIfAny(void);
 extern TransactionId GetStableLatestTransactionId(void);
 extern SubTransactionId GetCurrentSubTransactionId(void);
-extern void MarkCurrentTransactionIdLoggedIfAny(void);
+extern void ReportTransactionInsertedWAL(uint32 insertedWALVolume);
+extern uint64 GetCurrentTransactionWALVolume(void);
 extern bool SubTransactionIsActive(SubTransactionId subxid);
 extern CommandId GetCurrentCommandId(bool used);
 extern TimestampTz GetCurrentTransactionStartTimestamp(void);
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index b145a19..27b5d08 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -226,6 +226,10 @@ extern bool allowSystemTableMods;
 extern PGDLLIMPORT int work_mem;
 extern PGDLLIMPORT int maintenance_work_mem;
 
+extern void init_wal_rate_limit(void);
+extern void CHECK_FOR_WAL_BUDGET(void);
+extern int  WALRateLimit;
+
 extern int	VacuumCostPageHit;
 extern int	VacuumCostPageMiss;
 extern int	VacuumCostPageDirty;
diff --git a/src/include/utils/guc_tables.h b/src/include/utils/guc_tables.h
index 47ff880..ded38c9 100644
--- a/src/include/utils/guc_tables.h
+++ b/src/include/utils/guc_tables.h
@@ -61,6 +61,7 @@ enum config_group
 	RESOURCES_MEM,
 	RESOURCES_DISK,
 	RESOURCES_KERNEL,
+	RESOURCES_WAL_RATE_LIMIT,
 	RESOURCES_VACUUM_DELAY,
 	RESOURCES_BGWRITER,
 	RESOURCES_ASYNCHRONOUS,
#28Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Tom Lane (#26)
Re: WAL Rate Limiting

It occurs to me that this is very similar to the method I proposed in
June to enforce a hard limit on WAL usage, to avoid PANIC caused by
running out of disk space when writing WAL:

/messages/by-id/51B095FE.6050106@vmware.com

Enforcing a global limit needs more book-keeping than rate limiting
individual sessions. But I'm hoping that the CHECK_WAL_BUDGET() calls
could be used for both purposes.
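A minimal sketch of what a combined check might look like (all names and
thresholds here are illustrative stand-ins, not the patch's actual symbols,
except where noted):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative sketch only.  In a real backend the volume would come from
 * GetCurrentTransactionWALVolume() and the sleep would be pg_usleep();
 * none of these variables are actual PostgreSQL symbols.
 */
static uint64_t wal_since_sleep = 0;           /* WAL bytes since last sleep */
static uint64_t wal_total = 0;                 /* WAL bytes this transaction */
static const uint64_t rate_chunk = 64 * 1024;  /* sleep every 64kB, as in the patch */
static int wal_rate_limit_ms = 10;             /* 0 disables rate limiting */
static uint64_t wal_hard_limit = 0;            /* 0 disables the hard limit */

static int sleeps_taken = 0;                   /* instrumentation for the sketch */
static bool hard_limit_hit = false;

/* Called from WAL-insert bookkeeping. */
static void
report_wal_insert(uint64_t nbytes)
{
	wal_since_sleep += nbytes;
	wal_total += nbytes;
}

/*
 * Called only from safe points (no buffer locks held, not inside a
 * critical section).  Cheap when both limits are disabled.
 */
static void
check_for_wal_budget(void)
{
	/* Hard budget: stop generating WAL (a real backend would ERROR). */
	if (wal_hard_limit > 0 && wal_total > wal_hard_limit)
	{
		hard_limit_hit = true;
		return;
	}

	/* Soft budget: sleep once per 64kB chunk of WAL generated. */
	if (wal_rate_limit_ms > 0 && wal_since_sleep >= rate_chunk)
	{
		sleeps_taken++;    /* real code: pg_usleep(wal_rate_limit_ms * 1000L) */
		wal_since_sleep = 0;
	}
}
```

The point is that the rate limit and the hard limit share the same call
sites; only the book-keeping behind them differs.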

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#29Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Simon Riggs (#27)
Re: WAL Rate Limiting

On 01/17/2014 05:20 PM, Simon Riggs wrote:

+ if (RelationNeedsWAL(indexrel))
+ CHECK_FOR_WAL_BUDGET();

I don't think we need the RelationNeedsWAL tests. If the relation is not
WAL-logged, you won't write much WAL, so you should easily stay under
the limit. And CHECK_FOR_WAL_BUDGET() better be cheap when you're below
the limit.

- Heikki


#30Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#29)
Re: WAL Rate Limiting

On 17 January 2014 16:34, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:

On 01/17/2014 05:20 PM, Simon Riggs wrote:

+ if (RelationNeedsWAL(indexrel))
+ CHECK_FOR_WAL_BUDGET();

I don't think we need the RelationNeedsWAL tests. If the relation is not
WAL-logged, you won't write much WAL, so you should easily stay under the
limit. And CHECK_FOR_WAL_BUDGET() better be cheap when you're below the
limit.

OK, I'll remove them, thanks.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#31Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#26)
Re: WAL Rate Limiting

On 17 January 2014 16:10, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andres Freund <andres@2ndquadrant.com> writes:

On 2014-01-17 09:04:54 -0500, Robert Haas wrote:

That having been said, I bet it could be done at the tail of
XLogInsert().

I don't think there are many locations where this would be ok. Sleeping
while holding exclusive buffer locks? Quite possibly inside a criticial
section?

More or less by definition, you're always doing both when you call
XLogInsert.

Surely not.

I agree. It's got to be somewhere further up the call stack.

Definitely.

I'm inclined to think that what we ought to do is reconceptualize
vacuum_delay_point() as something a bit more generic, and sprinkle
calls to it in a few more places than now.

Agreed; that was the original plan, but implementation delays
prevented the whole vision/discussion/implementation. Requirements
from various areas include WAL rate limiting for replication, I/O rate
limiting, hard CPU and I/O limits for security and mixed workload
coexistence.

I'd still like to get something on this in 9.4 that alleviates the
replication issues, leaving wider changes for later releases.

The vacuum_* parameters don't allow any control over WAL production,
which is often the limiting factor. I could, for example, introduce a
new parameter for vacuum_cost_delay that provides a weighting for each
new BLCKSZ chunk of WAL, then rename all parameters to a more general
form. Or I could forget that and just press ahead with the patch as
is, providing a cleaner interface in next release.

It's also interesting to wonder about the relationship to
CHECK_FOR_INTERRUPTS --- although I think that currently, we assume
that that's *cheap* (1 test and branch) as long as nothing is pending.
I don't want to see a bunch of arithmetic added to it.

Good point.

I'll call these new calls CHECK_FOR_RESOURCES() to allow us to
implement various kinds of resource checking in future.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#32Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#28)
Re: WAL Rate Limiting

On 17 January 2014 16:30, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:

It occurs to me that this is very similar to the method I proposed in June
to enforce a hard limit on WAL usage, to avoid PANIC caused by running out
of disk space when writing WAL:

/messages/by-id/51B095FE.6050106@vmware.com

Enforcing a global limit needs more book-keeping than rate limiting
individual sessions. But I'm hoping that the CHECK_WAL_BUDGET() calls could
be used for both purposes.

We can't set a useful hard limit on WAL by default without that being
a bigger foot gun than what we have now.

If we did have a limit, I would look to make it less exact and take it
out of the main path, for example by having the checkpointer enforce it,
or perhaps only making the check when we switch xlog files.

So I don't want to include that requirement into this current work.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#33Greg Stark
stark@mit.edu
In reply to: Simon Riggs (#31)
Re: WAL Rate Limiting

On Mon, Jan 20, 2014 at 5:37 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

Agreed; that was the original plan, but implementation delays
prevented the whole vision/discussion/implementation. Requirements
from various areas include WAL rate limiting for replication, I/O rate
limiting, hard CPU and I/O limits for security and mixed workload
coexistence.

I'd still like to get something on this in 9.4 that alleviates the
replication issues, leaving wider changes for later releases.

My first reaction was that we should just have a generic I/O resource
throttling. I was only convinced this was a reasonable idea by the
replication use case. It would help me to understand the specific
situations where replication breaks down due to WAL bandwidth
starvation. Heroku has had some problems with slaves falling behind,
though the immediate problem that causes is the slave filling up its
disk, which we could solve more directly by switching to archive mode
rather than slowing down the master.

But I would suggest you focus on a specific use case that's
problematic so we can judge better if the implementation is really
fixing it.

The vacuum_* parameters don't allow any control over WAL production,
which is often the limiting factor. I could, for example, introduce a
new parameter for vacuum_cost_delay that provides a weighting for each
new BLCKSZ chunk of WAL, then rename all parameters to a more general
form. Or I could forget that and just press ahead with the patch as
is, providing a cleaner interface in next release.

It's also interesting to wonder about the relationship to
CHECK_FOR_INTERRUPTS --- although I think that currently, we assume
that that's *cheap* (1 test and branch) as long as nothing is pending.
I don't want to see a bunch of arithmetic added to it.

Good point.

I think it should be possible to actually merge it into
CHECK_FOR_INTERRUPTS. Have a single global flag
io_done_since_check_for_interrupts which is set to 0 after each
CHECK_FOR_INTERRUPTS and set to 1 whenever any wal is written. Then
CHECK_FOR_INTERRUPTS turns into two tests and branches instead of one
in the normal case.

In fact you could do all the arithmetic when you do the wal write.
Only set the flag if the bandwidth consumed is above the budget. Then
the flag should only ever be set when you're about to sleep.
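A rough sketch of that two-flag scheme, with all names invented for
illustration (this is not the real CHECK_FOR_INTERRUPTS machinery):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative globals standing in for backend state. */
static volatile bool InterruptPending = false;
static volatile bool WalBudgetExceeded = false;
static uint64_t wal_since_check = 0;
static const uint64_t wal_budget = 64 * 1024;

static int interrupts_processed = 0;   /* instrumentation for the sketch */
static int budget_sleeps = 0;

/*
 * The arithmetic happens at WAL-write time, so the flag is only ever
 * set when the backend is already over budget and about to sleep.
 */
static void
record_wal_write(uint64_t nbytes)
{
	wal_since_check += nbytes;
	if (wal_since_check > wal_budget)
		WalBudgetExceeded = true;
}

static void
process_interrupts(void)
{
	interrupts_processed++;
}

static void
wal_budget_sleep(void)
{
	budget_sleeps++;           /* real code would sleep here */
	wal_since_check = 0;
	WalBudgetExceeded = false;
}

/* The merged check: two tests and branches on the fast path. */
#define CHECK_FOR_INTERRUPTS() \
	do { \
		if (InterruptPending) \
			process_interrupts(); \
		if (WalBudgetExceeded) \
			wal_budget_sleep(); \
	} while (0)
```

So the common case stays nearly free, and the cost of the budget
arithmetic is paid only where WAL is actually written.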

I would dearly love to see a generic I/O bandwidth limits so it would
be nice to see a nicely general pattern here that could be extended
even if we only target wal this release.

I'm going to read the existing patch now, do you think it's ready to
go or did you want to do more work based on the feedback?


#34Robert Haas
robertmhaas@gmail.com
In reply to: Greg Stark (#33)
Re: WAL Rate Limiting

On Wed, Feb 19, 2014 at 8:28 AM, Greg Stark <stark@mit.edu> wrote:

On Mon, Jan 20, 2014 at 5:37 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

Agreed; that was the original plan, but implementation delays
prevented the whole vision/discussion/implementation. Requirements
from various areas include WAL rate limiting for replication, I/O rate
limiting, hard CPU and I/O limits for security and mixed workload
coexistence.

I'd still like to get something on this in 9.4 that alleviates the
replication issues, leaving wider changes for later releases.

My first reaction was that we should just have a generic I/O resource
throttling. I was only convinced this was a reasonable idea by the
replication use case. It would help me to understand the specific
situations where replication breaks down due to WAL bandwidth
starvation. Heroku has had some problems with slaves falling behind,
though the immediate problem that causes is the slave filling up its
disk, which we could solve more directly by switching to archive mode
rather than slowing down the master.

But I would suggest you focus on a specific use case that's
problematic so we can judge better if the implementation is really
fixing it.

The vacuum_* parameters don't allow any control over WAL production,
which is often the limiting factor. I could, for example, introduce a
new parameter for vacuum_cost_delay that provides a weighting for each
new BLCKSZ chunk of WAL, then rename all parameters to a more general
form. Or I could forget that and just press ahead with the patch as
is, providing a cleaner interface in next release.

It's also interesting to wonder about the relationship to
CHECK_FOR_INTERRUPTS --- although I think that currently, we assume
that that's *cheap* (1 test and branch) as long as nothing is pending.
I don't want to see a bunch of arithmetic added to it.

Good point.

I think it should be possible to actually merge it into
CHECK_FOR_INTERRUPTS. Have a single global flag
io_done_since_check_for_interrupts which is set to 0 after each
CHECK_FOR_INTERRUPTS and set to 1 whenever any wal is written. Then
CHECK_FOR_INTERRUPTS turns into two tests and branches instead of one
in the normal case.

In fact you could do all the arithmetic when you do the wal write.
Only set the flag if the bandwidth consumed is above the budget. Then
the flag should only ever be set when you're about to sleep.

I would dearly love to see a generic I/O bandwidth limits so it would
be nice to see a nicely general pattern here that could be extended
even if we only target wal this release.

I'm going to read the existing patch now, do you think it's ready to
go or did you want to do more work based on the feedback?

Well, *I* don't think this is ready to go. A WAL rate limit that only
limits WAL sometimes still doesn't impress me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#35Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#34)
Re: WAL Rate Limiting

On 19 February 2014 16:04, Robert Haas <robertmhaas@gmail.com> wrote:

Well, *I* don't think this is ready to go. A WAL rate limit that only
limits WAL sometimes still doesn't impress me.

Could you be specific in your criticism? "Sometimes" wouldn't impress
anybody, but we need to understand whether you are referring to a bug
or just making a personal assessment of the patch features. Thanks.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#36Greg Stark
stark@mit.edu
In reply to: Simon Riggs (#35)
Re: WAL Rate Limiting

On Thu, Feb 20, 2014 at 1:45 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

On 19 February 2014 16:04, Robert Haas <robertmhaas@gmail.com> wrote:

Well, *I* don't think this is ready to go. A WAL rate limit that only
limits WAL sometimes still doesn't impress me.

Could you be specific in your criticism? "Sometimes" wouldn't impress
anybody, but we need to understand whether you are referring to a bug
or just making a personal assessment of the patch features. Thanks.

I think it's clear what the question here is. The patch represents a
quick easy win for certain cases but isn't the nice broad solution we
would prefer to have. We face this decision a lot and it's a tough
one. On the one hand we want to set a high standard for ourselves and
not load up the Postgres source with
lots of half-measures, on the
other hand we don't want to let "perfect be the enemy of the good" and
have development stall while we refuse easy wins.

I don't think it would be a terrible thing if this patch went in. And
I also don't think it would be a huge loss if it didn't since I think
we will need IO resource management down the road anyways.

I was looking for a few non-controversial patches to work on, at least
at the moment, to establish some history before I take on more
controversial patches, so I don't think committing this is really a
great idea for me right now. If you wanted to make this a more general
system I have some ideas and I think we could do it fairly easily. We
could make CHECK_FOR_INTERRUPTS have a single flag but if it's set
check the "interrupt type" and know that some types can be handled by
simply sleeping and not throwing an error. There are already some
cases that are handled specially so it wouldn't really even change the
code much. I would be happy to help get something like that committed.

--
greg


#37Simon Riggs
simon@2ndQuadrant.com
In reply to: Greg Stark (#33)
Re: WAL Rate Limiting

On 19 February 2014 13:28, Greg Stark <stark@mit.edu> wrote:

On Mon, Jan 20, 2014 at 5:37 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

Agreed; that was the original plan, but implementation delays
prevented the whole vision/discussion/implementation. Requirements
from various areas include WAL rate limiting for replication, I/O rate
limiting, hard CPU and I/O limits for security and mixed workload
coexistence.

I'd still like to get something on this in 9.4 that alleviates the
replication issues, leaving wider changes for later releases.

My first reaction was that we should just have a generic I/O resource
throttling. I was only convinced this was a reasonable idea by the
replication use case. It would help me to understand the specific
situations where replication breaks down due to WAL bandwidth
starvation.

The various bulk operations mentioned in the OP can dump GBs of data
into WAL. That causes replication to immediately fall behind, which
slows everything down if you are using synchronous replication. This
has been discussed before on list over last few years (and Heroku
staff have been involved in that, fyi).

General I/O limiting is a nice-to-have but what is suggested here is
an essential aspect of running bulk commands when using replication,
especially synchronous replication.

The use cases for I/O limiting and replication limiting are different
in both their effects and their criticality.

The design choice of making the limit only apply to bulk ops is
because that is where the main problem lies. Rate limiting will cause
a loss of performance in the main line for non-bulk operations, so
adding tests there will not be valuable.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#38Greg Stark
stark@mit.edu
In reply to: Simon Riggs (#37)
Re: WAL Rate Limiting

On Thu, Feb 20, 2014 at 2:43 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

The design choice of making the limit only apply to bulk ops is
because that is where the main problem lies. Rate limiting will cause
a loss of performance in the main line for non-bulk operations, so
adding tests there will not be valuable.

This isn't a bad point. If you do a massive UPDATE or INSERT you
probably do want the ability to limit the damage to your replication
lag but if you're an OLTP worker and you just happen to handle a
disproportionate amount of the incoming requests you wouldn't. But
then presumably you would just set that GUC so that your OLTP
workers run without any limits and your batch jobs set a
limit.

As I said it wouldn't be terrible if the patch went in and could only
handle cluster/vacuum/create index. They're by far the most common
causes of problems and most people who do massive UPDATES or INSERTS
quickly discover they want to do it in batches of a few thousand rows
anyways so they can limit the speed themselves.

There's a decent argument to made for incremental changes too. If we
add a feature like this for these operations then we'll find out how
well it works in practice and whether users need finer control or what
other sources become the next problem before we've invested a whole
lot in a general solution.

--
greg


#39Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#37)
Re: WAL Rate Limiting

On Thu, Feb 20, 2014 at 9:43 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

The design choice of making the limit only apply to bulk ops is
because that is where the main problem lies. Rate limiting will cause
a loss of performance in the main line for non-bulk operations, so
adding tests there will not be valuable.

That's pure sophistry. Of course rate limiting will cause "a loss of
performance in the main line for non-bulk operations". Rate limiting
will also cause a loss of performance for bulk operations. The whole
point of rate limiting is to cause a loss of performance. That's not
a bug; that's what the feature is explicitly designed to do.

So-called "non-bulk operations", which seems to be defined as more or
less "anything where actually making this feature work seemed hard",
can include huge inserts, updates, or deletes that greatly depress
performance for other parts of the system, either individually or in
aggregate. And it's just as legitimate to want to tamp down that
activity as it is to want to slow down CLUSTER. Remember, this is a
GUC, so it need not be set identically for every user session. It is
not hard at all to imagine a situation where the user wishes to limit
the rate at which some sessions can write WAL so as to avoid
interfering with other, higher-priority tasks happening in other
sessions. That is hardly a niche use case; I think I've seen it
reported, if anything, even more frequently than problems with what
you're calling bulk operations.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#40Robert Haas
robertmhaas@gmail.com
In reply to: Greg Stark (#38)
Re: WAL Rate Limiting

On Thu, Feb 20, 2014 at 10:10 AM, Greg Stark <stark@mit.edu> wrote:

On Thu, Feb 20, 2014 at 2:43 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

The design choice of making the limit only apply to bulk ops is
because that is where the main problem lies. Rate limiting will cause
a loss of performance in the main line for non-bulk operations, so
adding tests there will not be valuable.

This isn't a bad point. If you do a massive UPDATE or INSERT you
probably do want the ability to limit the damage to your replication
lag but if you're an OLTP worker and you just happen to handle a
disproportionate amount of the incoming requests you wouldn't. But
then presumably you would just set that GUC that way where your OLTP
workers would run without any limits and your batch jobs would set a
limit.

As I said it wouldn't be terrible if the patch went in and could only
handle cluster/vacuum/create index. They're by far the most common
causes of problems and most people who do massive UPDATES or INSERTS
quickly discover they want to do it in batches of a few thousand rows
anyways so they can limit the speed themselves.

There's a decent argument to made for incremental changes too. If we
add a feature like this for these operations then we'll find out how
well it works in practice and whether users need finer control or what
other sources become the next problem before we've invested a whole
lot in a general solution.

What will more likely happen is that we won't be able to change the
definition in future releases because it breaks backward
compatibility. If we implement this now for only a subset of
commands, and then later someone proposes a patch to expand it to more
commands, then there's going to be someone who shows up and complains
about that definitional change. And for good reason.

I'm heartily in favor of incremental development of features, but each
incremental addition needs to stand on its own two feet, not be
something that we can already foresee needing to break or throw out
later. There are way, way too many PostgreSQL users now for us to be
able to get away with that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#41Hannu Krosing
hannu@2ndQuadrant.com
In reply to: Robert Haas (#39)
Re: WAL Rate Limiting

On 02/20/2014 04:16 PM, Robert Haas wrote:

On Thu, Feb 20, 2014 at 9:43 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

The design choice of making the limit only apply to bulk ops is
because that is where the main problem lies. Rate limiting will cause
a loss of performance in the main line for non-bulk operations, so
adding tests there will not be valuable.

That's pure sophistry. Of course rate limiting will cause "a loss of
performance in the main line for non-bulk operations".

I think he meant that *adding code for* rate limiting there will cause
loss of performance even if not used.

Rate limiting
will also cause a loss of performance for bulk operations. The whole
point of rate limiting is to cause a loss of performance.

Not when it is not enabled, it should not.

That's not
a bug; that's what the feature is explicitly designed to do.

So-called "non-bulk operations", which seems to be defined as more or
less "anything where actually making this feature work seemed hard",

NO, it is anything that a decent DBA could not script to run in
chunks, which he would do anyway for other reasons, such as not
locking out OLTP.

can include huge inserts, updates, or deletes that greatly depress
performance for other parts of the system, either individually or in
aggregate. And it's just as legitimate to want to tamp down that
activity as it is to want to slow down CLUSTER. Remember, this is a
GUC, so it need not be set identically for every user session. It is
not hard at all to imagine a situation where the user wishes to limit
the rate at which some sessions can write WAL so as to avoid
interfering with other, higher-priority tasks happening in other
sessions.

It is hard to imagine, but it is still much more infrequent than needing
to do concurrent ops like VACUUM and CREATE INDEX CONCURRENTLY.

That is hardly a niche use case; I think I've seen it
reported, if anything, even more frequently than problems with what
you're calling bulk operations.

Could be, but they are two separate cases. One helps the DBA get stuff
done without stepping on users' toes, the other is about protecting
one user from the other.

It is arguable that the first is a sub-case of the other, but I don't think
these should be controlled by the same GUC, though you may want to
have some checking for sane values between the two.

Maybe call this one `maintenance_wal_rate_limit_delay`, the same way as we
have `maintenance_work_mem`.
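For illustration only, that split might look like this in postgresql.conf
(both names are hypothetical; neither is an actual GUC):

```
# hypothetical settings, not actual PostgreSQL GUCs
#wal_rate_limit_delay = 0              # per-session limit for ordinary DML
#maintenance_wal_rate_limit_delay = 0  # CLUSTER, VACUUM FULL, CREATE INDEX, ...
```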

Cheers

--
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ


#42Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Robert Haas (#39)
Re: WAL Rate Limiting

Robert Haas wrote:

On Thu, Feb 20, 2014 at 9:43 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

The design choice of making the limit only apply to bulk ops is
because that is where the main problem lies. Rate limiting will cause
a loss of performance in the main line for non-bulk operations, so
adding tests there will not be valuable.

That's pure sophistry. Of course rate limiting will cause "a loss of
performance in the main line for non-bulk operations". Rate limiting
will also cause a loss of performance for bulk operations. The whole
point of rate limiting is to cause a loss of performance.

I think he's saying that it will degrade performance even in the case
where the feature is disabled, which would be undesirable. I don't
necessarily agree with that assertion, but at least it's something to
consider rather than to laugh at.

And it's just as legitimate to want to tamp down that
activity as it is to want to slow down CLUSTER. Remember, this is a
GUC, so it need not be set identically for every user session. It is
not hard at all to imagine a situation where the user wishes to limit
the rate at which some sessions can write WAL so as to avoid
interfering with other, higher-priority tasks happening in other
sessions.

I don't disagree with this either, but I don't think it necessarily has
to be the same feature. Most likely we don't want to have users setting
the GUC each time they want to run vacuum; rather they will want to be
able to set one line in postgresql.conf and have it affect all vacuuming
activity. However, if that same config tweak affects all their UPDATE
statements regardless of them being single-row updates or bulk updates,
I'm pretty sure it's not going to be welcome.

I think "bulk" (maintenance) operations should legitimately be configured
separately from regular UPDATE etc operations. For the latter I think
it's pretty reasonable to assume that users are going to want to tweak
the GUC for each individual callsite in the application, rather than
wholesale in postgresql.conf.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#43Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#42)
Re: WAL Rate Limiting

On Thu, Feb 20, 2014 at 12:07 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

I think "bulk" (maintenance) operations should legitimately be configured
separately from regular UPDATE etc operations. For the latter I think
it's pretty reasonable to assume that users are going to want to tweak
the GUC for each individual callsite in the application, rather than
wholesale in postgresql.conf.

There's an argument to be made for that design, but the discussion
upthread doesn't support the contention that the patch makes a clean
and well-defined division between those things. The initial patch
left VACUUM out on the dubious theory that since we have an existing
system to limit its throughput by pages dirtied we don't need a way to
limit the rate at which it generates WAL, and left as an open question
whether this ought to apply to COPY and CREATE TABLE AS SELECT (but
apparently not UPDATE or DELETE, or for that matter just plain SELECT,
when it does a lot of pruning). A subsequent version of the patch
changed the list of commands affected, but it's not at all clear to me
that we have something as tidy as "only commands where X and never
commands where Y".

More broadly, three committers (myself, Tom, Heikki) expressed the
desire for this to be made into a more generic mechanism, and two of
those people (myself and Tom) said that this was too late for 9.4 and
should be postponed to 9.5. Then a month went by and Greg Stark
showed up and made noises about pushing this forward as-is. So I
don't want it to be forgotten that those objections were made and have
not been withdrawn. I'm not dead set against changing my position on
this patch, but that should happen because of work to improve the
patch - which was last posted on January 17th and did not at that time
even include the requested and agreed fix to measure the limit in MB/s
rather than some odd unit - not because of the simple passage of time.
If anything, the passage of 5 weeks without meaningful progress ought
to strengthen rather than weaken the argument that this is far too
late for this release.

Of course, if we're talking about 9.5, then disregard the previous
paragraph. But in that case perhaps we could postpone this until we
don't have 40+ patches needing review and a dozen marked ready for
committer.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#44Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Robert Haas (#43)
Re: WAL Rate Limiting

It's clear now that if what Greg wants is an uncontroversial patch to
commit, this is not it.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#45Jim Nasby
jim@nasby.net
In reply to: Craig Ringer (#23)
Re: WAL Rate Limiting

On 1/16/14, 9:19 PM, Craig Ringer wrote:

On 01/16/2014 11:52 PM, Tom Lane wrote:

Heikki Linnakangas <hlinnakangas@vmware.com> writes:

On 01/16/2014 05:39 PM, Andres Freund wrote:

Do you see a reasonable way to implement this generically for all
commands? I don't.

In suitable safe places, check if you've written too much WAL, and sleep
if so. Call it CHECK_FOR_WAL_BUDGET(), along the lines of
CHECK_FOR_INTERRUPTS(), but called less frequently. Or maybe
vacuum_delay_point() is a better analogy. Hopefully you don't need to
sprinkle them in too many places to be useful.

We probably don't really need to implement it for "all" commands; I think
everyone can agree that, say, ALTER TABLE RENAME COLUMN isn't going to
emit enough WAL to really matter. My point was that we should try to hit
everything that potentially *could* generate lots of WAL, rather than
arbitrarily deciding that some are utility commands and some are not.

Then you land up needing a configuration mechanism to control *which*
commands get affected, too, to handle the original use case of "I want
to rate limit vacuum and index rebuilds, while everything else runs
full tilt".

Or do you propose to just do this per-session? If so, what about autovacuum?

Tom was proposing per-session.

For autovac, don't we already have some method of changing the GUCs it uses? If not, we should...
--
Jim C. Nasby, Data Architect jim@nasby.net
512.569.9461 (cell) http://jim.nasby.net
