autovacuum_work_mem

Started by Peter Geoghegan · over 12 years ago · 29 messages · pgsql-hackers

There has recently been considerable discussion around auto-tuning.
Throughout the course of this discussion, I raised the idea of
creating a new GUC to separately control autovacuum's usage of
maintenance_work_mem [1], explaining the rationale in some detail [2].
At the time Magnus seemed to think this a good idea [3].

Attached simple patch adds a new GUC, autovacuum_work_mem, which when
set to a value other than -1 (the default) is preferred by autovacuum
over maintenance_work_mem. All other usage of VACUUM is unaffected.

I won't repeat the rationale for the patch here. Appropriate
documentation changes are included. I don't think it's a problem that
autovacuum_work_mem is kind of similar to vacuum_mem in name.
maintenance_work_mem was last spelt vacuum_mem about 10 years ago.
Enough time has passed that I think it very unlikely that someone
might conflate the two.

I have decided to have the default value of -1 carry, and not have it
automatically take the same value as maintenance_work_mem, because
doing so could create the impression that it needs to be set
explicitly, which of course it does not - this is not the same
situation as exists for wal_buffers. We just check if it's set when
going to VACUUM, if VACUUM is requested from a worker process.

It seemed neater to me to create a new flag, so that in principle any
vacuum() code path can request autovacuum_work_mem, rather than having
vacuumlazy.c code call IsAutoVacuumWorkerProcess() for the same
purpose. To date, that's only been done within vacuumlazy.c for things
like logging.

[1]: /messages/by-id/CAM3SWZTr1uu+7KR1ZOuGwcJriw9NVBQdjqyDMRWypEvDFi4a6Q@mail.gmail.com

[2]: /messages/by-id/CAM3SWZTYoD0YCLA-4nRb4S8-UGJyr514aEy+8O6VJQwvbzszGQ@mail.gmail.com

[3]: /messages/by-id/CABUevEzVrd36yeFzYBzad0=r09eqRqNoMwX8r=URikG9DrfUkw@mail.gmail.com

--
Peter Geoghegan

Attachments:

autovacuum_work_mem.v1.2013_10_19.patch (+93 -47)
#2 Magnus Hagander
magnus@hagander.net
In reply to: Peter Geoghegan (#1)
Re: autovacuum_work_mem

On Sun, Oct 20, 2013 at 2:22 AM, Peter Geoghegan <pg@heroku.com> wrote:

There has recently been considerable discussion around auto-tuning.
Throughout the course of this discussion, I raised the idea of
creating a new GUC to separately control autovacuum's usage of
maintenance_work_mem [1], explaining the rationale in some detail [2].
At the time Magnus seemed to think this a good idea [3].

Attached simple patch adds a new GUC, autovacuum_work_mem, which when
set to a value other than -1 (the default) is preferred by autovacuum
over maintenance_work_mem. All other usage of VACUUM is unaffected.

I won't repeat the rationale for the patch here. Appropriate
documentation changes are included. I don't think it's a problem that
autovacuum_work_mem is kind of similar to vacuum_mem in name.
maintenance_work_mem was last spelt vacuum_mem about 10 years ago.
Enough time has passed that I think it very unlikely that someone
might conflate the two.

My rationale for liking the idea is also in those threads :)

And I definitely don't think there is any problem whatsoever in having
a name that's close to something we had 10 years ago. I think it would
be hard enough to make a case for not reusing the *same* name that
long after, but here it's actually different.

I have decided to have the default value of -1 carry, and not have it
automatically take the same value as maintenance_work_mem, because
doing so could create the impression that it needs to be set
explicitly, which of course it does not - this is not the same
situation as exists for wal_buffers. We just check if it's set when
going to VACUUM, if VACUUM is requested from a worker process.

It is the same that exists for autovacuum_vacuum_cost_limit and
autovacuum_vacuum_cost_delay, so that makes perfect sense.

It seemed neater to me to create a new flag, so that in principle any
vacuum() code path can request autovacuum_work_mem, rather than having
vacuumlazy.c code call IsAutoVacuumWorkerProcess() for the same
purpose. To date, that's only been done within vacuumlazy.c for things
like logging.

Hmm. I'm not entirely sure I agree that that makes it neater :)

We could also look at autovacuum_vacuum_cost_limit etc above, but
those just override what the non-autovac parameters do. But since the
parameter is called maintenance_work_mem in that case, I think that
would make it harder to read.

But I'd suggest just a:

    int vac_work_mem = (IsAutoVacuumWorkerProcess() && autovacuum_work_mem != -1)
        ? autovacuum_work_mem
        : maintenance_work_mem;

and not sending around a boolean flag through a bunch of places when
it really means just the same thing.

Another option would be to just have vac_work_mem actually being the
global, and have it set by the maintenance_work_mem GUC, and
then overwritten by the autovacuum_work_mem GUC at autovacuum
worker startup. Then you wouldn't put either of those two in the
codepath.
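
Both variants reduce to the same fallback rule; a minimal, self-contained
sketch of that rule (the function name is illustrative, not taken from the
patch):

```c
#include <stdbool.h>

/* Illustrative sketch, not the actual patch: both GUCs are measured in
 * kilobytes, and autovacuum_work_mem = -1 means "fall back to
 * maintenance_work_mem".  In the second variant above, this choice
 * would be made once, at autovacuum worker startup. */
static int
vac_work_mem_for(bool is_autovacuum_worker,
                 int autovacuum_work_mem,
                 int maintenance_work_mem)
{
    if (is_autovacuum_worker && autovacuum_work_mem != -1)
        return autovacuum_work_mem;
    return maintenance_work_mem;
}
```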

Also, isn't this quite confusing:
+ # Note:  autovacuum only prefers autovacuum_work_mem over maintenance_work_mem
+ #autovacuum_work_mem = -1 # min 1MB, or -1 to disable

Where does the "only" come from? And we don't really use the term
"prefers over" anywhere else there. And -1 doesn't actually disable
it. I suggest following the pattern of the other parameters that work
the same way, which would then just be:

#autovacuum_work_mem = -1 # amount of memory for each autovacuum
worker. -1 means use maintenance_work_mem

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3 Peter Eisentraut
peter_e@gmx.net
In reply to: Peter Geoghegan (#1)
Re: autovacuum_work_mem

On 10/19/13 8:22 PM, Peter Geoghegan wrote:

I don't think it's a problem that
autovacuum_work_mem is kind of similar to vacuum_mem in name.
maintenance_work_mem was last spelt vacuum_mem about 10 years ago.
Enough time has passed that I think it very unlikely that someone
might conflate the two.

What is more confusing, however, is that autovacuum_work_mem looks like
it's work_mem as used by autovacuum, where it's really
maintenance_work_mem as used by autovacuum. So maybe it should be
called autovacuum_maintenance_work_mem.

#4 Robert Haas
robertmhaas@gmail.com
In reply to: Peter Eisentraut (#3)
Re: autovacuum_work_mem

On Mon, Oct 21, 2013 at 9:36 AM, Peter Eisentraut <peter_e@gmx.net> wrote:

On 10/19/13 8:22 PM, Peter Geoghegan wrote:

I don't think it's a problem that
autovacuum_work_mem is kind of similar to vacuum_mem in name.
maintenance_work_mem was last spelt vacuum_mem about 10 years ago.
Enough time has passed that I think it very unlikely that someone
might conflate the two.

What is more confusing, however, is that autovacuum_work_mem looks like
it's work_mem as used by autovacuum, where it's really
maintenance_work_mem as used by autovacuum. So maybe it should be
called autovacuum_maintenance_work_mem.

I think I prefer autovacuum_work_mem. I don't think sticking the word
maintenance in there is really adding much in the way of clarity.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#5 Magnus Hagander
magnus@hagander.net
In reply to: Robert Haas (#4)
Re: autovacuum_work_mem

On Mon, Oct 21, 2013 at 3:42 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Oct 21, 2013 at 9:36 AM, Peter Eisentraut <peter_e@gmx.net> wrote:

On 10/19/13 8:22 PM, Peter Geoghegan wrote:

I don't think it's a problem that
autovacuum_work_mem is kind of similar to vacuum_mem in name.
maintenance_work_mem was last spelt vacuum_mem about 10 years ago.
Enough time has passed that I think it very unlikely that someone
might conflate the two.

What is more confusing, however, is that autovacuum_work_mem looks like
it's work_mem as used by autovacuum, where it's really
maintenance_work_mem as used by autovacuum. So maybe it should be
called autovacuum_maintenance_work_mem.

I think I prefer autovacuum_work_mem. I don't think sticking the word
maintenance in there is really adding much in the way of clarity.

+1. If changing at all, then maybe just "autovacuum_mem"? It's not
like there's a different parameter to control a different kind of
memory in autovac, is there?

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#6 Stephen Frost
sfrost@snowman.net
In reply to: Magnus Hagander (#5)
Re: autovacuum_work_mem

* Magnus Hagander (magnus@hagander.net) wrote:

+1. If changing at all, then maybe just "autovacuum_mem"? It's not
like there's a different parameter to control a different kind of
memory in autovac, is there?

+1 on that, for my 2c.

Thanks,

Stephen

#7 Peter Geoghegan
pg@heroku.com
In reply to: Magnus Hagander (#5)
Re: autovacuum_work_mem

On Mon, Oct 21, 2013 at 6:44 AM, Magnus Hagander <magnus@hagander.net> wrote:

+1. If changing at all, then maybe just "autovacuum_mem"?

I would like to stick with autovacuum_work_mem - it reflects the fact
that it's a setting shadowed by maintenance_work_mem, without being
too verbose.

--
Peter Geoghegan

#8 Simon Riggs
simon@2ndQuadrant.com
In reply to: Peter Geoghegan (#1)
Re: autovacuum_work_mem

On 19 October 2013 19:22, Peter Geoghegan <pg@heroku.com> wrote:

I won't repeat the rationale for the patch here.

I can't see the problem that this patch is trying to solve. I'm having
trouble understanding when I would use this.

VACUUM uses 6 bytes per dead tuple. And autovacuum regularly removes
dead tuples, limiting their numbers.

In what circumstances will the memory usage from multiple concurrent
VACUUMs become a problem? In those circumstances, reducing
autovacuum_work_mem will cause more passes through indexes, dirtying
more pages and elongating the problem workload.

I agree that multiple concurrent VACUUMs could be a problem but this
doesn't solve that, it just makes things worse.

Freezing doesn't require any memory at all, so wraparound vacuums
won't be controlled by this parameter.
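
For scale, Simon's 6-bytes-per-dead-tuple figure (one ItemPointerData
TID per dead tuple) can be turned into back-of-envelope numbers; a small
sketch:

```c
#include <stdint.h>

/* Back-of-envelope arithmetic for Simon's figure: lazy VACUUM remembers
 * one 6-byte TID per dead tuple, so a given memory budget bounds how
 * many dead tuples can be handled per index-cleanup pass. */
enum { TID_BYTES = 6 };

static uint64_t
max_dead_tuples(uint64_t budget_bytes)
{
    return budget_bytes / TID_BYTES;
}

/* e.g. a 64MB budget covers roughly 11 million dead tuples before a
 * second pass over the indexes is forced. */
```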

Can we re-state what problem actually is here and discuss how to solve
it. (The reference [2] didn't provide a detailed explanation of the
problem, only the reason why we want a separate parameter).

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#9 Peter Geoghegan
pg@heroku.com
In reply to: Simon Riggs (#8)
Re: autovacuum_work_mem

On Sun, Nov 24, 2013 at 9:06 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

VACUUM uses 6 bytes per dead tuple. And autovacuum regularly removes
dead tuples, limiting their numbers.

In what circumstances will the memory usage from multiple concurrent
VACUUMs become a problem? In those circumstances, reducing
autovacuum_work_mem will cause more passes through indexes, dirtying
more pages and elongating the problem workload.

Yes, of course, but if we presume that the memory for autovacuum
workers to do everything in one pass simply isn't there, it's still
better to do multiple passes. Similarly, it's also sometimes (roughly
speaking) locally suboptimal but globally optimal to do tapesorts in
preference to in-memory quicksorts, even though, as you know, very
frequently tapesort is very considerably slower than quicksort.

Look at the folk wisdom for sizing maintenance_work_mem that is
floating around (for example, take a look at Greg Smith's
recommendations in his book). Setting it within postgresql.conf is
assumed. You can end up with a conservative value because you're
worrying about the worst case. The average case suffers. Especially
since, as you say, autovacuum only uses 6 bytes per tuple, and so
probably isn't all that likely to run out of working memory, making
that worst case (that is, maintenance_work_mem over-allocation by
autovacuum workers) very unlikely.

So on larger Heroku Postgres plans, the generic maintenance_work_mem
is on the low side, and I sometimes have to manually increase it when
I'm doing something like creating a new index. I would like to not
have to do that, and I would like to not require users to be aware of
this issue, especially since external sorting is so much slower.

I am inclined to think that we need an altogether more sophisticated
model, but this is an incremental improvement.

Can we re-state what problem actually is here and discuss how to solve
it. (The reference [2] didn't provide a detailed explanation of the
problem, only the reason why we want a separate parameter).

It's principally a DBA feature, in that it allows the DBA to
separately control the memory used by very routine vacuuming, while
also having a less conservative default value for maintenance
operations that typically are under direct human control. Yes, this is
no better than just having maintenance_work_mem be equal to your
would-be autovacuum_work_mem setting in the first place, and having
everyone remember to set maintenance_work_mem dynamically. However,
sometimes people are ill-informed (more ill-informed than the person
that puts the setting in postgresql.conf), and other times they're
forgetful, and other times still they're using a tool like pg_restore
with no convenient way to dynamically set maintenance_work_mem. So, to
answer your question, yes: it is entirely possible that you or someone
like you may have no use for this.

It's often reasonable to assume that autovacuum workers are the only
processes that can allocate memory bound in size by
maintenance_work_mem that are not under the direct control of a human
performing maintenance. Autovacuum workers are in a sense just
servicing regular application queries (consider how Oracle handles
ROLLBACK garbage collection), and things that service regular
application queries are already bound separately by work_mem.

--
Peter Geoghegan

#10 Robert Haas
robertmhaas@gmail.com
In reply to: Magnus Hagander (#2)
Re: autovacuum_work_mem

On Sun, Oct 20, 2013 at 7:21 AM, Magnus Hagander <magnus@hagander.net> wrote:

It seemed neater to me to create a new flag, so that in principle any
vacuum() code path can request autovacuum_work_mem, rather than having
vacuumlazy.c code call IsAutoVacuumWorkerProcess() for the same
purpose. To date, that's only been done within vacuumlazy.c for things
like logging.

Hmm. I'm not entirely sure I agree that that makes it neater :)

We could also look at autovacuum_vacuum_cost_limit etc above, but
those just override what the non-autovac parameters do. But since the
parameter is called maintenance_work_mem in that case, I think that
would make it harder to read.

But I'd suggest just a:

    int vac_work_mem = (IsAutoVacuumWorkerProcess() && autovacuum_work_mem != -1)
        ? autovacuum_work_mem
        : maintenance_work_mem;

and not sending around a boolean flag through a bunch of places when
it really means just the same thing.

+1 for that change.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#11 Simon Riggs
simon@2ndQuadrant.com
In reply to: Peter Geoghegan (#9)
Re: autovacuum_work_mem

On 25 November 2013 21:51, Peter Geoghegan <pg@heroku.com> wrote:

On Sun, Nov 24, 2013 at 9:06 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

VACUUM uses 6 bytes per dead tuple. And autovacuum regularly removes
dead tuples, limiting their numbers.

In what circumstances will the memory usage from multiple concurrent
VACUUMs become a problem? In those circumstances, reducing
autovacuum_work_mem will cause more passes through indexes, dirtying
more pages and elongating the problem workload.

Yes, of course, but if we presume that the memory for autovacuum
workers to do everything in one pass simply isn't there, it's still
better to do multiple passes.

That isn't clear to me. It seems better to wait until we have the memory.

My feeling is that this parameter is a fairly blunt approach to the
problems of memory pressure on autovacuum and other maintenance tasks. I am
worried that it will not effectively solve the problem. I don't wish
to block the patch; I wish to get to an effective solution to the
problem.

A better approach to handling memory pressure would be to globally
coordinate workers so that we don't oversubscribe memory, allocating
memory from a global pool.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#12 Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#11)
Re: autovacuum_work_mem

On Wed, Dec 11, 2013 at 9:43 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

On 25 November 2013 21:51, Peter Geoghegan <pg@heroku.com> wrote:

On Sun, Nov 24, 2013 at 9:06 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

VACUUM uses 6 bytes per dead tuple. And autovacuum regularly removes
dead tuples, limiting their numbers.

In what circumstances will the memory usage from multiple concurrent
VACUUMs become a problem? In those circumstances, reducing
autovacuum_work_mem will cause more passes through indexes, dirtying
more pages and elongating the problem workload.

Yes, of course, but if we presume that the memory for autovacuum
workers to do everything in one pass simply isn't there, it's still
better to do multiple passes.

That isn't clear to me. It seems better to wait until we have the memory.

My feeling is that this parameter is a fairly blunt approach to the
problems of memory pressure on autovacuum and other maintenance tasks. I am
worried that it will not effectively solve the problem. I don't wish
to block the patch; I wish to get to an effective solution to the
problem.

A better approach to handling memory pressure would be to globally
coordinate workers so that we don't oversubscribe memory, allocating
memory from a global pool.

This is doubtless true, but that project is at least two if not three
orders of magnitude more complex than what's being proposed here, and
I don't think we should make the perfect the enemy of the good.

Right now, maintenance_work_mem controls the amount of memory that
we're willing to use for either a vacuum operation or an index build.
Those things don't have much to do with each other, so it's not hard
for me to imagine that someone might want to configure different
memory usage for one than the other. This patch would allow that, and
I think that's good.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#13 Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#12)
Re: autovacuum_work_mem

On 11 December 2013 14:50, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Dec 11, 2013 at 9:43 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

On 25 November 2013 21:51, Peter Geoghegan <pg@heroku.com> wrote:

On Sun, Nov 24, 2013 at 9:06 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

VACUUM uses 6 bytes per dead tuple. And autovacuum regularly removes
dead tuples, limiting their numbers.

In what circumstances will the memory usage from multiple concurrent
VACUUMs become a problem? In those circumstances, reducing
autovacuum_work_mem will cause more passes through indexes, dirtying
more pages and elongating the problem workload.

Yes, of course, but if we presume that the memory for autovacuum
workers to do everything in one pass simply isn't there, it's still
better to do multiple passes.

That isn't clear to me. It seems better to wait until we have the memory.

My feeling is that this parameter is a fairly blunt approach to the
problems of memory pressure on autovacuum and other maintenance tasks. I am
worried that it will not effectively solve the problem. I don't wish
to block the patch; I wish to get to an effective solution to the
problem.

A better approach to handling memory pressure would be to globally
coordinate workers so that we don't oversubscribe memory, allocating
memory from a global pool.

This is doubtless true, but that project is at least two if not three
orders of magnitude more complex than what's being proposed here, and
I don't think we should make the perfect the enemy of the good.

It looks fairly easy to estimate the memory needed for an autovacuum,
since we know the scale factor and the tuple estimate. We can then use
the memory estimate to alter the scheduling of work. And/or we can use
actual memory usage and block autovacuum workers if they need more
memory than is currently available because of other workers.
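
Simon's estimate can be sketched from the autovacuum trigger condition
(roughly threshold + scale_factor * reltuples dead tuples, at 6 bytes
each); the function below is illustrative, not proposed code:

```c
#include <stdint.h>

/* Illustrative sketch of the estimation idea: autovacuum fires once
 * dead tuples exceed vacuum_threshold + vacuum_scale_factor * reltuples,
 * so that is approximately the number of 6-byte TIDs a worker will need
 * to remember for the table. */
static uint64_t
estimated_vacuum_mem_bytes(uint64_t reltuples,
                           double vacuum_scale_factor,
                           uint64_t vacuum_threshold)
{
    uint64_t dead_tuples =
        vacuum_threshold + (uint64_t)(vacuum_scale_factor * (double)reltuples);
    return dead_tuples * 6;
}
```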

We would still benefit from a new parameter in the above sketch, but
it would achieve something useful in practice.

That's about 2-3 days work and I know Peter can hack it. So the
situation is not perfection-sought-blocking-good; this is more like a
fairly poor solution being driven through when a better solution is
available within the time and skills available.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#14 Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#13)
Re: autovacuum_work_mem

On Wed, Dec 11, 2013 at 10:41 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

It looks fairly easy to estimate the memory needed for an autovacuum,
since we know the scale factor and the tuple estimate. We can then use
the memory estimate to alter the scheduling of work. And/or we can use
actual memory usage and block autovacuum workers if they need more
memory than is currently available because of other workers.

We would still benefit from a new parameter in the above sketch, but
it would achieve something useful in practice.

That's about 2-3 days work and I know Peter can hack it. So the
situation is not perfection-sought-blocking-good; this is more like a
fairly poor solution being driven through when a better solution is
available within the time and skills available.

I don't agree with that assessment. Anything that involves changing
the scheduling of autovacuum is a major project that will legitimately
provoke much controversy. Extensive testing will be needed to prove
that the new algorithm doesn't perform worse than the current
algorithm in any important cases. I have my doubts about whether that
can be accomplished in an entire release cycle, let alone 2-3 days.
In contrast, the patch proposed does something that is easy to
understand, clearly safe, and an improvement over what we have now.

Quite apart from the amount of development time required, I think that
the idea that we would use the availability of memory to schedule work
is highly suspect. You haven't given any details on what you think
that algorithm might look like, and I doubt that any simple solution
will do. If running more autovacuum workers drives the machine into
swap, then we shouldn't, but we have no way of calculating what size
memory allocation will cause that to happen. But NOT running
autovacuum workers isn't safe either, because it could cause table
bloat that then drives the machine into swap. We have no way of
knowing whether that will happen either. So I think your contention
that we have the necessary information available to make an
intelligent decision is incorrect.

Regardless, whether or not a more complex change is within Peter's
technical capabilities is utterly irrelevant to whether we should
adopt the proposed patch. Your phrasing seems to imply that you would
not ask such a thing of a less-talented individual, and I think that
is utterly wrong. Peter's technical acumen does not give us the right
to ask him to write a more complex patch as a condition of getting a
simpler one accepted.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#15 Josh Berkus
josh@agliodbs.com
In reply to: Peter Geoghegan (#1)
Re: autovacuum_work_mem

On 12/11/2013 09:57 AM, Robert Haas wrote:

I don't agree with that assessment. Anything that involves changing
the scheduling of autovacuum is a major project that will legitimately
provoke much controversy. Extensive testing will be needed to prove
that the new algorithm doesn't perform worse than the current
algorithm in any important cases. I have my doubts about whether that
can be accomplished in an entire release cycle, let alone 2-3 days.
In contrast, the patch proposed does something that is easy to
understand, clearly safe, and an improvement over what we have now.

+1

There is an inherent tuning and troubleshooting challenge in anything
involving a feedback loop.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#16 Peter Geoghegan
pg@heroku.com
In reply to: Simon Riggs (#13)
Re: autovacuum_work_mem

On Wed, Dec 11, 2013 at 7:41 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

That's about 2-3 days work and I know Peter can hack it. So the
situation is not perfection-sought-blocking-good; this is more like a
fairly poor solution being driven through when a better solution is
available within the time and skills available.

I think that that's a very optimistic assessment of the amount of work
required. Even by the rose-tinted standards of software project time
estimation. A ton of data is required to justify fundamental
infrastructural changes like that.

--
Peter Geoghegan

#17 Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#14)
Re: autovacuum_work_mem

On 11 December 2013 17:57, Robert Haas <robertmhaas@gmail.com> wrote:

Extensive testing will be needed to prove
that the new algorithm doesn't perform worse than the current
algorithm in any important cases.

Agreed, but the amount of testing seems equivalent in both cases,
assuming we weren't going to skip it for this patch.

Let me repeat the question, so we are clear...
In what circumstances will the memory usage from multiple concurrent
VACUUMs become a problem? In those circumstances, reducing
autovacuum_work_mem will cause more passes through indexes, dirtying
more pages and elongating the problem workload. I agree that multiple
concurrent VACUUMs could be a problem but this
doesn't solve that, it just makes things worse.

The *only* time this parameter would have any effect appears to be
when it will make matters worse.

With considerable regret, I don't see how this solves the problem at
hand. We can and should do better.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#18 Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#17)
Re: autovacuum_work_mem

On Wed, Dec 11, 2013 at 2:37 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

On 11 December 2013 17:57, Robert Haas <robertmhaas@gmail.com> wrote:

Extensive testing will be needed to prove
that the new algorithm doesn't perform worse than the current
algorithm in any important cases.

Agreed, but the amount of testing seems equivalent in both cases,
assuming we weren't going to skip it for this patch.

Let me repeat the question, so we are clear...
In what circumstances will the memory usage from multiple concurrent
VACUUMs become a problem? In those circumstances, reducing
autovacuum_work_mem will cause more passes through indexes, dirtying
more pages and elongating the problem workload. I agree that multiple
concurrent VACUUMs could be a problem but this
doesn't solve that, it just makes things worse.

That's not the problem the patch is designed to solve. It's intended
for the case where you want to allow more or less memory to autovacuum
than you do for index builds. There's no principled reason that
anyone should want those things to be the same. It is not difficult
to imagine situations in which you would want one set to a very
different value than the other. In particular it seems quite likely
to me that the amount of memory appropriate for index builds might be
vastly more than is needed by autovacuum. For example, in a
data-warehousing environment where updates are rare but large index
builds by the system's sole user are frequent, someone might want to
default index builds to 64GB of RAM (especially after Noah's patch to
allow huge allocations for the tuple array while sorting) but only
need 256MB for autovacuum.
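
Robert's data-warehouse example corresponds to a postgresql.conf
fragment along these lines (values illustrative):

```
maintenance_work_mem = 64GB    # generous budget for manual index builds
autovacuum_work_mem = 256MB    # per-worker cap for routine autovacuum;
                               # -1 would fall back to maintenance_work_mem
```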

In general, I'm reluctant to believe that Peter proposed this patch
just for fun. I assume this is a real-world problem that Heroku
encounters in their environment. If not, well then that's different.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#19 Josh Berkus
josh@agliodbs.com
In reply to: Peter Geoghegan (#1)
Re: autovacuum_work_mem

On 12/11/2013 11:37 AM, Simon Riggs wrote:

On 11 December 2013 17:57, Robert Haas <robertmhaas@gmail.com> wrote:

Extensive testing will be needed to prove
that the new algorithm doesn't perform worse than the current
algorithm in any important cases.

Agreed, but the amount of testing seems equivalent in both cases,
assuming we weren't going to skip it for this patch.

No performance testing is required for this patch. The effect of memory
limits on vacuum is already well-known and well-understood.

With considerable regret, I don't see how this solves the problem at
hand. We can and should do better.

I strongly disagree. The problem we are dealing with currently is that
two resource limits which should have *always* been independent of each
other are currently conflated into a single GUC variable. This forces
users to remember to set maintenance_work_mem interactively every time
they want to run a manual VACUUM, because the setting in postgresql.conf
is needed to tune autovacuum.

In other words, we are having an issue with *non-atomic data*, and this
patch partially fixes that.

Would it be better to have an admissions-control policy engine for
launching autovacuum which takes into account available RAM, estimated
costs of concurrent vacuums, current CPU activity, and which tables are
in cache? Yes. And if you started on that now, you might have it ready
for 9.5.

And, for that matter, accepting this patch by no means blocks doing
something more sophisticated in the future.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#20 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#19)
Re: autovacuum_work_mem

Josh Berkus <josh@agliodbs.com> writes:

And, for that matter, accepting this patch by no means blocks doing
something more sophisticated in the future.

Yeah. I think the only real argument against it is "do we really need
yet another knob?". Since Josh, who's usually the voicer of that
argument, is for this one, I don't have a problem with it.

regards, tom lane

#21 Josh Berkus
josh@agliodbs.com
In reply to: Peter Geoghegan (#1)

#22 Simon Riggs
simon@2ndQuadrant.com
In reply to: Josh Berkus (#19)

#23 Peter Geoghegan
In reply to: Simon Riggs (#22)

#24 Simon Riggs
simon@2ndQuadrant.com
In reply to: Peter Geoghegan (#23)

#25 Bruce Momjian
bruce@momjian.us
In reply to: Josh Berkus (#15)

#26 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Bruce Momjian (#25)

#27 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Heikki Linnakangas (#26)

#28 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Alvaro Herrera (#27)

#29 Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#28)