Re: review: autovacuum_work_mem
On Sun, Oct 20, 2013 at 7:21 AM, Magnus Hagander <magnus@hagander.net> wrote:
On Sun, Oct 20, 2013 at 2:22 AM, Peter Geoghegan <pg@heroku.com> wrote:
It seemed neater to me to create a new flag, so that in principle any
vacuum() code path can request autovacuum_work_mem, rather than having
lazyvacuum.c code call IsAutoVacuumWorkerProcess() for the same
purpose. To date, that's only been done within vacuumlazy.c for things
like logging.
But I'd suggest just a:
int vac_work_mem = (IsAutoVacuumWorkerProcess() && autovacuum_work_mem
!= -1) ? autovacuum_work_mem : maintenance_work_mem;and not sending around a boolean flag through a bunch of places when
it really means just the same thing,
I agree with Magnus here, calling IsAutoVacuumWorkerProcess() seems
cleaner than the new flag and the function parameter changes.
Also, isn't this quite confusing: + # Note: autovacuum only prefers autovacuum_work_mem over maintenance_work_mem + #autovacuum_work_mem = -1 # min 1MB, or -1 to disableWhere does the "only" come from? And we don't really use the term
"prefers over" anywhere else there. And -1 doesn't actually disable
it. I suggest following the pattern of the other parameters that work
the same way, which would then just be:#autovacuum_work_mem = -1 # amount of memory for each autovacuum
worker. -1 means use maintenance_work_mem
+1
here's my review of the v1 patch,
patch features tested:
- regular VACUUM * commands ignore autovacuum_work_mem.
- if autovacuum_work_mem = -1 then maintenance_work_mem is used by autovacuum.
- if autovacuum_work_mem is set then it is used instead of
maintenance_work_mem by autovacuum.
- the autovacuum_work_mem guc has a "sighup" context and the global
variable used in the code is correctly refreshed during a reload.
- a 1MB lower limit for autovacuum_work_mem is enforced.
build (platform is fedora 18):
- patch (context format) applies to current HEAD with offsets (please rebase).
- documentation patches included.
- "make" doesn't produce any extra warnings.
- "make check" passes all tests (no extra regression tests).
questions/comments:
- should the category of the guc be "RESOURCES_MEM" (as in the patch)
or "AUTOVACUUM"? seems to fit in both, but it's really autovacuum
specific.
- could you also add documentation to the autovacuum section of
maintenance.sgml? I think people tuning their autovacuum are likely to
look there for guidance.
- could you update the comments at the top of vacuumlazy.c to reflect
this new feature?
-nigel.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Please reply to the original thread in future (even if the Reply-to
Message-ID is the same, I see this as a separate thread).
On Fri, Nov 15, 2013 at 5:37 PM, Nigel Heron <nheron@querymetrics.com> wrote:
On Sun, Oct 20, 2013 at 7:21 AM, Magnus Hagander <magnus@hagander.net> wrote:
On Sun, Oct 20, 2013 at 2:22 AM, Peter Geoghegan <pg@heroku.com> wrote:
It seemed neater to me to create a new flag, so that in principle any
vacuum() code path can request autovacuum_work_mem, rather than having
lazyvacuum.c code call IsAutoVacuumWorkerProcess() for the same
purpose. To date, that's only been done within vacuumlazy.c for things
like logging.
I agree with Magnus here, calling IsAutoVacuumWorkerProcess() seems
cleaner than the new flag and the function parameter changes.
Well, what I did was analogous to the existing coding for
VACOPT_NOWAIT. But it's hardly worth discussing much further. Patch
updated along these lines.
Also, isn't this quite confusing: + # Note: autovacuum only prefers autovacuum_work_mem over maintenance_work_mem + #autovacuum_work_mem = -1 # min 1MB, or -1 to disableWhere does the "only" come from? And we don't really use the term
"prefers over" anywhere else there.
"Only" could be replaced by "merely" here. In any case, a more
succinct wording is now used.
here's my review of the v1 patch,
patch features tested:
- regular VACUUM * commands ignore autovacuum_work_mem.
- if autovacuum_work_mem = -1 then maintenance_work_mem is used by autovacuum.
- if autovacuum_work_mem is set then it is used instead of
maintenance_work_mem by autovacuum.
- the autovacuum_work_mem guc has a "sighup" context and the global
variable used in the code is correctly refreshed during a reload.
Right.
- a 1MB lower limit for autovacuum_work_mem is enforced.
Right, just like maintenance_work_mem. The difference being that we
cannot enforce it with the same standard mechanism, because we still
need to make -1 values possible. This happens in our callback, in the
style of wal_buffers.
build (platform is fedora 18):
- patch (context format) applies to current HEAD with offsets (please rebase).
It's been rebased.
questions/comments:
- should the category of the guc be "RESOURCES_MEM" (as in the patch)
or "AUTOVACUUM"? seems to fit in both, but it's really autovacuum
specific.
Well, log_autovacuum_min_duration is also very autovacuum specific,
but is LOGGING_WHAT, so I've left autovacuum_work_mem RESOURCES_MEM.
It's not a fixed allocation of shared memory.
- could you also add documentation to the autovacuum section of
maintenance.sgml? I think people tuning their autovacuum are likely to
look there for guidance.
I don't want to add a reference there as long as there is no
maintenance_work_mem reference. I think that perform.sgml is a more
likely candidate, though I haven't added anything there either, since
it's just talking about increasing maintenance_work_mem in a
controlled context.
- could you update the comments at the top of vacuumlazy.c to reflect
this new feature?
Seems reasonable.
Revision attached.
--
Peter Geoghegan
Attachments:
autovacuum_work_mem.v2.2013_11_18.patchtext/x-patch; charset=US-ASCII; name=autovacuum_work_mem.v2.2013_11_18.patchDownload
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
new file mode 100644
index 77a9303..1200ad6
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
*************** include 'filename'
*** 1156,1163 ****
</para>
<para>
Note that when autovacuum runs, up to
! <xref linkend="guc-autovacuum-max-workers"> times this memory may be
! allocated, so be careful not to set the default value too high.
</para>
</listitem>
</varlistentry>
--- 1156,1181 ----
</para>
<para>
Note that when autovacuum runs, up to
! <xref linkend="guc-autovacuum-max-workers"> times this memory
! may be allocated, so be careful not to set the default value
! too high. It may be useful to control for this by separately
! setting <xref linkend="guc-autovacuum-work-mem">.
! </para>
! </listitem>
! </varlistentry>
!
! <varlistentry id="guc-autovacuum-work-mem" xreflabel="autovacuum_work_mem">
! <term><varname>autovacuum_work_mem</varname> (<type>integer</type>)</term>
! <indexterm>
! <primary><varname>autovacuum_work_mem</> configuration parameter</primary>
! </indexterm>
! <listitem>
! <para>
! Specifies the maximum amount of memory to be used by each
! autovacuum worker process. It defaults to -1, indicating that
! the value of <xref linkend="guc-maintenance-work-mem"> should
! be used instead. The setting has no effect on the behavior of
! <command>VACUUM</command> when run in other contexts.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
new file mode 100644
index e1d6d1c..62bd2f0
*** a/src/backend/commands/vacuumlazy.c
--- b/src/backend/commands/vacuumlazy.c
***************
*** 10,22 ****
* relations with finite memory space usage. To do that, we set upper bounds
* on the number of tuples and pages we will keep track of at once.
*
! * We are willing to use at most maintenance_work_mem memory space to keep
! * track of dead tuples. We initially allocate an array of TIDs of that size,
! * with an upper limit that depends on table size (this limit ensures we don't
! * allocate a huge area uselessly for vacuuming small tables). If the array
! * threatens to overflow, we suspend the heap scan phase and perform a pass of
! * index cleanup and page compaction, then resume the heap scan with an empty
! * TID array.
*
* If we're processing a table with no indexes, we can just vacuum each page
* as we go; there's no need to save up multiple tuples to minimize the number
--- 10,22 ----
* relations with finite memory space usage. To do that, we set upper bounds
* on the number of tuples and pages we will keep track of at once.
*
! * We are willing to use at most maintenance_work_mem (or perhaps
! * autovacuum_work_mem) memory space to keep track of dead tuples. We
! * initially allocate an array of TIDs of that size, with an upper limit that
! * depends on table size (this limit ensures we don't allocate a huge area
! * uselessly for vacuuming small tables). If the array threatens to overflow,
! * we suspend the heap scan phase and perform a pass of index cleanup and page
! * compaction, then resume the heap scan with an empty TID array.
*
* If we're processing a table with no indexes, we can just vacuum each page
* as we go; there's no need to save up multiple tuples to minimize the number
*************** static void
*** 1581,1590 ****
lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
{
long maxtuples;
if (vacrelstats->hasindex)
{
! maxtuples = (maintenance_work_mem * 1024L) / sizeof(ItemPointerData);
maxtuples = Min(maxtuples, INT_MAX);
maxtuples = Min(maxtuples, MaxAllocSize / sizeof(ItemPointerData));
--- 1581,1593 ----
lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
{
long maxtuples;
+ int vac_work_mem = IsAutoVacuumWorkerProcess() &&
+ autovacuum_work_mem != -1 ?
+ autovacuum_work_mem : maintenance_work_mem;
if (vacrelstats->hasindex)
{
! maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
maxtuples = Min(maxtuples, INT_MAX);
maxtuples = Min(maxtuples, MaxAllocSize / sizeof(ItemPointerData));
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
new file mode 100644
index 8c14d0f..0b881c0
*** a/src/backend/postmaster/autovacuum.c
--- b/src/backend/postmaster/autovacuum.c
***************
*** 110,115 ****
--- 110,116 ----
*/
bool autovacuum_start_daemon = false;
int autovacuum_max_workers;
+ int autovacuum_work_mem = -1;
int autovacuum_naptime;
int autovacuum_vac_thresh;
double autovacuum_vac_scale;
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
new file mode 100644
index 54d8078..d33048f
*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
*************** static const char *show_tcp_keepalives_c
*** 192,197 ****
--- 192,198 ----
static bool check_maxconnections(int *newval, void **extra, GucSource source);
static bool check_max_worker_processes(int *newval, void **extra, GucSource source);
static bool check_autovacuum_max_workers(int *newval, void **extra, GucSource source);
+ static bool check_autovacuum_work_mem(int *newval, void **extra, GucSource source);
static bool check_effective_io_concurrency(int *newval, void **extra, GucSource source);
static void assign_effective_io_concurrency(int newval, void *extra);
static void assign_pgstat_temp_directory(const char *newval, void *extra);
*************** static struct config_int ConfigureNamesI
*** 2347,2352 ****
--- 2348,2364 ----
},
{
+ {"autovacuum_work_mem", PGC_SIGHUP, RESOURCES_MEM,
+ gettext_noop("Sets the maximum memory to be used by each autovacuum worker process."),
+ NULL,
+ GUC_UNIT_KB
+ },
+ &autovacuum_work_mem,
+ -1, -1, MAX_KILOBYTES,
+ check_autovacuum_work_mem, NULL, NULL
+ },
+
+ {
{"tcp_keepalives_idle", PGC_USERSET, CLIENT_CONN_OTHER,
gettext_noop("Time between issuing TCP keepalives."),
gettext_noop("A value of 0 uses the system default."),
*************** check_autovacuum_max_workers(int *newval
*** 8751,8756 ****
--- 8763,8791 ----
return true;
}
+ static bool
+ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
+ {
+ /*
+ * -1 indicates fallback.
+ *
+ * If we haven't yet changed the boot_val default of -1, just let it be.
+ * Autovacuum will look to maintenance_work_mem instead.
+ */
+ if (*newval == -1)
+ return true;
+
+ /*
+ * We clamp manually-set values to at least 1MB. Since
+ * maintenance_work_mem is always set to at least this value, do the same
+ * here.
+ */
+ if (*newval < 1024)
+ *newval = 1024;
+
+ return true;
+ }
+
static bool
check_max_worker_processes(int *newval, void **extra, GucSource source)
{
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
new file mode 100644
index 34a2d05..4737ede
*** a/src/backend/utils/misc/postgresql.conf.sample
--- b/src/backend/utils/misc/postgresql.conf.sample
***************
*** 122,127 ****
--- 122,128 ----
# actively intend to use prepared transactions.
#work_mem = 1MB # min 64kB
#maintenance_work_mem = 16MB # min 1MB
+ #autovacuum_work_mem = -1 # min 1MB, or -1 to use maintenance_work_mem
#max_stack_depth = 2MB # min 100kB
#dynamic_shared_memory_type = posix # the default is the first option
# supported by the operating system:
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
new file mode 100644
index e96f07a..92560fe
*** a/src/include/postmaster/autovacuum.h
--- b/src/include/postmaster/autovacuum.h
***************
*** 18,23 ****
--- 18,24 ----
/* GUC variables */
extern bool autovacuum_start_daemon;
extern int autovacuum_max_workers;
+ extern int autovacuum_work_mem;
extern int autovacuum_naptime;
extern int autovacuum_vac_thresh;
extern double autovacuum_vac_scale;
On Mon, Nov 18, 2013 at 11:36 PM, Peter Geoghegan <pg@heroku.com> wrote:
Please reply to the original thread in future (even if the Reply-to
Message-ID is the same, I see this as a separate thread).
sorry about that, when i added "review" to the subject gmail removed
the thread info.
for reference the original thread started here:
</messages/by-id/CAM3SWZTwLA8Ef2DTvbwM1b1zEVU_eN3N4rReGNU5_zFyjNGi6w@mail.gmail.com>
Revision attached.
--
Peter Geoghegan
Review for Peter Geoghegan's v2 patch in CF 2013-11:
https://commitfest.postgresql.org/action/patch_view?id=1262
Submission review
-----------------
* Is the patch in a patch format which has context?
Yes
* Does it apply cleanly to the current git master
(04eee1fa9ee80dabf7cf4b8b9106897272e9b291)?
patching file src/backend/commands/vacuumlazy.c
Hunk #2 succeeded at 1582 (offset 1 line).
* Does it include reasonable tests, necessary doc patches, etc?
Documentation patches included.
No additional tests.
Usability review
-----------------
* Does the patch actually implement that?
Yes.
* Do we want that?
Yes. The original thread has references, see
</messages/by-id/CAM3SWZTwLA8Ef2DTvbwM1b1zEVU_eN3N4rReGNU5_zFyjNGi6w@mail.gmail.com>
* Do we already have it?
No.
* Does it follow SQL spec, or the community-agreed behavior?
SQL spec: n/a
community: Yes. The original thread has references, see
</messages/by-id/CAM3SWZTwLA8Ef2DTvbwM1b1zEVU_eN3N4rReGNU5_zFyjNGi6w@mail.gmail.com>
* Does it include pg_dump support (if applicable)?
n/a
Feature test
-----------------
* Does the feature work as advertised?
Yes.
* Are there corner cases the author has failed to consider?
None that i can see.
* Are there any assertion failures or crashes?
No.
Performance review
-----------------
* Does the patch slow down simple tests?
No.
* If it claims to improve performance, does it?
n/a
* Does it slow down other things?
No.
Coding review
-----------------
* Does it follow the project coding guidelines?
Yes.
* Are there portability issues?
None that i can see.
* Will it work on Windows/BSD etc?
None that i can see. (I only tested it on linux though)
* Does it do what it says, correctly?
Yes.
* Does it produce compiler warnings?
No.
* Can you make it crash?
No.
Architecture review
-----------------
* Is everything done in a way that fits together coherently with other
features/modules?
Yes.
* Are there interdependencies that can cause problems?
No.
-nigel.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers