Should vacuum process config file reload more often
Hi,
Users may wish to speed up long-running vacuum of a large table by
decreasing autovacuum_vacuum_cost_delay/vacuum_cost_delay, however the
config file is only reloaded between tables (for autovacuum) or after
the statement (for explicit vacuum). This has been brought up for
autovacuum in [1]/messages/by-id/22CA91B4-D341-4075-BD3C-4BAB52AF1E80@amazon.com.
Andres suggested that it might be possible to check ConfigReloadPending
in vacuum_delay_point(), so I thought I would draft a rough patch and
start a discussion.
Since vacuum_delay_point() is also called by analyze and we do not want
to reload the configuration file if we are in a user transaction, I
widened the scope of the in_outer_xact variable in vacuum() and allowed
analyze in a user transaction to default to the current configuration
file reload cadence in PostgresMain().
I don't think I can set and leave vac_in_outer_xact the way I am doing
it in this patch, since I use vac_in_outer_xact in vacuum_delay_point(),
which I believe is reachable from codepaths that would not have called
vacuum(). It seems that if a backend sets it, the outer transaction
commits, and then the backend ends up calling vacuum_delay_point() in a
different way later, it wouldn't be quite right.
Apart from this, one higher level question I have is if there are other
gucs whose modification would make reloading the configuration file
during vacuum/analyze unsafe.
- Melanie
[1]: /messages/by-id/22CA91B4-D341-4075-BD3C-4BAB52AF1E80@amazon.com
Attachments:
v1-0001-reload-config-file-vac.patchtext/x-patch; charset=US-ASCII; name=v1-0001-reload-config-file-vac.patchDownload+14-8
Hi, Melanie!
On Fri, 24 Feb 2023 at 02:08, Melanie Plageman
<melanieplageman@gmail.com> wrote:
Hi,
Users may wish to speed up long-running vacuum of a large table by
decreasing autovacuum_vacuum_cost_delay/vacuum_cost_delay, however the
config file is only reloaded between tables (for autovacuum) or after
the statement (for explicit vacuum). This has been brought up for
autovacuum in [1].Andres suggested that it might be possible to check ConfigReloadPending
in vacuum_delay_point(), so I thought I would draft a rough patch and
start a discussion.Since vacuum_delay_point() is also called by analyze and we do not want
to reload the configuration file if we are in a user transaction, I
widened the scope of the in_outer_xact variable in vacuum() and allowed
analyze in a user transaction to default to the current configuration
file reload cadence in PostgresMain().I don't think I can set and leave vac_in_outer_xact the way I am doing
it in this patch, since I use vac_in_outer_xact in vacuum_delay_point(),
which I believe is reachable from codepaths that would not have called
vacuum(). It seems that if a backend sets it, the outer transaction
commits, and then the backend ends up calling vacuum_delay_point() in a
different way later, it wouldn't be quite right.Apart from this, one higher level question I have is if there are other
gucs whose modification would make reloading the configuration file
during vacuum/analyze unsafe.
I have a couple of small questions:
Can this patch also read the current GUC value if it's modified by the
SET command, without editing config file?
What will be if we modify config file with mistakes? (When we try to
start the cluster with an erroneous config file it will fail to start,
not sure about re-read config)
Overall the proposal seems legit and useful.
Kind regards,
Pavel Borisov
Hi,
On Fri, Feb 24, 2023 at 7:08 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
Hi,
Users may wish to speed up long-running vacuum of a large table by
decreasing autovacuum_vacuum_cost_delay/vacuum_cost_delay, however the
config file is only reloaded between tables (for autovacuum) or after
the statement (for explicit vacuum). This has been brought up for
autovacuum in [1].Andres suggested that it might be possible to check ConfigReloadPending
in vacuum_delay_point(), so I thought I would draft a rough patch and
start a discussion.
In vacuum_delay_point(), we need to update VacuumCostActive too if necessary.
Apart from this, one higher level question I have is if there are other
gucs whose modification would make reloading the configuration file
during vacuum/analyze unsafe.
As far as I know there are not such GUC parameters in the core but
there might be in third-party table AM and index AM extensions. Also,
I'm concerned that allowing to change any GUC parameters during
vacuum/analyze could be a foot-gun in the future. When modifying
vacuum/analyze-related codes, we have to consider the case where any
GUC parameters could be changed during vacuum/analyze. I guess it
would be better to apply the parameter changes for only vacuum delay
related parameters. For example, autovacuum launcher advertises the
values of the vacuum delay parameters on the shared memory not only
for autovacuum processes but also for manual vacuum/analyze processes.
Both processes can update them accordingly in vacuum_delay_point().
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Hi,
On 2023-02-27 23:11:53 +0900, Masahiko Sawada wrote:
As far as I know there are not such GUC parameters in the core but
there might be in third-party table AM and index AM extensions.
We already reload in a pretty broad range of situations, so I'm not sure
there's a lot that could be unsafe that isn't already.
Also, I'm concerned that allowing to change any GUC parameters during
vacuum/analyze could be a foot-gun in the future. When modifying
vacuum/analyze-related codes, we have to consider the case where any GUC
parameters could be changed during vacuum/analyze.
What kind of scenario are you thinking of?
I guess it would be better to apply the parameter changes for only vacuum
delay related parameters. For example, autovacuum launcher advertises the
values of the vacuum delay parameters on the shared memory not only for
autovacuum processes but also for manual vacuum/analyze processes. Both
processes can update them accordingly in vacuum_delay_point().
I don't think this is a good idea. It'd introduce a fair amount of complexity
without, as far as I can tell, a benefit.
Greetings,
Andres Freund
On Tue, Feb 28, 2023 at 10:21 AM Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2023-02-27 23:11:53 +0900, Masahiko Sawada wrote:
As far as I know there are not such GUC parameters in the core but
there might be in third-party table AM and index AM extensions.We already reload in a pretty broad range of situations, so I'm not sure
there's a lot that could be unsafe that isn't already.Also, I'm concerned that allowing to change any GUC parameters during
vacuum/analyze could be a foot-gun in the future. When modifying
vacuum/analyze-related codes, we have to consider the case where any GUC
parameters could be changed during vacuum/analyze.What kind of scenario are you thinking of?
For example, I guess we will need to take care of changes of
maintenance_work_mem. Currently we initialize the dead tuple space at
the beginning of lazy vacuum, but perhaps we would need to
enlarge/shrink it based on the new value?
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Thanks for the feedback and questions, Pavel!
On Fri, Feb 24, 2023 at 3:43 AM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
I have a couple of small questions:
Can this patch also read the current GUC value if it's modified by the
SET command, without editing config file?
If a user sets a guc like vacuum_cost_limit with SET, this only modifies
the value for that session. That wouldn't affect the in-progress vacuum
you initiated from that session because you would have to wait for the
vacuum to complete before issuing the SET command.
What will be if we modify config file with mistakes? (When we try to
start the cluster with an erroneous config file it will fail to start,
not sure about re-read config)
If you manually add an invalid valid to your postgresql.conf, when it is
reloaded, the existing value will remain unchanged and an error will be
logged. If you attempt to change the guc value to an invalid value with
ALTER SYSTEM, the ALTER SYSTEM command will fail and the existing value
will remain unchanged.
- Melanie
Hi,
On 2023-02-28 11:16:45 +0900, Masahiko Sawada wrote:
On Tue, Feb 28, 2023 at 10:21 AM Andres Freund <andres@anarazel.de> wrote:
On 2023-02-27 23:11:53 +0900, Masahiko Sawada wrote:
As far as I know there are not such GUC parameters in the core but
there might be in third-party table AM and index AM extensions.We already reload in a pretty broad range of situations, so I'm not sure
there's a lot that could be unsafe that isn't already.Also, I'm concerned that allowing to change any GUC parameters during
vacuum/analyze could be a foot-gun in the future. When modifying
vacuum/analyze-related codes, we have to consider the case where any GUC
parameters could be changed during vacuum/analyze.What kind of scenario are you thinking of?
For example, I guess we will need to take care of changes of
maintenance_work_mem. Currently we initialize the dead tuple space at
the beginning of lazy vacuum, but perhaps we would need to
enlarge/shrink it based on the new value?
I don't think we need to do anything about that initially, just because the
config can be changed in a more granular way, doesn't mean we have to react to
every change for the current operation.
Greetings,
Andres Freund
On Mon, Feb 27, 2023 at 9:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Feb 24, 2023 at 7:08 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:Users may wish to speed up long-running vacuum of a large table by
decreasing autovacuum_vacuum_cost_delay/vacuum_cost_delay, however the
config file is only reloaded between tables (for autovacuum) or after
the statement (for explicit vacuum). This has been brought up for
autovacuum in [1].Andres suggested that it might be possible to check ConfigReloadPending
in vacuum_delay_point(), so I thought I would draft a rough patch and
start a discussion.In vacuum_delay_point(), we need to update VacuumCostActive too if necessary.
Yes, good point. Thank you!
On Thu, Feb 23, 2023 at 5:08 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
I don't think I can set and leave vac_in_outer_xact the way I am doing
it in this patch, since I use vac_in_outer_xact in vacuum_delay_point(),
which I believe is reachable from codepaths that would not have called
vacuum(). It seems that if a backend sets it, the outer transaction
commits, and then the backend ends up calling vacuum_delay_point() in a
different way later, it wouldn't be quite right.
Perhaps I could just set in_outer_xact to false in the PG_FINALLY()
section in vacuum() to avoid this problem.
On Wed, Mar 1, 2023 at 7:15 PM Andres Freund <andres@anarazel.de> wrote:
On 2023-02-28 11:16:45 +0900, Masahiko Sawada wrote:
On Tue, Feb 28, 2023 at 10:21 AM Andres Freund <andres@anarazel.de> wrote:
On 2023-02-27 23:11:53 +0900, Masahiko Sawada wrote:
Also, I'm concerned that allowing to change any GUC parameters during
vacuum/analyze could be a foot-gun in the future. When modifying
vacuum/analyze-related codes, we have to consider the case where any GUC
parameters could be changed during vacuum/analyze.What kind of scenario are you thinking of?
For example, I guess we will need to take care of changes of
maintenance_work_mem. Currently we initialize the dead tuple space at
the beginning of lazy vacuum, but perhaps we would need to
enlarge/shrink it based on the new value?I don't think we need to do anything about that initially, just because the
config can be changed in a more granular way, doesn't mean we have to react to
every change for the current operation.
Perhaps we can mention in the docs that a change to maintenance_work_mem
will not take effect in the middle of vacuuming a table. But, Ithink it probably
isn't needed.
On another topic, I've just realized that when autovacuuming we only
update tab->at_vacuum_cost_delay/limit from
autovacuum_vacuum_cost_delay/limit for each table (in
table_recheck_autovac()) and then use that to update
MyWorkerInfo->wi_cost_delay/limit. MyWorkerInfo->wi_cost_delay/limit is
what is used to update VacuumCostDelay/Limit in AutoVacuumUpdateDelay().
So, even if we reload the config file in vacuum_delay_point(), if we
don't use the new value of autovacuum_vacuum_cost_delay/limit it will
have no effect for autovacuum.
I started writing a little helper that could be used to update these
workerinfo->wi_cost_delay/limit in vacuum_delay_point(), but I notice
when they are first set, we consider the autovacuum table options. So,
I suppose I would need to consider these when updating
wi_cost_delay/limit later as well? (during vacuum_delay_point() or
in AutoVacuumUpdateDelay())
I wasn't quite sure because I found these chained ternaries rather
difficult to interpret, but I think table_recheck_autovac() is saying
that the autovacuum table options override all other values for
vac_cost_delay?
vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
? avopts->vacuum_cost_delay
: (autovacuum_vac_cost_delay >= 0)
? autovacuum_vac_cost_delay
: VacuumCostDelay;
i.e. this?
if (avopts && avopts->vacuum_cost_delay >= 0)
vac_cost_delay = avopts->vacuum_cost_delay;
else if (autovacuum_vac_cost_delay >= 0)
vac_cost_delay = autovacuum_vacuum_cost_delay;
else
vac_cost_delay = VacuumCostDelay
- Melanie
On Thu, Mar 2, 2023 at 5:45 AM Andres Freund <andres@anarazel.de> wrote:
On 2023-02-28 11:16:45 +0900, Masahiko Sawada wrote:
On Tue, Feb 28, 2023 at 10:21 AM Andres Freund <andres@anarazel.de> wrote:
On 2023-02-27 23:11:53 +0900, Masahiko Sawada wrote:
As far as I know there are not such GUC parameters in the core but
there might be in third-party table AM and index AM extensions.We already reload in a pretty broad range of situations, so I'm not sure
there's a lot that could be unsafe that isn't already.Also, I'm concerned that allowing to change any GUC parameters during
vacuum/analyze could be a foot-gun in the future. When modifying
vacuum/analyze-related codes, we have to consider the case where any GUC
parameters could be changed during vacuum/analyze.What kind of scenario are you thinking of?
For example, I guess we will need to take care of changes of
maintenance_work_mem. Currently we initialize the dead tuple space at
the beginning of lazy vacuum, but perhaps we would need to
enlarge/shrink it based on the new value?I don't think we need to do anything about that initially, just because the
config can be changed in a more granular way, doesn't mean we have to react to
every change for the current operation.
+1. I also don't see the need to do anything for this case.
--
With Regards,
Amit Kapila.
On Thu, Mar 2, 2023 at 10:41 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
On Mon, Feb 27, 2023 at 9:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Feb 24, 2023 at 7:08 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:Users may wish to speed up long-running vacuum of a large table by
decreasing autovacuum_vacuum_cost_delay/vacuum_cost_delay, however the
config file is only reloaded between tables (for autovacuum) or after
the statement (for explicit vacuum). This has been brought up for
autovacuum in [1].Andres suggested that it might be possible to check ConfigReloadPending
in vacuum_delay_point(), so I thought I would draft a rough patch and
start a discussion.In vacuum_delay_point(), we need to update VacuumCostActive too if necessary.
Yes, good point. Thank you!
On Thu, Feb 23, 2023 at 5:08 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:I don't think I can set and leave vac_in_outer_xact the way I am doing
it in this patch, since I use vac_in_outer_xact in vacuum_delay_point(),
which I believe is reachable from codepaths that would not have called
vacuum(). It seems that if a backend sets it, the outer transaction
commits, and then the backend ends up calling vacuum_delay_point() in a
different way later, it wouldn't be quite right.Perhaps I could just set in_outer_xact to false in the PG_FINALLY()
section in vacuum() to avoid this problem.On Wed, Mar 1, 2023 at 7:15 PM Andres Freund <andres@anarazel.de> wrote:
On 2023-02-28 11:16:45 +0900, Masahiko Sawada wrote:
On Tue, Feb 28, 2023 at 10:21 AM Andres Freund <andres@anarazel.de> wrote:
On 2023-02-27 23:11:53 +0900, Masahiko Sawada wrote:
Also, I'm concerned that allowing to change any GUC parameters during
vacuum/analyze could be a foot-gun in the future. When modifying
vacuum/analyze-related codes, we have to consider the case where any GUC
parameters could be changed during vacuum/analyze.What kind of scenario are you thinking of?
For example, I guess we will need to take care of changes of
maintenance_work_mem. Currently we initialize the dead tuple space at
the beginning of lazy vacuum, but perhaps we would need to
enlarge/shrink it based on the new value?I don't think we need to do anything about that initially, just because the
config can be changed in a more granular way, doesn't mean we have to react to
every change for the current operation.Perhaps we can mention in the docs that a change to maintenance_work_mem
will not take effect in the middle of vacuuming a table. But, Ithink it probably
isn't needed.
Agreed.
On another topic, I've just realized that when autovacuuming we only
update tab->at_vacuum_cost_delay/limit from
autovacuum_vacuum_cost_delay/limit for each table (in
table_recheck_autovac()) and then use that to update
MyWorkerInfo->wi_cost_delay/limit. MyWorkerInfo->wi_cost_delay/limit is
what is used to update VacuumCostDelay/Limit in AutoVacuumUpdateDelay().
So, even if we reload the config file in vacuum_delay_point(), if we
don't use the new value of autovacuum_vacuum_cost_delay/limit it will
have no effect for autovacuum.
Right, but IIUC wi_cost_limit (and VacuumCostDelayLimit) might be
updated. After the autovacuum launcher reloads the config file, it
calls autovac_balance_cost() that updates that value of active
workers. I'm not sure why we don't update workers' wi_cost_delay,
though.
I started writing a little helper that could be used to update these
workerinfo->wi_cost_delay/limit in vacuum_delay_point(),
Since we set vacuum delay parameters for autovacuum workers so that we
ration out I/O equally, I think we should keep the current mechanism
that the autovacuum launcher sets workers' delay parameters and they
update accordingly.
but I notice
when they are first set, we consider the autovacuum table options. So,
I suppose I would need to consider these when updating
wi_cost_delay/limit later as well? (during vacuum_delay_point() or
in AutoVacuumUpdateDelay())I wasn't quite sure because I found these chained ternaries rather
difficult to interpret, but I think table_recheck_autovac() is saying
that the autovacuum table options override all other values for
vac_cost_delay?vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
? avopts->vacuum_cost_delay
: (autovacuum_vac_cost_delay >= 0)
? autovacuum_vac_cost_delay
: VacuumCostDelay;i.e. this?
if (avopts && avopts->vacuum_cost_delay >= 0)
vac_cost_delay = avopts->vacuum_cost_delay;
else if (autovacuum_vac_cost_delay >= 0)
vac_cost_delay = autovacuum_vacuum_cost_delay;
else
vac_cost_delay = VacuumCostDelay
Yes, if the table has autovacuum table options, we use these values
and the table is excluded from the balancing algorithm I mentioned
above. See the code from table_recheck_autovac(),
/*
* If any of the cost delay parameters has been set individually for
* this table, disable the balancing algorithm.
*/
tab->at_dobalance =
!(avopts && (avopts->vacuum_cost_limit > 0 ||
avopts->vacuum_cost_delay > 0));
So if the table has autovacuum table options, the vacuum delay
parameters probably should be updated by ALTER TABLE, not by reloading
the config file.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Thu, Mar 2, 2023 at 2:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Mar 2, 2023 at 10:41 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:On another topic, I've just realized that when autovacuuming we only
update tab->at_vacuum_cost_delay/limit from
autovacuum_vacuum_cost_delay/limit for each table (in
table_recheck_autovac()) and then use that to update
MyWorkerInfo->wi_cost_delay/limit. MyWorkerInfo->wi_cost_delay/limit is
what is used to update VacuumCostDelay/Limit in AutoVacuumUpdateDelay().
So, even if we reload the config file in vacuum_delay_point(), if we
don't use the new value of autovacuum_vacuum_cost_delay/limit it will
have no effect for autovacuum.Right, but IIUC wi_cost_limit (and VacuumCostDelayLimit) might be
updated. After the autovacuum launcher reloads the config file, it
calls autovac_balance_cost() that updates that value of active
workers. I'm not sure why we don't update workers' wi_cost_delay,
though.
Ah yes, I didn't realize this. Thanks. I went back and did more code
reading/analysis, and I see no reason why we shouldn't update
worker->wi_cost_delay to the new value of autovacuum_vac_cost_delay in
autovac_balance_cost(). Then, as you said, the autovac launcher will
call autovac_balance_cost() when it reloads the configuration file.
Then, the next time the autovac worker calls AutoVacuumUpdateDelay(), it
will update VacuumCostDelay.
I started writing a little helper that could be used to update these
workerinfo->wi_cost_delay/limit in vacuum_delay_point(),Since we set vacuum delay parameters for autovacuum workers so that we
ration out I/O equally, I think we should keep the current mechanism
that the autovacuum launcher sets workers' delay parameters and they
update accordingly.
Yes, agreed, it should go in the same place as where we update
wi_cost_limit (autovac_balance_cost()). I think we should potentially
rename autovac_balance_cost() because its name and all its comments
point to its only purpose being to balance the total of the workers
wi_cost_limits to no more than autovacuum_vacuum_cost_limit. And the
autovacuum_vacuum_cost_delay doesn't need to be balanced in this way.
Though, since this change on its own would make autovacuum pick up new
values of autovacuum_vacuum_cost_limit (without having the worker reload
the config file), I wonder if it makes sense to try and have
vacuum_delay_point() only reload the config file if it is an explicit
vacuum or an analyze not being run in an outer transaction (to avoid
overhead of reloading config file)?
The lifecycle of this different vacuum delay-related gucs and how it
differs between autovacuum workers and explicit vacuum is quite tangled
already, though.
but I notice
when they are first set, we consider the autovacuum table options. So,
I suppose I would need to consider these when updating
wi_cost_delay/limit later as well? (during vacuum_delay_point() or
in AutoVacuumUpdateDelay())I wasn't quite sure because I found these chained ternaries rather
difficult to interpret, but I think table_recheck_autovac() is saying
that the autovacuum table options override all other values for
vac_cost_delay?vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
? avopts->vacuum_cost_delay
: (autovacuum_vac_cost_delay >= 0)
? autovacuum_vac_cost_delay
: VacuumCostDelay;i.e. this?
if (avopts && avopts->vacuum_cost_delay >= 0)
vac_cost_delay = avopts->vacuum_cost_delay;
else if (autovacuum_vac_cost_delay >= 0)
vac_cost_delay = autovacuum_vacuum_cost_delay;
else
vac_cost_delay = VacuumCostDelayYes, if the table has autovacuum table options, we use these values
and the table is excluded from the balancing algorithm I mentioned
above. See the code from table_recheck_autovac(),/*
* If any of the cost delay parameters has been set individually for
* this table, disable the balancing algorithm.
*/
tab->at_dobalance =
!(avopts && (avopts->vacuum_cost_limit > 0 ||
avopts->vacuum_cost_delay > 0));So if the table has autovacuum table options, the vacuum delay
parameters probably should be updated by ALTER TABLE, not by reloading
the config file.
Yes, if the table has autovacuum table options, I think the user is
out-of-luck until the relation is done being vacuumed because the ALTER
TABLE will need to get a lock.
- Melanie
On Thu, Mar 2, 2023 at 6:37 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
On Thu, Mar 2, 2023 at 2:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Mar 2, 2023 at 10:41 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:On another topic, I've just realized that when autovacuuming we only
update tab->at_vacuum_cost_delay/limit from
autovacuum_vacuum_cost_delay/limit for each table (in
table_recheck_autovac()) and then use that to update
MyWorkerInfo->wi_cost_delay/limit. MyWorkerInfo->wi_cost_delay/limit is
what is used to update VacuumCostDelay/Limit in AutoVacuumUpdateDelay().
So, even if we reload the config file in vacuum_delay_point(), if we
don't use the new value of autovacuum_vacuum_cost_delay/limit it will
have no effect for autovacuum.Right, but IIUC wi_cost_limit (and VacuumCostDelayLimit) might be
updated. After the autovacuum launcher reloads the config file, it
calls autovac_balance_cost() that updates that value of active
workers. I'm not sure why we don't update workers' wi_cost_delay,
though.Ah yes, I didn't realize this. Thanks. I went back and did more code
reading/analysis, and I see no reason why we shouldn't update
worker->wi_cost_delay to the new value of autovacuum_vac_cost_delay in
autovac_balance_cost(). Then, as you said, the autovac launcher will
call autovac_balance_cost() when it reloads the configuration file.
Then, the next time the autovac worker calls AutoVacuumUpdateDelay(), it
will update VacuumCostDelay.I started writing a little helper that could be used to update these
workerinfo->wi_cost_delay/limit in vacuum_delay_point(),Since we set vacuum delay parameters for autovacuum workers so that we
ration out I/O equally, I think we should keep the current mechanism
that the autovacuum launcher sets workers' delay parameters and they
update accordingly.Yes, agreed, it should go in the same place as where we update
wi_cost_limit (autovac_balance_cost()). I think we should potentially
rename autovac_balance_cost() because its name and all its comments
point to its only purpose being to balance the total of the workers
wi_cost_limits to no more than autovacuum_vacuum_cost_limit. And the
autovacuum_vacuum_cost_delay doesn't need to be balanced in this way.Though, since this change on its own would make autovacuum pick up new
values of autovacuum_vacuum_cost_limit (without having the worker reload
the config file), I wonder if it makes sense to try and have
vacuum_delay_point() only reload the config file if it is an explicit
vacuum or an analyze not being run in an outer transaction (to avoid
overhead of reloading config file)?The lifecycle of this different vacuum delay-related gucs and how it
differs between autovacuum workers and explicit vacuum is quite tangled
already, though.
So, I've attached a new version of the patch which is quite different
from the previous versions.
In this version I've removed wi_cost_delay from WorkerInfoData. There is
no synchronization of cost_delay amongst workers, so there is no reason
to keep it in shared memory.
One consequence of not updating VacuumCostDelay from wi_cost_delay is
that we have to have a way to keep track of whether or not autovacuum
table options are in use.
This patch does this in a cringeworthy way. I added two global
variables, one to track whether or not cost delay table options are in
use and the other to store the value of the table option cost delay. I
didn't want to use a single variable with a special value to indicate
that table option cost delay is in use because
autovacuum_vacuum_cost_delay already has special values that mean
certain things. My code needs a better solution.
It is worth mentioning that I think that in master,
AutoVacuumUpdateDelay() was incorrectly reading wi_cost_limit and
wi_cost_delay from shared memory without holding a lock.
I've added in a shared lock for reading from wi_cost_limit in this
patch. However, AutoVacuumUpdateLimit() is called unconditionally in
vacuum_delay_point(), which is called quite often (per block-ish), so I
was trying to think if there is a way we could avoid having to check
this shared memory variable on every call to vacuum_delay_point().
Rebalances shouldn't happen very often (done by the launcher when a new
worker is launched and by workers between vacuuming tables). Maybe we
can read from it less frequently?
Also not sure how the patch interacts with failsafe autovac and parallel
vacuum.
- Melanie
Attachments:
v2-0001-Reload-config-file-more-often-while-vacuuming.patchtext/x-patch; charset=US-ASCII; name=v2-0001-Reload-config-file-more-often-while-vacuuming.patchDownload+104-34
On Mon, Mar 6, 2023 at 5:26 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
On Thu, Mar 2, 2023 at 6:37 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:On Thu, Mar 2, 2023 at 2:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Mar 2, 2023 at 10:41 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:On another topic, I've just realized that when autovacuuming we only
update tab->at_vacuum_cost_delay/limit from
autovacuum_vacuum_cost_delay/limit for each table (in
table_recheck_autovac()) and then use that to update
MyWorkerInfo->wi_cost_delay/limit. MyWorkerInfo->wi_cost_delay/limit is
what is used to update VacuumCostDelay/Limit in AutoVacuumUpdateDelay().
So, even if we reload the config file in vacuum_delay_point(), if we
don't use the new value of autovacuum_vacuum_cost_delay/limit it will
have no effect for autovacuum.Right, but IIUC wi_cost_limit (and VacuumCostDelayLimit) might be
updated. After the autovacuum launcher reloads the config file, it
calls autovac_balance_cost() that updates that value of active
workers. I'm not sure why we don't update workers' wi_cost_delay,
though.Ah yes, I didn't realize this. Thanks. I went back and did more code
reading/analysis, and I see no reason why we shouldn't update
worker->wi_cost_delay to the new value of autovacuum_vac_cost_delay in
autovac_balance_cost(). Then, as you said, the autovac launcher will
call autovac_balance_cost() when it reloads the configuration file.
Then, the next time the autovac worker calls AutoVacuumUpdateDelay(), it
will update VacuumCostDelay.I started writing a little helper that could be used to update these
workerinfo->wi_cost_delay/limit in vacuum_delay_point(),Since we set vacuum delay parameters for autovacuum workers so that we
ration out I/O equally, I think we should keep the current mechanism
that the autovacuum launcher sets workers' delay parameters and they
update accordingly.Yes, agreed, it should go in the same place as where we update
wi_cost_limit (autovac_balance_cost()). I think we should potentially
rename autovac_balance_cost() because its name and all its comments
point to its only purpose being to balance the total of the workers
wi_cost_limits to no more than autovacuum_vacuum_cost_limit. And the
autovacuum_vacuum_cost_delay doesn't need to be balanced in this way.Though, since this change on its own would make autovacuum pick up new
values of autovacuum_vacuum_cost_limit (without having the worker reload
the config file), I wonder if it makes sense to try and have
vacuum_delay_point() only reload the config file if it is an explicit
vacuum or an analyze not being run in an outer transaction (to avoid
overhead of reloading config file)?The lifecycle of this different vacuum delay-related gucs and how it
differs between autovacuum workers and explicit vacuum is quite tangled
already, though.So, I've attached a new version of the patch which is quite different
from the previous versions.
Thank you for updating the patch!
In this version I've removed wi_cost_delay from WorkerInfoData. There is
no synchronization of cost_delay amongst workers, so there is no reason
to keep it in shared memory.One consequence of not updating VacuumCostDelay from wi_cost_delay is
that we have to have a way to keep track of whether or not autovacuum
table options are in use.This patch does this in a cringeworthy way. I added two global
variables, one to track whether or not cost delay table options are in
use and the other to store the value of the table option cost delay. I
didn't want to use a single variable with a special value to indicate
that table option cost delay is in use because
autovacuum_vacuum_cost_delay already has special values that mean
certain things. My code needs a better solution.
While it's true that wi_cost_delay doesn't need to be shared, it seems
to make the logic somewhat complex. We need to handle cost_delay in a
different way from other vacuum-related parameters and we need to make
sure av[_use]_table_option_cost_delay are set properly. Removing
wi_cost_delay from WorkerInfoData saves 8 bytes shared memory per
autovacuum worker but it might be worth considering to keep
wi_cost_delay for simplicity.
---
void
AutoVacuumUpdateDelay(void)
{
- if (MyWorkerInfo)
+ /*
+ * We are using autovacuum-related GUCs to update
VacuumCostDelay, so we
+ * only want autovacuum workers and autovacuum launcher to do this.
+ */
+ if (!(am_autovacuum_worker || am_autovacuum_launcher))
+ return;
Is there any case where the autovacuum launcher calls
AutoVacuumUpdateDelay() function?
---
In at autovac_balance_cost(), we have,
int vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
autovacuum_vac_cost_limit : VacuumCostLimit);
double vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
autovacuum_vac_cost_delay : VacuumCostDelay);
:
/* not set? nothing to do */
if (vac_cost_limit <= 0 || vac_cost_delay <= 0)
return;
IIUC if autovacuum_vac_cost_delay is changed to 0 during autovacuums
running, their vacuum delay parameters are not changed. It's not a bug
of the patch but I think we can fix it in this patch.
It is worth mentioning that I think that in master,
AutoVacuumUpdateDelay() was incorrectly reading wi_cost_limit and
wi_cost_delay from shared memory without holding a lock.
Indeed.
I've added in a shared lock for reading from wi_cost_limit in this
patch. However, AutoVacuumUpdateLimit() is called unconditionally in
vacuum_delay_point(), which is called quite often (per block-ish), so I
was trying to think if there is a way we could avoid having to check
this shared memory variable on every call to vacuum_delay_point().
Rebalances shouldn't happen very often (done by the launcher when a new
worker is launched and by workers between vacuuming tables). Maybe we
can read from it less frequently?
Yeah, acquiring the lwlock for every call to vacuum_delay_point()
seems to be harmful. One idea would be to have one sig_atomic_t
variable in WorkerInfoData and autovac_balance_cost() set it to true
after rebalancing the worker's cost-limit. The worker can check it
without locking and update its delay parameters if the flag is true.
Also not sure how the patch interacts with failsafe autovac and parallel
vacuum.
Good point.
When entering the failsafe mode, we disable the vacuum delays (see
lazy_check_wraparound_failsafe()). We need to keep disabling the
vacuum delays even after reloading the config file. One idea is to
have another global variable indicating we're in the failsafe mode.
vacuum_delay_point() doesn't update VacuumCostActive if the flag is
true.
As far as I can see we don't need special treatments for parallel
vacuum cases since it works only in manual vacuum. It calculates the
sleep time based on the shared cost balance and how much the worker
did I/O but the basic mechanism is the same as non-parallel case.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On 3/2/23 1:36 AM, Masahiko Sawada wrote:
For example, I guess we will need to take care of changes of
maintenance_work_mem. Currently we initialize the dead tuple space at
the beginning of lazy vacuum, but perhaps we would need to
enlarge/shrink it based on the new value?
Doesn't the dead tuple space grow as needed? Last I looked we don't
allocate up to 1GB right off the bat.
I don't think we need to do anything about that initially, just because the
config can be changed in a more granular way, doesn't mean we have to react to
every change for the current operation.Perhaps we can mention in the docs that a change to maintenance_work_mem
will not take effect in the middle of vacuuming a table. But, Ithink it probably
isn't needed.Agreed.
I disagree that there's no need for this. Sure, if
maintenance_work_memory is 10MB then it's no big deal to just abandon
your current vacuum and start a new one, but the index vacuuming phase
with maintenance_work_mem set to say 500MB can take quite a while.
Forcing a user to either suck it up or throw everything in the phase
away isn't terribly good.
Of course, if the patch that eliminates the 1GB vacuum limit gets
committed the situation will be even worse.
While it'd be nice to also honor maintenance_work_mem getting set lower,
I don't see any need to go through heroics to accomplish that. Simply
recording the change and honoring it for future attempts to grow the
memory and on future passes through the heap would be plenty.
All that said, don't let these suggestions get in the way of committing
this. Just having the ability to tweak cost parameters would be a win.
Hi,
On 2023-03-08 11:42:31 -0600, Jim Nasby wrote:
On 3/2/23 1:36 AM, Masahiko Sawada wrote:
For example, I guess we will need to take care of changes of
maintenance_work_mem. Currently we initialize the dead tuple space at
the beginning of lazy vacuum, but perhaps we would need to
enlarge/shrink it based on the new value?Doesn't the dead tuple space grow as needed? Last I looked we don't allocate
up to 1GB right off the bat.I don't think we need to do anything about that initially, just because the
config can be changed in a more granular way, doesn't mean we have to react to
every change for the current operation.Perhaps we can mention in the docs that a change to maintenance_work_mem
will not take effect in the middle of vacuuming a table. But, Ithink it probably
isn't needed.Agreed.
I disagree that there's no need for this. Sure, if maintenance_work_memory
is 10MB then it's no big deal to just abandon your current vacuum and start
a new one, but the index vacuuming phase with maintenance_work_mem set to
say 500MB can take quite a while. Forcing a user to either suck it up or
throw everything in the phase away isn't terribly good.Of course, if the patch that eliminates the 1GB vacuum limit gets committed
the situation will be even worse.While it'd be nice to also honor maintenance_work_mem getting set lower, I
don't see any need to go through heroics to accomplish that. Simply
recording the change and honoring it for future attempts to grow the memory
and on future passes through the heap would be plenty.All that said, don't let these suggestions get in the way of committing
this. Just having the ability to tweak cost parameters would be a win.
Nobody said anything about it not being useful to react to m_w_m changes, just
that it's not required to make some progress . So I really don't understand
what the point of your comment is.
Greetings,
Andres Freund
On Thu, Mar 9, 2023 at 12:42 AM Jim Nasby <nasbyj@amazon.com> wrote:
Doesn't the dead tuple space grow as needed? Last I looked we don't
allocate up to 1GB right off the bat.
Incorrect.
Of course, if the patch that eliminates the 1GB vacuum limit gets
committed the situation will be even worse.
If you're referring to the proposed tid store, I'd be interested in seeing
a reproducible test case with a m_w_m over 1GB where it makes things worse
than the current state of affairs.
--
John Naylor
EDB: http://www.enterprisedb.com
On Thu, Mar 9, 2023 at 4:47 PM John Naylor <john.naylor@enterprisedb.com> wrote:
On Thu, Mar 9, 2023 at 12:42 AM Jim Nasby <nasbyj@amazon.com> wrote:
Doesn't the dead tuple space grow as needed? Last I looked we don't allocate up to 1GB right off the bat.
Incorrect.
Of course, if the patch that eliminates the 1GB vacuum limit gets committed the situation will be even worse.
If you're referring to the proposed tid store, I'd be interested in seeing a reproducible test case with a m_w_m over 1GB where it makes things worse than the current state of affairs.
And I think that the tidstore makes it easy to react to
maintenance_work_mem changes. We don't need to enlarge it and just
update its memory limit at an appropriate time.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Tue, Mar 7, 2023 at 12:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Mar 6, 2023 at 5:26 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:On Thu, Mar 2, 2023 at 6:37 PM Melanie Plageman
In this version I've removed wi_cost_delay from WorkerInfoData. There is
no synchronization of cost_delay amongst workers, so there is no reason
to keep it in shared memory.One consequence of not updating VacuumCostDelay from wi_cost_delay is
that we have to have a way to keep track of whether or not autovacuum
table options are in use.This patch does this in a cringeworthy way. I added two global
variables, one to track whether or not cost delay table options are in
use and the other to store the value of the table option cost delay. I
didn't want to use a single variable with a special value to indicate
that table option cost delay is in use because
autovacuum_vacuum_cost_delay already has special values that mean
certain things. My code needs a better solution.While it's true that wi_cost_delay doesn't need to be shared, it seems
to make the logic somewhat complex. We need to handle cost_delay in a
different way from other vacuum-related parameters and we need to make
sure av[_use]_table_option_cost_delay are set properly. Removing
wi_cost_delay from WorkerInfoData saves 8 bytes shared memory per
autovacuum worker but it might be worth considering to keep
wi_cost_delay for simplicity.
Ah, it turns out we can't really remove wi_cost_delay from WorkerInfo
anyway because the launcher doesn't know anything about table options
and so the workers have to keep an updated wi_cost_delay that the
launcher or other autovac workers who are not vacuuming that table can
read from when calculating the new limit in autovac_balance_cost().
However, wi_cost_delay is a double, so if we start updating it on config
reload in vacuum_delay_point(), we definitely need some protection
against torn reads.
The table options can only change when workers start vacuuming a new
table, so maybe there is some way to use this to solve this problem?
It is worth mentioning that I think that in master,
AutoVacuumUpdateDelay() was incorrectly reading wi_cost_limit and
wi_cost_delay from shared memory without holding a lock.Indeed.
I've added in a shared lock for reading from wi_cost_limit in this
patch. However, AutoVacuumUpdateLimit() is called unconditionally in
vacuum_delay_point(), which is called quite often (per block-ish), so I
was trying to think if there is a way we could avoid having to check
this shared memory variable on every call to vacuum_delay_point().
Rebalances shouldn't happen very often (done by the launcher when a new
worker is launched and by workers between vacuuming tables). Maybe we
can read from it less frequently?Yeah, acquiring the lwlock for every call to vacuum_delay_point()
seems to be harmful. One idea would be to have one sig_atomic_t
variable in WorkerInfoData and autovac_balance_cost() set it to true
after rebalancing the worker's cost-limit. The worker can check it
without locking and update its delay parameters if the flag is true.
Maybe we can do something like this with the table options values?
- Melanie
On Fri, Mar 10, 2023 at 11:23 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
On Tue, Mar 7, 2023 at 12:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Mar 6, 2023 at 5:26 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:On Thu, Mar 2, 2023 at 6:37 PM Melanie Plageman
In this version I've removed wi_cost_delay from WorkerInfoData. There is
no synchronization of cost_delay amongst workers, so there is no reason
to keep it in shared memory.One consequence of not updating VacuumCostDelay from wi_cost_delay is
that we have to have a way to keep track of whether or not autovacuum
table options are in use.This patch does this in a cringeworthy way. I added two global
variables, one to track whether or not cost delay table options are in
use and the other to store the value of the table option cost delay. I
didn't want to use a single variable with a special value to indicate
that table option cost delay is in use because
autovacuum_vacuum_cost_delay already has special values that mean
certain things. My code needs a better solution.While it's true that wi_cost_delay doesn't need to be shared, it seems
to make the logic somewhat complex. We need to handle cost_delay in a
different way from other vacuum-related parameters and we need to make
sure av[_use]_table_option_cost_delay are set properly. Removing
wi_cost_delay from WorkerInfoData saves 8 bytes shared memory per
autovacuum worker but it might be worth considering to keep
wi_cost_delay for simplicity.Ah, it turns out we can't really remove wi_cost_delay from WorkerInfo
anyway because the launcher doesn't know anything about table options
and so the workers have to keep an updated wi_cost_delay that the
launcher or other autovac workers who are not vacuuming that table can
read from when calculating the new limit in autovac_balance_cost().
IIUC if any of the cost delay parameters has been set individually,
the autovacuum worker is excluded from the balance algorithm.
However, wi_cost_delay is a double, so if we start updating it on config
reload in vacuum_delay_point(), we definitely need some protection
against torn reads.The table options can only change when workers start vacuuming a new
table, so maybe there is some way to use this to solve this problem?It is worth mentioning that I think that in master,
AutoVacuumUpdateDelay() was incorrectly reading wi_cost_limit and
wi_cost_delay from shared memory without holding a lock.Indeed.
I've added in a shared lock for reading from wi_cost_limit in this
patch. However, AutoVacuumUpdateLimit() is called unconditionally in
vacuum_delay_point(), which is called quite often (per block-ish), so I
was trying to think if there is a way we could avoid having to check
this shared memory variable on every call to vacuum_delay_point().
Rebalances shouldn't happen very often (done by the launcher when a new
worker is launched and by workers between vacuuming tables). Maybe we
can read from it less frequently?Yeah, acquiring the lwlock for every call to vacuum_delay_point()
seems to be harmful. One idea would be to have one sig_atomic_t
variable in WorkerInfoData and autovac_balance_cost() set it to true
after rebalancing the worker's cost-limit. The worker can check it
without locking and update its delay parameters if the flag is true.Maybe we can do something like this with the table options values?
Since an autovacuum that uses any of table option cost delay
parameters is excluded from the balancing algorithm, the launcher
doesn't need to notify such workers of changes of the cost-limit, no?
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Quotes below are combined from two of Sawada-san's emails.
I've also attached a patch with my suggested current version.
On Thu, Mar 9, 2023 at 10:27 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Mar 10, 2023 at 11:23 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:On Tue, Mar 7, 2023 at 12:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Mar 6, 2023 at 5:26 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:On Thu, Mar 2, 2023 at 6:37 PM Melanie Plageman
In this version I've removed wi_cost_delay from WorkerInfoData. There is
no synchronization of cost_delay amongst workers, so there is no reason
to keep it in shared memory.One consequence of not updating VacuumCostDelay from wi_cost_delay is
that we have to have a way to keep track of whether or not autovacuum
table options are in use.This patch does this in a cringeworthy way. I added two global
variables, one to track whether or not cost delay table options are in
use and the other to store the value of the table option cost delay. I
didn't want to use a single variable with a special value to indicate
that table option cost delay is in use because
autovacuum_vacuum_cost_delay already has special values that mean
certain things. My code needs a better solution.While it's true that wi_cost_delay doesn't need to be shared, it seems
to make the logic somewhat complex. We need to handle cost_delay in a
different way from other vacuum-related parameters and we need to make
sure av[_use]_table_option_cost_delay are set properly. Removing
wi_cost_delay from WorkerInfoData saves 8 bytes shared memory per
autovacuum worker but it might be worth considering to keep
wi_cost_delay for simplicity.Ah, it turns out we can't really remove wi_cost_delay from WorkerInfo
anyway because the launcher doesn't know anything about table options
and so the workers have to keep an updated wi_cost_delay that the
launcher or other autovac workers who are not vacuuming that table can
read from when calculating the new limit in autovac_balance_cost().IIUC if any of the cost delay parameters has been set individually,
the autovacuum worker is excluded from the balance algorithm.
Ah, yes! That's right. So it is not a problem. Then I still think
removing wi_cost_delay from the worker info makes sense. wi_cost_delay
is a double and can't easily be accessed atomically the way
wi_cost_limit can be.
Keeping the cost delay local to the backends also makes it clear that
cost delay is not something that should be written to by other backends
or that can differ from worker to worker. Without table options in the
picture, the cost delay should be the same for any worker who has
reloaded the config file.
As for the cost limit safe access issue, maybe we can avoid a LWLock
acquisition for reading wi_cost_limit by using an atomic similar to what
you suggested here for "did_rebalance".
I've added in a shared lock for reading from wi_cost_limit in this
patch. However, AutoVacuumUpdateLimit() is called unconditionally in
vacuum_delay_point(), which is called quite often (per block-ish), so I
was trying to think if there is a way we could avoid having to check
this shared memory variable on every call to vacuum_delay_point().
Rebalances shouldn't happen very often (done by the launcher when a new
worker is launched and by workers between vacuuming tables). Maybe we
can read from it less frequently?Yeah, acquiring the lwlock for every call to vacuum_delay_point()
seems to be harmful. One idea would be to have one sig_atomic_t
variable in WorkerInfoData and autovac_balance_cost() set it to true
after rebalancing the worker's cost-limit. The worker can check it
without locking and update its delay parameters if the flag is true.
Instead of having the atomic indicate whether or not someone (launcher
or another worker) did a rebalance, it would simply store the current
cost limit. Then the worker can normally access it with a simple read.
My rationale is that if we used an atomic to indicate whether or not we
did a rebalance ("did_rebalance"), we would have the same cache
coherency guarantees as if we just used the atomic for the cost limit.
If we read from the "did_rebalance" variable and missed someone having
written to it on another core, we still wouldn't get around to checking
the wi_cost_limit variable in shared memory, so it doesn't matter that
we bothered to keep it in shared memory and use a lock to access it.
I noticed we don't allow wi_cost_limit to ever be less than 0, so we
could store wi_cost_limit in an atomic uint32.
I'm not sure if it is okay to do pg_atomic_read_u32() and
pg_atomic_unlocked_write_u32() or if we need pg_atomic_write_u32() in
most cases.
I've implemented the atomic cost limit in the attached patch. Though,
I'm pretty unsure about how I initialized the atomics in
AutoVacuumShmemInit()...
If the consensus is that it is simply too confusing to take
wi_cost_delay out of WorkerInfo, we might be able to afford using a
shared lock to access it because we won't call AutoVacuumUpdateDelay()
on every invocation of vacuum_delay_point() -- only when we've reloaded
the config file.
One potential option to avoid taking a shared lock on every call to
AutoVacuumUpdateDelay() is to set a global variable to indicate that we
did update it (since we are the only ones updating it) and then only
take the shared LWLock in AutoVacuumUpdateDelay() if that flag is true.
--- void AutoVacuumUpdateDelay(void) { - if (MyWorkerInfo) + /* + * We are using autovacuum-related GUCs to update VacuumCostDelay, so we + * only want autovacuum workers and autovacuum launcher to do this. + */ + if (!(am_autovacuum_worker || am_autovacuum_launcher)) + return;Is there any case where the autovacuum launcher calls
AutoVacuumUpdateDelay() function?
I had meant to add it to HandleAutoVacLauncherInterrupts() after
reloading the config file (done in attached patch). When using the
global variables for cost delay (instead of wi_cost_delay in worker
info), the autovac launcher also has to do the check in the else branch
of AutoVacuumUpdateDelay()
VacuumCostDelay = autovacuum_vac_cost_delay >= 0 ?
autovacuum_vac_cost_delay : VacuumCostDelay;
to make sure VacuumCostDelay is correct for when it calls
autovac_balance_cost().
This also made me think about whether or not we still need cost_limit_base.
It is used to ensure that autovac_balance_cost() never ends up setting
workers' wi_cost_limits above the current autovacuum_vacuum_cost_limit
(or VacuumCostLimit). However, the launcher and all the workers should
know what the value is without cost_limit_base, no?
---
In at autovac_balance_cost(), we have,int vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
autovacuum_vac_cost_limit : VacuumCostLimit);
double vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
autovacuum_vac_cost_delay : VacuumCostDelay);
:
/* not set? nothing to do */
if (vac_cost_limit <= 0 || vac_cost_delay <= 0)
return;IIUC if autovacuum_vac_cost_delay is changed to 0 during autovacuums
running, their vacuum delay parameters are not changed. It's not a bug
of the patch but I think we can fix it in this patch.
Yes, currently (in master) wi_cost_delay does not get updated anywhere.
In my patch, the global variable we are using for delay is updated but
it is not done in autovac_balance_cost().
Also not sure how the patch interacts with failsafe autovac and parallel
vacuum.Good point.
When entering the failsafe mode, we disable the vacuum delays (see
lazy_check_wraparound_failsafe()). We need to keep disabling the
vacuum delays even after reloading the config file. One idea is to
have another global variable indicating we're in the failsafe mode.
vacuum_delay_point() doesn't update VacuumCostActive if the flag is
true.
I think we might not need to do this. Other than in
lazy_check_wraparound_failsafe(), VacuumCostActive is only updated in
two places:
1) in vacuum() which autovacuum will call per table. And failsafe is
reset per table as well.
2) in vacuum_delay_point(), but, since VacuumCostActive will already be
false when we enter vacuum_delay_point() the next time after
lazy_check_wraparound_failsafe(), we won't set VacuumCostActive there.
Thanks again for the detailed feedback!
- Melanie