Isn't it better with "autovacuum worker...." instead of "worker took too long to start; canceled" specific to "auto
Hi,
Is there a specific reason that we have a generic WARNING "worker took
too long to start; canceled" for an autovacuum worker? Isn't it better
with "autovacuum worker took too long to start; canceled"? It is
confusing to see the generic message in the server logs while
debugging an issue for a user who doesn't know the internals of
autovacuum code.
To be more informative about the message, how about the following:
1) ereport(WARNING,
(errmsg( "worker took too long to start"),
errdetail("Previous attempt to start autovacuum
worker was failed, canceled.")));
or
2) ereport(WARNING,
(errmsg( "worker took too long to start, canceled"),
errdetail("The postmaster couldn't start an
autovacuum worker.")));
or
3) ereport(WARNING,
(errmsg( "worker took too long to start, canceled"),
errdetail("Previous attempt to start autovacuum
worker was failed.")));
or
4) elog(WARNING, "postmaster couldn't start an autovacuum worker");
Thoughts?
Regards,
Bharath Rupireddy.
On 10/27/21, 9:29 AM, "Bharath Rupireddy" <bharath.rupireddyforpostgres@gmail.com> wrote:
Is there a specific reason that we have a generic WARNING "worker took
too long to start; canceled" for an autovacuum worker? Isn't it better
with "autovacuum worker took too long to start; canceled"? It is
confusing to see the generic message in the server logs while
debugging an issue for a user who doesn't know the internals of
autovacuum code.
It looks like it has been this way for a while [0]https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=bae0b56. I don't know if
I've ever seen this message before, and from the comments near it, it
sounds like it is expected to rarely happen.
To be more informative about the message, how about the following:
My vote is to just change it to
ereport(WARNING,
(errmsg("autovacuum worker took too long to start; canceled")));
and call it a day. If we wanted to add errdetail(), I think we should
make sure it is providing useful context, but I'm not sure what that
might look like.
Nathan
[0]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=bae0b56
On Wed, Oct 27, 2021 at 07:05:10PM +0000, Bossart, Nathan wrote:
On 10/27/21, 9:29 AM, "Bharath Rupireddy" <bharath.rupireddyforpostgres@gmail.com> wrote:
Is there a specific reason that we have a generic WARNING "worker took
too long to start; canceled" for an autovacuum worker? Isn't it better
with "autovacuum worker took too long to start; canceled"? It is
confusing to see the generic message in the server logs while
debugging an issue for a user who doesn't know the internals of
autovacuum code.It looks like it has been this way for a while [0]. I don't know if
I've ever seen this message before, and from the comments near it, it
sounds like it is expected to rarely happen.
I was surprised to see that I have only two logs for this in the last 8 weeks.
To be more informative about the message, how about the following:
My vote is to just change it to
ereport(WARNING,
(errmsg("autovacuum worker took too long to start; canceled")));and call it a day. If we wanted to add errdetail(), I think we should
make sure it is providing useful context, but I'm not sure what that
might look like.
I think that's fine.
Note that the backend_type is illuminating for those who use CSV logs, or use
P13+ and log_line_prefix += %b (see 70a7b4776).
session_line | 1
error_severity | WARNING
message | worker took too long to start; canceled
backend_type | autovacuum launcher
--
Justin
At Wed, 27 Oct 2021 14:26:11 -0500, Justin Pryzby <pryzby@telsasoft.com> wrote in
On Wed, Oct 27, 2021 at 07:05:10PM +0000, Bossart, Nathan wrote:
My vote is to just change it to
ereport(WARNING,
(errmsg("autovacuum worker took too long to start; canceled")));and call it a day. If we wanted to add errdetail(), I think we should
make sure it is providing useful context, but I'm not sure what that
might look like.I think that's fine.
+1
Note that the backend_type is illuminating for those who use CSV logs, or use
P13+ and log_line_prefix += %b (see 70a7b4776).session_line | 1
error_severity | WARNING
message | worker took too long to start; canceled
backend_type | autovacuum launcher
Yeah, the additional "autovacuum" is not noisy at all even in that
context. Some other messages are prefixed with "autovacuum".
"could not fork autovacuum worker process: %m"
"autovacuum worker started without a worker entry"
By a quick look all occurances of "laucher" are prefixed with
"autovacuum" or "logical replcaion", which seems fine.
As a related topic, autovacuum.c has another use of bare "worker"s.
tmpcxt = AllocSetContextCreate(CurrentMemoryContext,
"Start worker tmp cxt",
ALLOCSET_DEFAULT_SIZES);
AutovacMemCxt = AllocSetContextCreate(TopMemoryContext,
"AV worker",
ALLOCSET_DEFAULT_SIZES);
I'm not sure the former needs to be fixed, but the latter is actually
visible to users via pg_log_backend_memory_contexts().
LOG: level: 1; AV worker: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Thu, Oct 28, 2021 at 7:11 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Wed, 27 Oct 2021 14:26:11 -0500, Justin Pryzby <pryzby@telsasoft.com> wrote in
On Wed, Oct 27, 2021 at 07:05:10PM +0000, Bossart, Nathan wrote:
My vote is to just change it to
ereport(WARNING,
(errmsg("autovacuum worker took too long to start; canceled")));and call it a day. If we wanted to add errdetail(), I think we should
make sure it is providing useful context, but I'm not sure what that
might look like.I think that's fine.
+1
Done.
Note that the backend_type is illuminating for those who use CSV logs, or use
P13+ and log_line_prefix += %b (see 70a7b4776).session_line | 1
error_severity | WARNING
message | worker took too long to start; canceled
backend_type | autovacuum launcherYeah, the additional "autovacuum" is not noisy at all even in that
context. Some other messages are prefixed with "autovacuum"."could not fork autovacuum worker process: %m"
"autovacuum worker started without a worker entry"By a quick look all occurances of "laucher" are prefixed with
"autovacuum" or "logical replcaion", which seems fine.As a related topic, autovacuum.c has another use of bare "worker"s.
tmpcxt = AllocSetContextCreate(CurrentMemoryContext,
"Start worker tmp cxt",
ALLOCSET_DEFAULT_SIZES);AutovacMemCxt = AllocSetContextCreate(TopMemoryContext,
"AV worker",
ALLOCSET_DEFAULT_SIZES);I'm not sure the former needs to be fixed, but the latter is actually
visible to users via pg_log_backend_memory_contexts().LOG: level: 1; AV worker: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
Good catch. I've seen the use of "AV" in some of the mem context
names, why that? Let's be specific and say "Autovacuum". Attached
patch does that. Please review it.
Regards,
Bharath Rupireddy.
Attachments:
v1-0001-autovacuum-be-specific-about-error-message-and-me.patchapplication/octet-stream; name=v1-0001-autovacuum-be-specific-about-error-message-and-me.patchDownload
From 640fd1b54359977c20aff760b8ded49f9722870c Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Thu, 28 Oct 2021 02:38:30 +0000
Subject: [PATCH v1] autovacuum - be specific about error message and mem
context names
---
src/backend/postmaster/autovacuum.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 96332320a7..3f877b669d 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -754,7 +754,7 @@ AutoVacLauncherMain(int argc, char *argv[])
dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
&worker->wi_links);
AutoVacuumShmem->av_startingWorker = NULL;
- elog(WARNING, "worker took too long to start; canceled");
+ elog(WARNING, "autovacuum worker took too long to start; canceled");
}
}
else
@@ -966,10 +966,10 @@ rebuild_database_list(Oid newdb)
autovac_refresh_stats();
newcxt = AllocSetContextCreate(AutovacMemCxt,
- "AV dblist",
+ "Autovacuum dblist",
ALLOCSET_DEFAULT_SIZES);
tmpcxt = AllocSetContextCreate(newcxt,
- "tmp AV dblist",
+ "Autovacuum tmp dblist",
ALLOCSET_DEFAULT_SIZES);
oldcxt = MemoryContextSwitchTo(tmpcxt);
@@ -1179,7 +1179,7 @@ do_start_worker(void)
* allocated for the database list.
*/
tmpcxt = AllocSetContextCreate(CurrentMemoryContext,
- "Start worker tmp cxt",
+ "Autovacuum start worker tmp cxt",
ALLOCSET_DEFAULT_SIZES);
oldcxt = MemoryContextSwitchTo(tmpcxt);
@@ -1981,7 +1981,7 @@ do_autovacuum(void)
* relations to vacuum/analyze across transactions.
*/
AutovacMemCxt = AllocSetContextCreate(TopMemoryContext,
- "AV worker",
+ "Autovacuum worker",
ALLOCSET_DEFAULT_SIZES);
MemoryContextSwitchTo(AutovacMemCxt);
--
2.25.1
On Thu, Oct 28, 2021 at 8:14 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
LOG: level: 1; AV worker: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
Good catch. I've seen the use of "AV" in some of the mem context
names, why that? Let's be specific and say "Autovacuum". Attached
patch does that. Please review it.
+1, the error message and other improvements look good.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Thu, Oct 28, 2021 at 11:44 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Thu, Oct 28, 2021 at 7:11 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:At Wed, 27 Oct 2021 14:26:11 -0500, Justin Pryzby <pryzby@telsasoft.com> wrote in
On Wed, Oct 27, 2021 at 07:05:10PM +0000, Bossart, Nathan wrote:
My vote is to just change it to
ereport(WARNING,
(errmsg("autovacuum worker took too long to start; canceled")));and call it a day. If we wanted to add errdetail(), I think we should
make sure it is providing useful context, but I'm not sure what that
might look like.I think that's fine.
+1
Done.
Note that the backend_type is illuminating for those who use CSV logs, or use
P13+ and log_line_prefix += %b (see 70a7b4776).session_line | 1
error_severity | WARNING
message | worker took too long to start; canceled
backend_type | autovacuum launcherYeah, the additional "autovacuum" is not noisy at all even in that
context. Some other messages are prefixed with "autovacuum"."could not fork autovacuum worker process: %m"
"autovacuum worker started without a worker entry"By a quick look all occurances of "laucher" are prefixed with
"autovacuum" or "logical replcaion", which seems fine.As a related topic, autovacuum.c has another use of bare "worker"s.
tmpcxt = AllocSetContextCreate(CurrentMemoryContext,
"Start worker tmp cxt",
ALLOCSET_DEFAULT_SIZES);AutovacMemCxt = AllocSetContextCreate(TopMemoryContext,
"AV worker",
ALLOCSET_DEFAULT_SIZES);I'm not sure the former needs to be fixed, but the latter is actually
visible to users via pg_log_backend_memory_contexts().LOG: level: 1; AV worker: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
Good catch. I've seen the use of "AV" in some of the mem context
names, why that? Let's be specific and say "Autovacuum". Attached
patch does that. Please review it.
+1. The patch looks good to me too.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Oct 28, 2021 at 12:41 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Oct 28, 2021 at 11:44 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:On Thu, Oct 28, 2021 at 7:11 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:At Wed, 27 Oct 2021 14:26:11 -0500, Justin Pryzby <pryzby@telsasoft.com> wrote in
On Wed, Oct 27, 2021 at 07:05:10PM +0000, Bossart, Nathan wrote:
My vote is to just change it to
ereport(WARNING,
(errmsg("autovacuum worker took too long to start; canceled")));and call it a day. If we wanted to add errdetail(), I think we should
make sure it is providing useful context, but I'm not sure what that
might look like.I think that's fine.
+1
Done.
Note that the backend_type is illuminating for those who use CSV logs, or use
P13+ and log_line_prefix += %b (see 70a7b4776).session_line | 1
error_severity | WARNING
message | worker took too long to start; canceled
backend_type | autovacuum launcherYeah, the additional "autovacuum" is not noisy at all even in that
context. Some other messages are prefixed with "autovacuum"."could not fork autovacuum worker process: %m"
"autovacuum worker started without a worker entry"By a quick look all occurances of "laucher" are prefixed with
"autovacuum" or "logical replcaion", which seems fine.As a related topic, autovacuum.c has another use of bare "worker"s.
tmpcxt = AllocSetContextCreate(CurrentMemoryContext,
"Start worker tmp cxt",
ALLOCSET_DEFAULT_SIZES);AutovacMemCxt = AllocSetContextCreate(TopMemoryContext,
"AV worker",
ALLOCSET_DEFAULT_SIZES);I'm not sure the former needs to be fixed, but the latter is actually
visible to users via pg_log_backend_memory_contexts().LOG: level: 1; AV worker: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
Good catch. I've seen the use of "AV" in some of the mem context
names, why that? Let's be specific and say "Autovacuum". Attached
patch does that. Please review it.+1. The patch looks good to me too.
Thanks all for reviewing this. Here's the CF entry -
https://commitfest.postgresql.org/35/3378/
Regards,
Bharath Rupireddy.
On 10/28/21, 5:42 AM, "Bharath Rupireddy" <bharath.rupireddyforpostgres@gmail.com> wrote:
Thanks all for reviewing this. Here's the CF entry -
https://commitfest.postgresql.org/35/3378/
I've marked this one as ready-for-committer.
Nathan
On 2021-Oct-28, Bharath Rupireddy wrote:
Thanks all for reviewing this. Here's the CF entry -
https://commitfest.postgresql.org/35/3378/
Thanks, pushed. I changed a couple of things though -- notably changed
the elog() to ereport() as suggested by Nathan early on, but never
materialized in the submitted patch. I also changed the wording of the
context names, since the proposed ones weren't much more satisfactory
than the existing ones.
--
Álvaro Herrera 39°49'30"S 73°17'W — https://www.EnterpriseDB.com/
"Every machine is a smoke machine if you operate it wrong enough."
https://twitter.com/libseybieda/status/1541673325781196801