Reducing the log spam

Started by Laurenz Albeabout 2 years ago19 messageshackers
Jump to latest
#1Laurenz Albe
laurenz.albe@cybertec.at

Inspired by feedback to [1]/messages/by-id/b8b8502915e50f44deb111bc0b43a99e2733e117.camel@cybertec.at, I thought about how to reduce log spam.

My experience from the field is that a lot of log spam looks like

database/table/... "xy" does not exist
duplicate key value violates unique constraint "xy"

So what about a parameter "log_suppress_sqlstates" that suppresses
logging ERROR and FATAL messages with the enumerated SQL states?

My idea for a default setting would be something like

log_suppress_sqlstates = '23505,3D000,3F000,42601,42704,42883,42P01'

but that's of course bikeshedding territory.

Yours,
Laurenz Albe

[1]: /messages/by-id/b8b8502915e50f44deb111bc0b43a99e2733e117.camel@cybertec.at

#2Pavel Stehule
pavel.stehule@gmail.com
In reply to: Laurenz Albe (#1)
Re: Reducing the log spam

Hi

út 5. 3. 2024 v 13:55 odesílatel Laurenz Albe <laurenz.albe@cybertec.at>
napsal:

Inspired by feedback to [1], I thought about how to reduce log spam.

My experience from the field is that a lot of log spam looks like

database/table/... "xy" does not exist
duplicate key value violates unique constraint "xy"

So what about a parameter "log_suppress_sqlstates" that suppresses
logging ERROR and FATAL messages with the enumerated SQL states?

My idea for a default setting would be something like

log_suppress_sqlstates = '23505,3D000,3F000,42601,42704,42883,42P01'

+1 in this form

the overhead of this implementation should be small

Regards

Pavel

Show quoted text

but that's of course bikeshedding territory.

Yours,
Laurenz Albe

[1]:
/messages/by-id/b8b8502915e50f44deb111bc0b43a99e2733e117.camel@cybertec.at

#3Jim Jones
jim.jones@uni-muenster.de
In reply to: Laurenz Albe (#1)
Re: Reducing the log spam

Hi Laurenz

On 05.03.24 13:55, Laurenz Albe wrote:

Inspired by feedback to [1], I thought about how to reduce log spam.

My experience from the field is that a lot of log spam looks like

database/table/... "xy" does not exist
duplicate key value violates unique constraint "xy"

So what about a parameter "log_suppress_sqlstates" that suppresses
logging ERROR and FATAL messages with the enumerated SQL states?

My idea for a default setting would be something like

log_suppress_sqlstates = '23505,3D000,3F000,42601,42704,42883,42P01'

but that's of course bikeshedding territory.

Yours,
Laurenz Albe

[1]: /messages/by-id/b8b8502915e50f44deb111bc0b43a99e2733e117.camel@cybertec.at

I like this idea, and I could see myself using it a lot in some projects.

Additionally, it would be nice to also have the possibility suppress a
whole class instead of single SQL states, e.g.

log_suppress_sqlstates = 'class_08' to suppress these all at once:

08000 connection_exception
08003 connection_does_not_exist
08006 connection_failure
08001 sqlclient_unable_to_establish_sqlconnection
08004 sqlserver_rejected_establishment_of_sqlconnection
08007 transaction_resolution_unknown
08P01 protocol_violation

Best regards,
Jim

#4Pavel Stehule
pavel.stehule@gmail.com
In reply to: Jim Jones (#3)
Re: Reducing the log spam

Hi

út 5. 3. 2024 v 14:55 odesílatel Jim Jones <jim.jones@uni-muenster.de>
napsal:

Hi Laurenz

On 05.03.24 13:55, Laurenz Albe wrote:

Inspired by feedback to [1], I thought about how to reduce log spam.

My experience from the field is that a lot of log spam looks like

database/table/... "xy" does not exist
duplicate key value violates unique constraint "xy"

So what about a parameter "log_suppress_sqlstates" that suppresses
logging ERROR and FATAL messages with the enumerated SQL states?

My idea for a default setting would be something like

log_suppress_sqlstates = '23505,3D000,3F000,42601,42704,42883,42P01'

but that's of course bikeshedding territory.

Yours,
Laurenz Albe

[1]:

/messages/by-id/b8b8502915e50f44deb111bc0b43a99e2733e117.camel@cybertec.at

I like this idea, and I could see myself using it a lot in some projects.

Additionally, it would be nice to also have the possibility suppress a
whole class instead of single SQL states, e.g.

log_suppress_sqlstates = 'class_08' to suppress these all at once:

08000 connection_exception
08003 connection_does_not_exist
08006 connection_failure
08001 sqlclient_unable_to_establish_sqlconnection
08004 sqlserver_rejected_establishment_of_sqlconnection
08007 transaction_resolution_unknown
08P01 protocol_violation

It can take code from PLpgSQL.

Regards

Pavel

Show quoted text

Best regards,
Jim

#5Jelte Fennema-Nio
postgres@jeltef.nl
In reply to: Jim Jones (#3)
Re: Reducing the log spam

On Tue, 5 Mar 2024 at 14:55, Jim Jones <jim.jones@uni-muenster.de> wrote:

So what about a parameter "log_suppress_sqlstates" that suppresses
logging ERROR and FATAL messages with the enumerated SQL states?

Big +1 from me for this idea.

#6Aleksander Alekseev
aleksander@timescale.com
In reply to: Laurenz Albe (#1)
Re: Reducing the log spam

Hi,

So what about a parameter "log_suppress_sqlstates" that suppresses
logging ERROR and FATAL messages with the enumerated SQL states?

My idea for a default setting would be something like

log_suppress_sqlstates = '23505,3D000,3F000,42601,42704,42883,42P01'

but that's of course bikeshedding territory.

I like the idea of suppressing certain log messages in general, but
the particular user interface doesn't strike me as an especially
convenient one.

Firstly I don't think many people remember sqlstates and what 3F000
stands for. IMO most users don't know such a thing exists. Secondly,
whether we should list sqlstates to suppress or the opposite - list
the states that shouldn't be suppressed, is a debatable question. Last
but not least, it's not quite clear whether PostgreSQL core is the
right place for implementing this functionality. For instance, one
could argue that the log message should just contain sqlstate and be
directed to |grep instead.

I suspect this could be one of "there is no one size fits all"
situations. The typical solution in such cases is to form a structure
containing the log message and its attributes and submit this
structure to a registered hook of a pluggable logging subsystem. This
would be the most flexible approach. It will allow not only filtering
the messages but also using binary logging, JSON logging, logging to
external systems like Loki instead of a given text file, etc.

I don't think we currently have this in the core, but maybe I just missed it.

--
Best regards,
Aleksander Alekseev

#7Greg Sabino Mullane
greg@turnstep.com
In reply to: Laurenz Albe (#1)
Re: Reducing the log spam

On Tue, Mar 5, 2024 at 7:55 AM Laurenz Albe <laurenz.albe@cybertec.at>
wrote:

My experience from the field is that a lot of log spam looks like

database/table/... "xy" does not exist
duplicate key value violates unique constraint "xy"

Forcibly hiding those at the Postgres level seems a heavy hammer for what
is ultimately an application problem.

Tell me about a system that logs different classes of errors to different
log files, and I'm interested again.

Cheers,
Greg

#8Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Aleksander Alekseev (#6)
Re: Reducing the log spam

On Wed, 2024-03-06 at 17:09 +0300, Aleksander Alekseev wrote:

I like the idea of suppressing certain log messages in general, but
the particular user interface doesn't strike me as an especially
convenient one.

Firstly I don't think many people remember sqlstates and what 3F000
stands for. IMO most users don't know such a thing exists. Secondly,
whether we should list sqlstates to suppress or the opposite - list
the states that shouldn't be suppressed, is a debatable question. Last
but not least, it's not quite clear whether PostgreSQL core is the
right place for implementing this functionality. For instance, one
could argue that the log message should just contain sqlstate and be
directed to |grep instead.

I suspect this could be one of "there is no one size fits all"
situations. The typical solution in such cases is to form a structure
containing the log message and its attributes and submit this
structure to a registered hook of a pluggable logging subsystem. This
would be the most flexible approach. It will allow not only filtering
the messages but also using binary logging, JSON logging, logging to
external systems like Loki instead of a given text file, etc.

I don't think we currently have this in the core, but maybe I just missed it.

The target would not primarily be installations where people configure
nifty logging software to filter logs (those people know how to deal
with log spam), but installations where people don't even know enough
to configure "shared_buffers". So I'd like something that is part of
core and reduces spam without the user needing to configure anything.

I am somewhat worried that people will come up with all kinds of
justified but complicated wishes for such a feature:

- an option to choose whether to include or to exclude certain errors
- be able to configure that certain errors be logged on FATAL, but
not on ERROR
- allow exception names in addition to SQL states
- have wildcards for exception names
- ...

I would like to write a simple patch that covers the basic functionality
I described, provided enough people find it useful. That does not
exclude the option for future extensions for this feature.

Yours,
Laurenz Albe

#9Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Greg Sabino Mullane (#7)
Re: Reducing the log spam

On Wed, 2024-03-06 at 10:50 -0500, Greg Sabino Mullane wrote:

On Tue, Mar 5, 2024 at 7:55 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:

My experience from the field is that a lot of log spam looks like

  database/table/... "xy" does not exist
  duplicate key value violates unique constraint "xy"

Forcibly hiding those at the Postgres level seems a heavy hammer for what is ultimately an application problem.

Yes... or no. Lots of applications violate constraints routinely.
As long as the error is caught and handled, that's not a problem.

Whoever cares about the log messages can enable them. My impression
is that most people don't care about them.

But thanks for your opinion.

Yours,
Laurenz Albe

#10Isaac Morland
isaac.morland@gmail.com
In reply to: Laurenz Albe (#1)
Re: Reducing the log spam

On Tue, 5 Mar 2024 at 07:55, Laurenz Albe <laurenz.albe@cybertec.at> wrote:

Inspired by feedback to [1], I thought about how to reduce log spam.

My experience from the field is that a lot of log spam looks like

database/table/... "xy" does not exist
duplicate key value violates unique constraint "xy"

So what about a parameter "log_suppress_sqlstates" that suppresses
logging ERROR and FATAL messages with the enumerated SQL states?

My idea for a default setting would be something like

log_suppress_sqlstates = '23505,3D000,3F000,42601,42704,42883,42P01'

but that's of course bikeshedding territory.

I like the basic idea and the way of specifying states seems likely to
cover a lot of typical use cases. Of course in principle the application
should be fixed, but in practice we can't always control that.

I have two questions about this:

First, can it be done per role? If I have a particular application which is
constantly throwing some particular error, I might want to suppress it, but
not suppress the same error occasionally coming from another application. I
see ALTER DATABASE name SET configuration_parameter … as being useful here,
but often multiple applications share a database.

Second, where can this setting be adjusted? Can any session turn off
logging of arbitrary sets of sqlstates resulting from its queries? It feels
to me like that might allow security problems to be hidden. Specifically,
the first thing an SQL injection might do would be to turn off logging of
important error states, then proceed to try various nefarious things.

It seems to me the above questions interact; an answer to the first might
be "ALTER ROLE role_specification SET configuration_parameter", but I think
that would allow roles to change their own settings, contrary to the
concern raised by the second question.

#11Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Isaac Morland (#10)
Re: Reducing the log spam

On Wed, 2024-03-06 at 17:33 -0500, Isaac Morland wrote:

I have two questions about this:

First, can it be done per role? If I have a particular application which is
constantly throwing some particular error, I might want to suppress it, but
not suppress the same error occasionally coming from another application.
I see ALTER DATABASE name SET configuration_parameter … as being useful here,
but often multiple applications share a database.

Second, where can this setting be adjusted? Can any session turn off logging
of arbitrary sets of sqlstates resulting from its queries? It feels to me
like that might allow security problems to be hidden. Specifically, the first
thing an SQL injection might do would be to turn off logging of important
error states, then proceed to try various nefarious things.

I was envisioning the parameter to be like other logging parameters, for
example "log_statement": only superusers can set the parameter or GRANT
that privilege to others. Also, a superuser could use ALTER ROLE to set
the parameter for all sessions by that role.

It seems to me the above questions interact; an answer to the first might be
"ALTER ROLE role_specification SET configuration_parameter", but I think that
would allow roles to change their own settings, contrary to the concern
raised by the second question.

If a superuser sets "log_statement" on a role, that role cannot undo or change
the setting. That's just how I plan to implement the new parameter.

Yours,
Laurenz Albe

#12Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Laurenz Albe (#11)
Re: Reducing the log spam

On Thu, 2024-03-07 at 08:30 +0100, Laurenz Albe wrote:

On Wed, 2024-03-06 at 17:33 -0500, Isaac Morland wrote:

I have two questions about this:

First, can it be done per role? If I have a particular application which is
constantly throwing some particular error, I might want to suppress it, but
not suppress the same error occasionally coming from another application.
I see ALTER DATABASE name SET configuration_parameter … as being useful here,
but often multiple applications share a database.

Second, where can this setting be adjusted? Can any session turn off logging
of arbitrary sets of sqlstates resulting from its queries? It feels to me
like that might allow security problems to be hidden. Specifically, the first
thing an SQL injection might do would be to turn off logging of important
error states, then proceed to try various nefarious things.

I was envisioning the parameter to be like other logging parameters, for
example "log_statement": only superusers can set the parameter or GRANT
that privilege to others. Also, a superuser could use ALTER ROLE to set
the parameter for all sessions by that role.

It seems to me the above questions interact; an answer to the first might be
"ALTER ROLE role_specification SET configuration_parameter", but I think that
would allow roles to change their own settings, contrary to the concern
raised by the second question.

If a superuser sets "log_statement" on a role, that role cannot undo or change
the setting. That's just how I plan to implement the new parameter.

Here is a patch that implements this.

I went with "log_suppress_errcodes", since the term "error code" is used
throughout our documentation.

The initial value is 23505,3D000,3F000,42601,42704,42883,42P01,57P03

Yours,
Laurenz Albe

Attachments:

v1-0001-Add-parameter-log_suppress_errcodes.patchtext/x-patch; charset=UTF-8; name=v1-0001-Add-parameter-log_suppress_errcodes.patchDownload+180-1
#13Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Laurenz Albe (#12)
Re: Reducing the log spam

On Sat, 2024-03-09 at 14:03 +0100, Laurenz Albe wrote:

Here is a patch that implements this.

And here is patch v2 that fixes a bug and passes the regression tests.

Yours,
Laurenz Albe

Attachments:

v2-0001-Add-parameter-log_suppress_errcodes.patchtext/x-patch; charset=UTF-8; name=v2-0001-Add-parameter-log_suppress_errcodes.patchDownload+187-2
#14Jelte Fennema-Nio
postgres@jeltef.nl
In reply to: Laurenz Albe (#13)
Re: Reducing the log spam
-   the subscriber's server log.
+   the subscriber's server log if you remove <literal>23505</literal> from
+   <xref linkend="guc-log-suppress-errcodes"/>.

This seems like a pretty big regression. Being able to know why your
replication got closed seems pretty critical.

Show quoted text

On Mon, 11 Mar 2024 at 03:44, Laurenz Albe <laurenz.albe@cybertec.at> wrote:

On Sat, 2024-03-09 at 14:03 +0100, Laurenz Albe wrote:

Here is a patch that implements this.

And here is patch v2 that fixes a bug and passes the regression tests.

Yours,
Laurenz Albe

#15Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Jelte Fennema-Nio (#14)
Re: Reducing the log spam

On Mon, 2024-03-11 at 09:33 +0100, Jelte Fennema-Nio wrote:

-   the subscriber's server log.
+   the subscriber's server log if you remove <literal>23505</literal> from
+   <xref linkend="guc-log-suppress-errcodes"/>.

This seems like a pretty big regression. Being able to know why your
replication got closed seems pretty critical.

The actual SQLSTATEs that get suppressed are subject to discussion
(an I have a gut feeling that some people will want the list empty).

As far as this specific functionality is concerned, I think that the
actual problem is a deficiency in PostgreSQL. The problem is that
the log is the *only* place where you can get this information. That
will be a problem for many people, even without "log_suppress_errcodes".

I think that this isformation should be available in some statistics
view.

Yours,
Laurenz Albe

#16Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Jelte Fennema-Nio (#14)
Re: Reducing the log spam

On Mon, 2024-03-11 at 09:33 +0100, Jelte Fennema-Nio wrote:

-   the subscriber's server log.
+   the subscriber's server log if you remove <literal>23505</literal> from
+   <xref linkend="guc-log-suppress-errcodes"/>.

This seems like a pretty big regression. Being able to know why your
replication got closed seems pretty critical.

Yes. But I'd argue that that is a shortcoming of logical replication:
there should be a ways to get this information via SQL. Having to look into
the log file is not a very useful option.

The feature will become much less useful if unique voilations keep getting logged.

Yours,
Laurenz Albe

#17Jelte Fennema-Nio
postgres@jeltef.nl
In reply to: Laurenz Albe (#16)
Re: Reducing the log spam

On Thu, 2 May 2024 at 12:47, Laurenz Albe <laurenz.albe@cybertec.at> wrote:

Yes. But I'd argue that that is a shortcoming of logical replication:
there should be a ways to get this information via SQL. Having to look into
the log file is not a very useful option.

Definitely agreed that accessing the error details using SQL would be
much better. But having no way at all (by default) to find the cause
of the failure is clearly much worse.

The feature will become much less useful if unique voilations keep getting logged.

Agreed. How about changing the patch so that the GUC is not applied to
logical replication apply workers (and document that accordingly). I
can think of two ways of achieving that (but there might be
other/better ones):
1. Set the GUC to empty string when an apply worker is started.
2. Change the newly added check in errcode() to only set
output_to_server to false when IsLogicalWorker() returns false.

#18Jelte Fennema-Nio
postgres@jeltef.nl
In reply to: Jelte Fennema-Nio (#17)
Re: Reducing the log spam

On Thu, 2 May 2024 at 13:08, Jelte Fennema-Nio <postgres@jeltef.nl> wrote:

2. Change the newly added check in errcode() to only set
output_to_server to false when IsLogicalWorker() returns false.

Actually a third, and probably even better solution would be to only
apply this new GUC to non-backgroundworker processes. That seems quite
reasonable, since often the only way to access background worker
errors is often through the logs.

#19Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Jelte Fennema-Nio (#18)
Re: Reducing the log spam

On Thu, 2024-05-02 at 13:11 +0200, Jelte Fennema-Nio wrote:

On Thu, 2 May 2024 at 13:08, Jelte Fennema-Nio <postgres@jeltef.nl> wrote:

2. Change the newly added check in errcode() to only set
output_to_server to false when IsLogicalWorker() returns false.

Actually a third, and probably even better solution would be to only
apply this new GUC to non-backgroundworker processes. That seems quite
reasonable, since often the only way to access background worker
errors is often through the logs.

That is a good idea. This version only suppresses error messages
in ordinary backend processes.

Yours,
Laurenz Albe

Attachments:

v3-0001-Add-parameter-log_suppress_errcodes.patchtext/x-patch; charset=UTF-8; name=v3-0001-Add-parameter-log_suppress_errcodes.patchDownload+184-1