Log details for stats dropped more than once

Started by Bertrand Drouvotalmost 2 years ago4 messageshackers
Jump to latest
#1Bertrand Drouvot
bertranddrouvot.pg@gmail.com

Hi hackers,

While resuming the work on refilenode stats (mentioned in [1]/messages/by-id/ZbIdgTjR2QcFJ2mE@ip-10-97-1-34.eu-west-3.compute.internal but did not share
the patch yet), I realized that my current POC patch is buggy enough to produce
things like:

024-05-14 09:51:14.783 UTC [1788714] FATAL: can only drop stats once

While the CONTEXT provides the list of dropped stats:

2024-05-14 09:51:14.783 UTC [1788714] CONTEXT: WAL redo at 0/D75F478 for Transaction/ABORT: 2024-05-14 09:51:14.782223+00; dropped stats: 2/16384/27512/0 2/16384/27515/0 2/16384/27516/0

It's not clear which one generates the error (don't pay attention to the actual
values, the issue comes from the new refilenode stats that I removed from the
output).

Attached a tiny patch to report the stat that generates the error. The patch uses
errdetail_internal() as the extra details don't seem to be useful to average
users.

[1]: /messages/by-id/ZbIdgTjR2QcFJ2mE@ip-10-97-1-34.eu-west-3.compute.internal

Looking forward to your feedback,

Regards

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v1-0001-Log-details-for-stats-dropped-more-than-once.patchtext/x-diff; charset=us-asciiDownload+5-2
#2Michael Paquier
michael@paquier.xyz
In reply to: Bertrand Drouvot (#1)
Re: Log details for stats dropped more than once

On Tue, May 14, 2024 at 10:07:14AM +0000, Bertrand Drouvot wrote:

While resuming the work on refilenode stats (mentioned in [1] but did not share
the patch yet), I realized that my current POC patch is buggy enough to produce
things like:

024-05-14 09:51:14.783 UTC [1788714] FATAL: can only drop stats once

While the CONTEXT provides the list of dropped stats:

2024-05-14 09:51:14.783 UTC [1788714] CONTEXT: WAL redo at 0/D75F478 for Transaction/ABORT: 2024-05-14 09:51:14.782223+00; dropped stats: 2/16384/27512/0 2/16384/27515/0 2/16384/27516/0

Can refcount be useful to know in this errcontext?

Attached a tiny patch to report the stat that generates the error. The patch uses
errdetail_internal() as the extra details don't seem to be useful to average
users.

I think that's fine. Overall that looks like useful information for
debugging, so no objections from here.
--
Michael

#3Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Michael Paquier (#2)
Re: Log details for stats dropped more than once

Hi,

On Wed, May 15, 2024 at 02:47:29PM +0900, Michael Paquier wrote:

On Tue, May 14, 2024 at 10:07:14AM +0000, Bertrand Drouvot wrote:

While resuming the work on refilenode stats (mentioned in [1] but did not share
the patch yet), I realized that my current POC patch is buggy enough to produce
things like:

024-05-14 09:51:14.783 UTC [1788714] FATAL: can only drop stats once

While the CONTEXT provides the list of dropped stats:

2024-05-14 09:51:14.783 UTC [1788714] CONTEXT: WAL redo at 0/D75F478 for Transaction/ABORT: 2024-05-14 09:51:14.782223+00; dropped stats: 2/16384/27512/0 2/16384/27515/0 2/16384/27516/0

Can refcount be useful to know in this errcontext?

Thanks for looking at it!

Do you mean as part of the added errdetail_internal()? If so, yeah I think it's
a good idea (done that way in v2 attached).

Attached a tiny patch to report the stat that generates the error. The patch uses
errdetail_internal() as the extra details don't seem to be useful to average
users.

I think that's fine. Overall that looks like useful information for
debugging, so no objections from here.

Thanks! BTW, I just realized that adding more details for this error has already
been mentioned in [1]/messages/by-id/20240505160915.6boysum4f34siqct@awork3.anarazel.de (and Andres did propose a slightly different version).

[1]: /messages/by-id/20240505160915.6boysum4f34siqct@awork3.anarazel.de

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v2-0001-Log-details-for-stats-dropped-more-than-once.patchtext/x-diff; charset=us-asciiDownload+6-2
#4Michael Paquier
michael@paquier.xyz
In reply to: Bertrand Drouvot (#3)
Re: Log details for stats dropped more than once

On Wed, May 15, 2024 at 08:04:48AM +0000, Bertrand Drouvot wrote:

Thanks! BTW, I just realized that adding more details for this error has already
been mentioned in [1] (and Andres did propose a slightly different version).

[1]: /messages/by-id/20240505160915.6boysum4f34siqct@awork3.anarazel.de

Ah, it is not surprising. I'd agree with doing what is posted there
for simplicity's sake. Rather than putting that in a errdetail, let's
keep all the information in an errmsg() as that makes deparsing
easier, and let's keep the elog().
--
Michael