Log details for stats dropped more than once
Hi hackers,
While resuming the work on refilenode stats (mentioned in [1]/messages/by-id/ZbIdgTjR2QcFJ2mE@ip-10-97-1-34.eu-west-3.compute.internal but did not share
the patch yet), I realized that my current POC patch is buggy enough to produce
things like:
024-05-14 09:51:14.783 UTC [1788714] FATAL: can only drop stats once
While the CONTEXT provides the list of dropped stats:
2024-05-14 09:51:14.783 UTC [1788714] CONTEXT: WAL redo at 0/D75F478 for Transaction/ABORT: 2024-05-14 09:51:14.782223+00; dropped stats: 2/16384/27512/0 2/16384/27515/0 2/16384/27516/0
It's not clear which one generates the error (don't pay attention to the actual
values, the issue comes from the new refilenode stats that I removed from the
output).
Attached a tiny patch to report the stat that generates the error. The patch uses
errdetail_internal() as the extra details don't seem to be useful to average
users.
[1]: /messages/by-id/ZbIdgTjR2QcFJ2mE@ip-10-97-1-34.eu-west-3.compute.internal
Looking forward to your feedback,
Regards
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v1-0001-Log-details-for-stats-dropped-more-than-once.patchtext/x-diff; charset=us-asciiDownload+5-2
On Tue, May 14, 2024 at 10:07:14AM +0000, Bertrand Drouvot wrote:
While resuming the work on refilenode stats (mentioned in [1] but did not share
the patch yet), I realized that my current POC patch is buggy enough to produce
things like:024-05-14 09:51:14.783 UTC [1788714] FATAL: can only drop stats once
While the CONTEXT provides the list of dropped stats:
2024-05-14 09:51:14.783 UTC [1788714] CONTEXT: WAL redo at 0/D75F478 for Transaction/ABORT: 2024-05-14 09:51:14.782223+00; dropped stats: 2/16384/27512/0 2/16384/27515/0 2/16384/27516/0
Can refcount be useful to know in this errcontext?
Attached a tiny patch to report the stat that generates the error. The patch uses
errdetail_internal() as the extra details don't seem to be useful to average
users.
I think that's fine. Overall that looks like useful information for
debugging, so no objections from here.
--
Michael
Hi,
On Wed, May 15, 2024 at 02:47:29PM +0900, Michael Paquier wrote:
On Tue, May 14, 2024 at 10:07:14AM +0000, Bertrand Drouvot wrote:
While resuming the work on refilenode stats (mentioned in [1] but did not share
the patch yet), I realized that my current POC patch is buggy enough to produce
things like:024-05-14 09:51:14.783 UTC [1788714] FATAL: can only drop stats once
While the CONTEXT provides the list of dropped stats:
2024-05-14 09:51:14.783 UTC [1788714] CONTEXT: WAL redo at 0/D75F478 for Transaction/ABORT: 2024-05-14 09:51:14.782223+00; dropped stats: 2/16384/27512/0 2/16384/27515/0 2/16384/27516/0
Can refcount be useful to know in this errcontext?
Thanks for looking at it!
Do you mean as part of the added errdetail_internal()? If so, yeah I think it's
a good idea (done that way in v2 attached).
Attached a tiny patch to report the stat that generates the error. The patch uses
errdetail_internal() as the extra details don't seem to be useful to average
users.I think that's fine. Overall that looks like useful information for
debugging, so no objections from here.
Thanks! BTW, I just realized that adding more details for this error has already
been mentioned in [1]/messages/by-id/20240505160915.6boysum4f34siqct@awork3.anarazel.de (and Andres did propose a slightly different version).
[1]: /messages/by-id/20240505160915.6boysum4f34siqct@awork3.anarazel.de
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v2-0001-Log-details-for-stats-dropped-more-than-once.patchtext/x-diff; charset=us-asciiDownload+6-2
On Wed, May 15, 2024 at 08:04:48AM +0000, Bertrand Drouvot wrote:
Thanks! BTW, I just realized that adding more details for this error has already
been mentioned in [1] (and Andres did propose a slightly different version).[1]: /messages/by-id/20240505160915.6boysum4f34siqct@awork3.anarazel.de
Ah, it is not surprising. I'd agree with doing what is posted there
for simplicity's sake. Rather than putting that in a errdetail, let's
keep all the information in an errmsg() as that makes deparsing
easier, and let's keep the elog().
--
Michael