Thread: Log details for stats dropped more than once

Log details for stats dropped more than once

From
Bertrand Drouvot
Date:
Hi hackers,

While resuming the work on refilenode stats (mentioned in [1] but did not share
the patch yet), I realized that my current POC patch is buggy enough to produce
things like:

024-05-14 09:51:14.783 UTC [1788714] FATAL:  can only drop stats once

While the CONTEXT provides the list of dropped stats:

2024-05-14 09:51:14.783 UTC [1788714] CONTEXT:  WAL redo at 0/D75F478 for Transaction/ABORT: 2024-05-14
09:51:14.782223+00;dropped stats: 2/16384/27512/0 2/16384/27515/0 2/16384/27516/0
 

It's not clear which one generates the error (don't pay attention to the actual
values, the issue comes from the new refilenode stats that I removed from the
output).

Attached a tiny patch to report the stat that generates the error. The patch uses
errdetail_internal() as the extra details don't seem to be useful to average
users.

[1]: https://www.postgresql.org/message-id/ZbIdgTjR2QcFJ2mE%40ip-10-97-1-34.eu-west-3.compute.internal

Looking forward to your feedback,

Regards

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

Re: Log details for stats dropped more than once

From
Michael Paquier
Date:
On Tue, May 14, 2024 at 10:07:14AM +0000, Bertrand Drouvot wrote:
> While resuming the work on refilenode stats (mentioned in [1] but did not share
> the patch yet), I realized that my current POC patch is buggy enough to produce
> things like:
>
> 024-05-14 09:51:14.783 UTC [1788714] FATAL:  can only drop stats once
>
> While the CONTEXT provides the list of dropped stats:
>
> 2024-05-14 09:51:14.783 UTC [1788714] CONTEXT:  WAL redo at 0/D75F478 for Transaction/ABORT: 2024-05-14
09:51:14.782223+00;dropped stats: 2/16384/27512/0 2/16384/27515/0 2/16384/27516/0 

Can refcount be useful to know in this errcontext?

> Attached a tiny patch to report the stat that generates the error. The patch uses
> errdetail_internal() as the extra details don't seem to be useful to average
> users.

I think that's fine.  Overall that looks like useful information for
debugging, so no objections from here.
--
Michael

Attachment

Re: Log details for stats dropped more than once

From
Bertrand Drouvot
Date:
Hi,

On Wed, May 15, 2024 at 02:47:29PM +0900, Michael Paquier wrote:
> On Tue, May 14, 2024 at 10:07:14AM +0000, Bertrand Drouvot wrote:
> > While resuming the work on refilenode stats (mentioned in [1] but did not share
> > the patch yet), I realized that my current POC patch is buggy enough to produce
> > things like:
> > 
> > 024-05-14 09:51:14.783 UTC [1788714] FATAL:  can only drop stats once
> > 
> > While the CONTEXT provides the list of dropped stats:
> > 
> > 2024-05-14 09:51:14.783 UTC [1788714] CONTEXT:  WAL redo at 0/D75F478 for Transaction/ABORT: 2024-05-14
09:51:14.782223+00;dropped stats: 2/16384/27512/0 2/16384/27515/0 2/16384/27516/0
 
> 
> Can refcount be useful to know in this errcontext?

Thanks for looking at it!

Do you mean as part of the added errdetail_internal()? If so, yeah I think it's
a good idea (done that way in v2 attached).

> > Attached a tiny patch to report the stat that generates the error. The patch uses
> > errdetail_internal() as the extra details don't seem to be useful to average
> > users.
> 
> I think that's fine.  Overall that looks like useful information for
> debugging, so no objections from here.

Thanks! BTW, I just realized that adding more details for this error has already
been mentioned in [1] (and Andres did propose a slightly different version).

[1]: https://www.postgresql.org/message-id/20240505160915.6boysum4f34siqct%40awork3.anarazel.de

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

Re: Log details for stats dropped more than once

From
Michael Paquier
Date:
On Wed, May 15, 2024 at 08:04:48AM +0000, Bertrand Drouvot wrote:
> Thanks! BTW, I just realized that adding more details for this error has already
> been mentioned in [1] (and Andres did propose a slightly different version).
>
> [1]: https://www.postgresql.org/message-id/20240505160915.6boysum4f34siqct%40awork3.anarazel.de

Ah, it is not surprising.  I'd agree with doing what is posted there
for simplicity's sake.  Rather than putting that in a errdetail, let's
keep all the information in an errmsg() as that makes deparsing
easier, and let's keep the elog().
--
Michael

Attachment