Thread: Log details for stats dropped more than once
Hi hackers, While resuming the work on refilenode stats (mentioned in [1] but did not share the patch yet), I realized that my current POC patch is buggy enough to produce things like: 024-05-14 09:51:14.783 UTC [1788714] FATAL: can only drop stats once While the CONTEXT provides the list of dropped stats: 2024-05-14 09:51:14.783 UTC [1788714] CONTEXT: WAL redo at 0/D75F478 for Transaction/ABORT: 2024-05-14 09:51:14.782223+00;dropped stats: 2/16384/27512/0 2/16384/27515/0 2/16384/27516/0 It's not clear which one generates the error (don't pay attention to the actual values, the issue comes from the new refilenode stats that I removed from the output). Attached a tiny patch to report the stat that generates the error. The patch uses errdetail_internal() as the extra details don't seem to be useful to average users. [1]: https://www.postgresql.org/message-id/ZbIdgTjR2QcFJ2mE%40ip-10-97-1-34.eu-west-3.compute.internal Looking forward to your feedback, Regards -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
On Tue, May 14, 2024 at 10:07:14AM +0000, Bertrand Drouvot wrote: > While resuming the work on refilenode stats (mentioned in [1] but did not share > the patch yet), I realized that my current POC patch is buggy enough to produce > things like: > > 024-05-14 09:51:14.783 UTC [1788714] FATAL: can only drop stats once > > While the CONTEXT provides the list of dropped stats: > > 2024-05-14 09:51:14.783 UTC [1788714] CONTEXT: WAL redo at 0/D75F478 for Transaction/ABORT: 2024-05-14 09:51:14.782223+00;dropped stats: 2/16384/27512/0 2/16384/27515/0 2/16384/27516/0 Can refcount be useful to know in this errcontext? > Attached a tiny patch to report the stat that generates the error. The patch uses > errdetail_internal() as the extra details don't seem to be useful to average > users. I think that's fine. Overall that looks like useful information for debugging, so no objections from here. -- Michael
Attachment
Hi, On Wed, May 15, 2024 at 02:47:29PM +0900, Michael Paquier wrote: > On Tue, May 14, 2024 at 10:07:14AM +0000, Bertrand Drouvot wrote: > > While resuming the work on refilenode stats (mentioned in [1] but did not share > > the patch yet), I realized that my current POC patch is buggy enough to produce > > things like: > > > > 024-05-14 09:51:14.783 UTC [1788714] FATAL: can only drop stats once > > > > While the CONTEXT provides the list of dropped stats: > > > > 2024-05-14 09:51:14.783 UTC [1788714] CONTEXT: WAL redo at 0/D75F478 for Transaction/ABORT: 2024-05-14 09:51:14.782223+00;dropped stats: 2/16384/27512/0 2/16384/27515/0 2/16384/27516/0 > > Can refcount be useful to know in this errcontext? Thanks for looking at it! Do you mean as part of the added errdetail_internal()? If so, yeah I think it's a good idea (done that way in v2 attached). > > Attached a tiny patch to report the stat that generates the error. The patch uses > > errdetail_internal() as the extra details don't seem to be useful to average > > users. > > I think that's fine. Overall that looks like useful information for > debugging, so no objections from here. Thanks! BTW, I just realized that adding more details for this error has already been mentioned in [1] (and Andres did propose a slightly different version). [1]: https://www.postgresql.org/message-id/20240505160915.6boysum4f34siqct%40awork3.anarazel.de Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
On Wed, May 15, 2024 at 08:04:48AM +0000, Bertrand Drouvot wrote: > Thanks! BTW, I just realized that adding more details for this error has already > been mentioned in [1] (and Andres did propose a slightly different version). > > [1]: https://www.postgresql.org/message-id/20240505160915.6boysum4f34siqct%40awork3.anarazel.de Ah, it is not surprising. I'd agree with doing what is posted there for simplicity's sake. Rather than putting that in a errdetail, let's keep all the information in an errmsg() as that makes deparsing easier, and let's keep the elog(). -- Michael