Re: Flush pgstats file during checkpoints - Mailing list pgsql-hackers

From Bertrand Drouvot
Subject Re: Flush pgstats file during checkpoints
Date
Msg-id Zp4D1UvyvfTUdSpH@ip-10-97-1-34.eu-west-3.compute.internal
Whole thread Raw
In response to Re: Flush pgstats file during checkpoints  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Flush pgstats file during checkpoints
List pgsql-hackers
Hi,

On Wed, Jul 17, 2024 at 12:52:12PM +0900, Michael Paquier wrote:
> On Tue, Jul 16, 2024 at 10:37:39AM +0900, Michael Paquier wrote:
> > On Fri, Jul 12, 2024 at 01:01:19PM +0000, Bertrand Drouvot wrote:
> >> Instead of removing the stat file, should we keep it around until the first call
> >> to pgstat_write_statsfile()?
> >
> > Oops.  You are right, I have somewhat missed the unlink() once we are
> > done reading the stats file with a correct redo location.
> 
> The durable_rename() part has been applied.  Please find attached a
> rebase of the patch set with all the other comments addressed.

Thanks!

Looking at 0001:

1 ===

-               pgstat_write_statsfile();
+               pgstat_write_statsfile(GetRedoRecPtr());

Not related with your patch but this comment in the GetRedoRecPtr() function:

     * grabbed a WAL insertion lock to read the authoritative value in
     * Insert->RedoRecPtr

sounds weird. Should'nt that be s/Insert/XLogCtl/?

2 ===

+       /* Write the redo LSN, used to cross check the file loaded */

Nit: s/loaded/read/?

3 ===

+       /*
+        * Read the redo LSN stored in the file.
+        */
+       if (!read_chunk_s(fpin, &file_redo) ||
+               file_redo != redo)
+               goto error;

I wonder if it would make sense to have dedicated error messages for
"file_redo != redo" and for "format_id != PGSTAT_FILE_FORMAT_ID". That would
ease to diagnose as to why the stat file is discarded.

Looking at 0002:

LGTM

Looking at 0003:

4 ===

@@ -5638,10 +5634,7 @@ StartupXLOG(void)
         * TODO: With a bit of extra work we could just start with a pgstat file
         * associated with the checkpoint redo location we're starting from.
         */
-       if (didCrash)
-               pgstat_discard_stats();
-       else
-               pgstat_restore_stats(checkPoint.redo);
+       pgstat_restore_stats(checkPoint.redo)

remove the TODO comment?

5 ===

+ * process) if the stats file has a redo LSN that matches with the .

unfinished sentence?

6 ===

- * Should only be called by the startup process or in single user mode.
+ * This is called by the checkpointer or in single-user mode.
  */
 void
-pgstat_discard_stats(void)
+pgstat_flush_stats(XLogRecPtr redo)
 {

Would that make sense to add an Assert in pgstat_flush_stats()? (checking what 
the above comment states).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Slow catchup of 2PC (twophase) transactions on replica in LR
Next
From: "Li, Yong"
Date:
Subject: Re: Make COPY format extendable: Extract COPY TO format implementations