Re: WIP: WAL prefetch (another approach) - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: WIP: WAL prefetch (another approach)
Date
Msg-id 20210409033703.GP6592@telsasoft.com
In response to Re: WIP: WAL prefetch (another approach)  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: WIP: WAL prefetch (another approach)  (Thomas Munro <thomas.munro@gmail.com>)
Re: WIP: WAL prefetch (another approach)  (Justin Pryzby <pryzby@telsasoft.com>)
List pgsql-hackers
Here are some small language fixes.

BTW, before beginning "recovery", PG syncs all the data dirs.
This can be slow, and it seems like the slowness is frequently due to file
metadata.  For example, that's an obvious consequence of an OS crash, after
which the page cache is empty.  I've made a habit of running find /zfs -ls |wc
to pre-warm it, which can take a little while, but then the recovery process
starts moments later.  I don't have any timing measurements, but I expect that
starting to stat() all data files as soon as possible would be a win.

commit cc9707de333fe8242607cde9f777beadc68dbf04
Author: Justin Pryzby <pryzbyj@telsasoft.com>
Date:   Thu Apr 8 10:43:14 2021 -0500

    WIP: doc review: Optionally prefetch referenced data in recovery.
    
    1d257577e08d3e598011d6850fd1025858de8c8c

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index bc4a8b2279..139dee7aa2 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3621,7 +3621,7 @@ include_dir 'conf.d'
         pool after that.  However, on file systems with a block size larger
         than
         <productname>PostgreSQL</productname>'s, prefetching can avoid a
-        costly read-before-write when a blocks are later written.
+        costly read-before-write when blocks are later written.
         The default is off.
        </para>
       </listitem>
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index 24cf567ee2..36e00c92c2 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -816,9 +816,7 @@
    prefetching mechanism is most likely to be effective on systems
    with <varname>full_page_writes</varname> set to
    <varname>off</varname> (where that is safe), and where the working
-   set is larger than RAM.  By default, prefetching in recovery is enabled
-   on operating systems that have <function>posix_fadvise</function>
-   support.
+   set is larger than RAM.  By default, prefetching in recovery is disabled.
   </para>
  </sect1>
 
diff --git a/src/backend/access/transam/xlogprefetch.c b/src/backend/access/transam/xlogprefetch.c
index 28764326bc..363c079964 100644
--- a/src/backend/access/transam/xlogprefetch.c
+++ b/src/backend/access/transam/xlogprefetch.c
@@ -31,7 +31,7 @@
  * stall; this is counted with "skip_fpw".
  *
  * The only way we currently have to know that an I/O initiated with
- * PrefetchSharedBuffer() has that recovery will eventually call ReadBuffer(),
+ * PrefetchSharedBuffer() has that recovery will eventually call ReadBuffer(), XXX: what ??
  * and perform a synchronous read.  Therefore, we track the number of
  * potentially in-flight I/Os by using a circular buffer of LSNs.  When it's
  * full, we have to wait for recovery to replay records so that the queue
@@ -660,7 +660,7 @@ XLogPrefetcherScanBlocks(XLogPrefetcher *prefetcher)
             /*
              * I/O has possibly been initiated (though we don't know if it was
              * already cached by the kernel, so we just have to assume that it
-             * has due to lack of better information).  Record this as an I/O
+             * was due to lack of better information).  Record this as an I/O
              * in progress until eventually we replay this LSN.
              */
             XLogPrefetchIncrement(&SharedStats->prefetch);
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 090abdad8b..8c72ba1f1a 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -2774,7 +2774,7 @@ static struct config_int ConfigureNamesInt[] =
     {
         {"wal_decode_buffer_size", PGC_POSTMASTER, WAL_ARCHIVE_RECOVERY,
             gettext_noop("Maximum buffer size for reading ahead in the WAL during recovery."),
-            gettext_noop("This controls the maximum distance we can read ahead n the WAL to prefetch referenced blocks."),
+            gettext_noop("This controls the maximum distance we can read ahead in the WAL to prefetch referenced blocks."),
             GUC_UNIT_BYTE
         },
         &wal_decode_buffer_size,
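
For readers following the xlogprefetch.c comment in the hunk above, here is a
rough sketch of the "circular buffer of LSNs" idea: each initiated prefetch is
recorded with the LSN of the record that referenced the block, and it is
treated as finished once replay has reached that LSN.  All names here (Lsn,
InflightQueue, QUEUE_SIZE, the queue_* functions) are hypothetical, and the
completion rule (LSN <= replayed LSN) is an assumption of the sketch; the
actual data structure in the patch may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t Lsn;           /* hypothetical stand-in for a WAL position */

#define QUEUE_SIZE 4            /* one slot stays unused, so 3 I/Os in flight */

typedef struct InflightQueue
{
    Lsn     lsns[QUEUE_SIZE];
    int     head;               /* next slot to write */
    int     tail;               /* oldest in-flight entry */
} InflightQueue;

/* Number of I/Os currently assumed to be in flight. */
int
queue_inflight(const InflightQueue *q)
{
    return (q->head - q->tail + QUEUE_SIZE) % QUEUE_SIZE;
}

/*
 * Record a prefetch initiated for a block referenced at "lsn".  Returns
 * false when the queue is full, meaning the caller has to wait for replay
 * to advance before issuing more prefetches.
 */
bool
queue_push(InflightQueue *q, Lsn lsn)
{
    int     next = (q->head + 1) % QUEUE_SIZE;

    if (next == q->tail)
        return false;
    q->lsns[q->head] = lsn;
    q->head = next;
    return true;
}

/*
 * Replay has reached "replayed_lsn": retire every queued I/O at or before
 * that position and return how many were retired.
 */
int
queue_complete_up_to(InflightQueue *q, Lsn replayed_lsn)
{
    int     retired = 0;

    while (q->tail != q->head && q->lsns[q->tail] <= replayed_lsn)
    {
        q->tail = (q->tail + 1) % QUEUE_SIZE;
        retired++;
    }
    return retired;
}
```

The point of the design is that nothing reports I/O completion directly, so
replay progress is used as a conservative proxy for it.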


