Re: optimizing pg_upgrade's once-in-each-database steps - Mailing list pgsql-hackers

From Ilya Gladyshev
Subject Re: optimizing pg_upgrade's once-in-each-database steps
Date
Msg-id 3A3553B1-123D-4F11-9426-277F2E920AD7@gmail.com
Whole thread Raw
In response to Re: optimizing pg_upgrade's once-in-each-database steps  (Nathan Bossart <nathandbossart@gmail.com>)
List pgsql-hackers
LGTM in general, but here are some final nitpicks:

+ if (maxFd != 0)
+ (void) select(maxFd + 1, &input_mask, &output_mask, &except_mask, NULL);

It’s a good idea to check for the return value of select, in case it returns any errors.

+ dbs_complete++;
+ (void) PQgetResult(slot->conn);
+ PQfinish(slot->conn);

Perhaps it’s rather for me to understand, what does PQgetResult call do here?

+ /* Check for connection failure. */
+ if (PQconnectPoll(slot->conn) == PGRES_POLLING_FAILED)
+ pg_fatal("connection failure: %s", PQerrorMessage(slot->conn));
+
+ /* Check whether the connection is still establishing. */
+ if (PQconnectPoll(slot->conn) != PGRES_POLLING_OK)
+ return;

Are the two consecutive calls of PQconnectPoll intended here? Seems a bit wasteful, but maybe that’s ok.

We should probably mention this change in the docs as well, I found these two places that I think can be improved:

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 9877f2f01c..ad7aa33f07 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -118,7 +118,7 @@ PostgreSQL documentation
      <varlistentry>
       <term><option>-j <replaceable class="parameter">njobs</replaceable></option></term>
       <term><option>--jobs=<replaceable class="parameter">njobs</replaceable></option></term>
-      <listitem><para>number of simultaneous processes or threads to use
+      <listitem><para>number of simultaneous processes, threads or connections to use
       </para></listitem>
      </varlistentry>
 
@@ -587,8 +587,9 @@ NET STOP postgresql-&majorversion;
 
     <para>
      The <option>--jobs</option> option allows multiple CPU cores to be used
-     for copying/linking of files and to dump and restore database schemas
-     in parallel;  a good place to start is the maximum of the number of
+     for copying/linking of files, to dump and restore database schemas in
+     parallel and to use multiple simultaneous connections for upgrade checks;
+      a good place to start is the maximum of the number of
      CPU cores and tablespaces.  This option can dramatically reduce the
      time to upgrade a multi-database server running on a multiprocessor
      machine.
-- 


28 авг. 2024 г., в 22:46, Nathan Bossart <nathandbossart@gmail.com> написал(а):

I spent some time cleaning up names, comments, etc.  Barring additional
feedback, I'm planning to commit this stuff in the September commitfest so
that it has plenty of time to bake in the buildfarm.

--
nathan
<v10-0001-Introduce-framework-for-parallelizing-various-pg.patch><v10-0002-Use-pg_upgrade-s-new-parallel-framework-for-subs.patch><v10-0003-Use-pg_upgrade-s-new-parallel-framework-to-get-r.patch><v10-0004-Use-pg_upgrade-s-new-parallel-framework-to-get-l.patch><v10-0005-Use-pg_upgrade-s-new-parallel-framework-for-exte.patch><v10-0006-Use-pg_upgrade-s-new-parallel-framework-for-data.patch><v10-0007-Use-pg_upgrade-s-new-parallel-framework-for-isn-.patch><v10-0008-Use-pg_upgrade-s-new-parallel-framework-for-post.patch><v10-0009-Use-pg_upgrade-s-new-parallel-framework-for-poly.patch><v10-0010-Use-pg_upgrade-s-new-parallel-framework-for-WITH.patch><v10-0011-Use-pg_upgrade-s-parallel-framework-for-encoding.patch>

pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: Interrupts vs signals
Next
From: Noah Misch
Date:
Subject: Re: race condition in pg_class