Re: optimizing pg_upgrade's once-in-each-database steps - Mailing list pgsql-hackers

From Ilya Gladyshev
Subject Re: optimizing pg_upgrade's once-in-each-database steps
Date
Msg-id 60cbc507-4c36-4037-9af7-028b1c8455b5@gmail.com
Whole thread Raw
In response to Re: optimizing pg_upgrade's once-in-each-database steps  (Nathan Bossart <nathandbossart@gmail.com>)
Responses Re: optimizing pg_upgrade's once-in-each-database steps
List pgsql-hackers
On 01.08.2024 22:41, Nathan Bossart wrote:
> Here is a new patch set.  Besides rebasing, I've added the recursive call
> to process_slot() mentioned in the quoted text, and I've added quite a bit
> of commentary to async.c.
That's much better now, thanks! Here's my code review, note that I 
haven't tested the patches yet:

+void
+async_task_add_step(AsyncTask *task,
+                    AsyncTaskGetQueryCB query_cb,
+                    AsyncTaskProcessCB process_cb, bool free_result,
+                    void *arg)

Is there any reason to have query as a callback function instead of char 
*? From what I see right now, it doesn't give any extra flexibility, as 
the query has to be static anyway (can't be customized on a per-database 
basis) and will be created once before all the callbacks are run. While 
passing in char * makes the API simpler, excludes any potential error of 
making the query dependent on the current database and removes the 
unnecessary malloc/free of the static strings.

+static void
+dispatch_query(const ClusterInfo *cluster, AsyncSlot *slot,
+               const AsyncTask *task)
+{
+    ...
+    if (!PQsendQuery(slot->conn, cbs->query))
+        conn_failure(slot->conn);
+}

This will print "connection failure: connection pointer is NULL", which 
I don't think makes a lot of sense to the end user. I'd prefer something 
like pg_fatal("failed to allocate a new connection").

      if (found)
-        pg_fatal("Data type checks failed: %s", report.data);
+    {
+        pg_fatal("Data type checks failed: %s", 
data_type_check_report.data);
+        termPQExpBuffer(&data_type_check_report);
+    }

`found` should be removed and replaced with `data_type_check_failed`, as 
it's not set anymore. Also the termPQExpBuffer after pg_fatal looks 
unnecessary.

+static bool *data_type_check_results;
+static bool data_type_check_failed;
+static PQExpBufferData data_type_check_report;

IMO, it would be nicer to have these as a local state, that's passed in 
as an arg* to the AsyncTaskProcessCB, which aligns with how the other 
checks do it.

-- End of review --

Regarding keeping the connections, the way I envisioned it entailed 
passing a list of connections from one check to the next one (or keeping 
a global state with connections?). I didn't concretely look at the code 
to verify this, so it's just an abstract idea.





pgsql-hackers by date:

Previous
From: Jeff Davis
Date:
Subject: Re: Comments on Custom RMGRs
Next
From: "Andrey M. Borodin"
Date:
Subject: Re: Thoughts about NUM_BUFFER_PARTITIONS