Re: optimizing pg_upgrade's once-in-each-database steps - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: optimizing pg_upgrade's once-in-each-database steps
Date
Msg-id Zpg4WnS9BRWbgBLi@nathan
Whole thread Raw
In response to Re: optimizing pg_upgrade's once-in-each-database steps  (Daniel Gustafsson <daniel@yesql.se>)
Responses Re: optimizing pg_upgrade's once-in-each-database steps
List pgsql-hackers
On Wed, Jul 17, 2024 at 11:16:59PM +0200, Daniel Gustafsson wrote:
> First reaction after a read-through is that this seems really cool, can't wait
> to see how much v18 pg_upgrade will be over v17.  I will do more testing and
> review once back from vacation, below are some comments from reading which is
> all I had time for at this point:

Thanks for taking a look!

> +static void
> +conn_failure(PGconn *conn)
> +{
> +   pg_log(PG_REPORT, "%s", PQerrorMessage(conn));
> +   printf(_("Failure, exiting\n"));
> +   exit(1);
> +}
> 
> Any particular reason this isn't using pg_fatal()?

IIRC this was to match the error in connectToServer().  We could probably
move to pg_fatal().

> +static void
> +dispatch_query(const ClusterInfo *cluster, AsyncSlot *slot,
>     ....
> +   pg_free(query);
> +}
> 
> A minor point, perhaps fueled by me not having played around much with this
> patchset.  It seems a bit odd that dispatch_query is responsible for freeing
> the query from the get_query callback.  I would have expected the output from
> AsyncTaskGetQueryCB to be stored in AsyncTask and released by async_task_free.

I don't see any problem with doing it the way you suggest.

Tangentially related, I waffled a bit on whether to require the query
callbacks to put the result in allocated memory.  Some queries are the same
no matter what, and some require customization at runtime.  As you can see,
I ended up just requiring allocated memory.  That makes the code a tad
simpler, and I doubt the extra work is noticeable.

> +static void
> +sub_process(DbInfo *dbinfo, PGresult *res, void *arg)
> +{
>      ....
> +       fprintf(state->script, "The table sync state \"%s\" is not allowed for database:\"%s\" subscription:\"%s\"
schema:\"%s\"relation:\"%s\"\n",
 
> +               PQgetvalue(res, i, 0),
> +               dbinfo->db_name,
> +               PQgetvalue(res, i, 1),
> 
> With the query being in a separate place in the code from the processing it
> takes a bit of jumping around to resolve the columns in PQgetvalue calls like
> this.  Using PQfnumber() calls and descriptive names would make this easier.  I
> know this case is copying old behavior, but the function splits make it less
> useful than before.

Good point.

> +   char       *query = pg_malloc(QUERY_ALLOC);
> 
> Should we convert this to a PQExpBuffer?

Seems like a good idea.  I think I was trying to change as few lines as
possible for my proof-of-concept.  :)

-- 
nathan



pgsql-hackers by date:

Previous
From: Daniel Gustafsson
Date:
Subject: Re: optimizing pg_upgrade's once-in-each-database steps
Next
From: Joe Conway
Date:
Subject: Re: CI, macports, darwin version problems