RE: pg_upgrade: Pass -j down to vacuumdb - Mailing list pgsql-hackers

From Jamison, Kirk
Subject RE: pg_upgrade: Pass -j down to vacuumdb
Date
Msg-id D09B13F772D2274BB348A310EE3027C642DA10@g01jpexmbkw24
Whole thread Raw
In response to Re: pg_upgrade: Pass -j down to vacuumdb  (Jesper Pedersen <jesper.pedersen@redhat.com>)
List pgsql-hackers
Hi, 

On February 1, 2019 8:14 PM +0000, Jesper Pedersen wrote:

> On 2/1/19 4:58 AM, Alvaro Herrera wrote:
>> On 2019-Feb-01, Jamison, Kirk wrote:
>>> I'm not sure if misunderstood the purpose of $VACUUMDB_OPTS. I 
>>> thought what the other developers suggested is to utilize it so that 
>>> --jobs for vacuumdb would be optional and just based from user-specified option $VACUUMDB_OPTS.
>>> By which it would not utilize the amount of jobs used for pg_upgrade.
>>> To address your need of needing a parallel, the script would just 
>>> then suggest if the user wants a --jobs option. The if/else for 
>>> number of jobs is not needed in that case, maybe only for ifndef WIN32 else case.
>> 
>> No Kirk, you got it right -- (some of) those ifdefs are not needed, as 
>> adding the --jobs in the command line is not needed.  I do think some 
>> extra whitespace in the format string is needed.


> The point of the patch is to highlight to the user that even though he/she does "pg_upgrade -j 8" the "-j 8" option
isn'tpassed down to vacuumdb.
 
> Default value is 1, so in that case the echo command doesn't highlight that fact, otherwise it is. The user can then
cancelthe script and use the suggested command line.
 
> The script then executes the vaccumdb command with the $VACUUMDB_OPTS argument as noted in the documentation.
> Sample script attached as well.

Sorry I think I might have understood now where you are coming from,
where your script clarifies that it's not necessarily passed down.
If committers can let it pass, maybe it's necessary to highlight in the script,
Something like:
"If you would like default statistics as quickly as possible (e.g. run in parallel)..."

The difference is still perhaps your use of --jobs option
as somehow "explicitly" embedded in the code, compared to the suggestion of
making it optional by using the $VACUUMDB_OPTS variable, so that
jobs need not to be explicitly written, unless the user wants to.
(which I also think it's a better improvement because it's optional)

Some quick code check. sorry, don't have much time to write a patch for it today.
Perhaps you could write it similar to what you did in the --analyze-in-stages part.
Maybe something like (not sure if correct):
+    fprintf(script, "echo %s    \"%s/vacuumdb\" %s $VACUUMDB_OPTS --all %s%s\n", ECHO_QUOTE,
+            new_cluster.bindir, user_specification.data,
+            /* Did we copy the free space files? */
+            (GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
+            "--analyze-only" : "--analyze", ECHO_QUOTE);

I'll continue to review though from time to time.

Regards,
Kirk Jamison

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: log bind parameter values on error
Next
From: Michael Paquier
Date:
Subject: Re: Using POPCNT and other advanced bit manipulation instructions