Re: Add parallelism and glibc dependent only options to reindexdb - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: Add parallelism and glibc dependent only options to reindexdb
Date
Msg-id 20190717075902.GA20614@paquier.xyz
Whole thread Raw
In response to Re: Add parallelism and glibc dependent only options to reindexdb  (Julien Rouhaud <rjuju123@gmail.com>)
Responses Re: Add parallelism and glibc dependent only options to reindexdb  (Julien Rouhaud <rjuju123@gmail.com>)
List pgsql-hackers
On Tue, Jul 16, 2019 at 02:03:16PM +0200, Julien Rouhaud wrote:
> After more thinking about schema and multiple jobs, I think that
> erroring out is quite user unfriendly, as it's entirely ok to ask
> for
> multiple indexes and multiple object that do support parallelism in
> a
> single call.  So I think it's better to remove the error, ignore the
> given --jobs options for indexes and document this behavior.

No objections to that.  I still need to study a bit more 0002 though
to come to a clear conclusion.

Actually, from patch 0002:
+       free_slot = ParallelSlotsGetIdle(slots, concurrentCons, progname);
+       if (!free_slot)
+       {
+           failed = true;
+           goto finish;
+       }
+
+       run_reindex_command(conn, process_type, objname, progname, echo,
+                           verbose, concurrently, true);
The same connection gets reused, shouldn't the connection come from
the free slot?

On top of that quick lookup, I have done an in-depth review on 0001 to
bring it to a committable state, fixing a couple of typos, incorrect
comments (description of ParallelSlotsGetIdle was for example
incorrect) on the way.  Other things include that connectDatabase
should have an assertion for a non-NULL connection, calling pg_free()
on the slots terminate is more consistent as pg_malloc is used first.
A comment at the top of processQueryResult still referred to
vacuuming of a missing relation.  Most of the patch was in a clean
state, with a clear interface for parallel slots, the place of the new
routines also makes sense, so I did not have much to do :)

Another thing I have noticed is that we don't really need to pass down
progname across all those layers as we finish by using pg_log_error()
when processing results, so more simplifications can be done.  Let's
handle all that in the same patch as we are messing with the area.
connectDatabase() and connectMaintenanceDatabase() still need it
though as this is used in the connection string, so
ParallelSlotsSetup() also needs it.  This part is not really your
fault but as I am looking at it, it does not hurt to clean up what we
can.  Attached is an updated version of 0001 that I am comfortable
with.  I'd like to commit that with the cleanups included and then
let's move to the real deal with 0002.
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: Fabien COELHO
Date:
Subject: Re: refactoring - share str2*int64 functions
Next
From: Michael Paquier
Date:
Subject: Re: pg_receivewal documentation