Re: Add parallelism and glibc dependent only options to reindexdb - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: Add parallelism and glibc dependent only options to reindexdb
Date
Msg-id 20190723073801.GD2059@paquier.xyz
Whole thread Raw
In response to Re: Add parallelism and glibc dependent only options to reindexdb  (Julien Rouhaud <rjuju123@gmail.com>)
Responses Re: Add parallelism and glibc dependent only options to reindexdb  (Julien Rouhaud <rjuju123@gmail.com>)
List pgsql-hackers
On Mon, Jul 22, 2019 at 02:40:19PM +0200, Julien Rouhaud wrote:
> Right, so I kept the long option.  Also this comment was outdated, as
> the --jobs is now just ignored with a list of indexes, so I fixed that
> too.

+   if (!parallel)
+   {
+       if (user_list == NULL)
+       {
+           /*
+            * Create a dummy list with an empty string, as user requires an
+            * element.
+            */
+           process_list = pg_malloc0(sizeof(SimpleStringList));
+           simple_string_list_append(process_list, "");
+       }
+   }
This part looks a bit hacky and this is needed because we don't have a
list of objects when doing a non-parallel system or database reindex.
The deal is that we just want a list with one element: the database of
the connection.  Wouldn't it be more natural to assign the database
name here using PQdb(conn)?  Then add an assertion at the top of
run_reindex_command() checking for a non-NULL name?

> I considered this, but it would require to adapt all code that declare
> SimpleStringList stack variable (vacuumdb.c, clusterdb.c,
> createuser.c, pg_dumpall.c and pg_dump.c), so it looked like too much
> trouble to avoid 2 local variables here and 1 in vacuumdb.c.  I don't
> have a strong opinion here, so I can go for it if you prefer.

Okay.  This is a tad wider than the original patch proposal, and this
adds two lines.  So let's discard that for now and keep it simple.

>> +$node->issues_sql_like([qw(reindexdb -j2)],
>> +   qr/statement: REINDEX TABLE public.test1/,
>> +   'Global and parallel reindex will issue per-table REINDEX');
>> Would it make sense to have some tests for schemas here?
>>
>> One of my comments in [1] has not been answered.  What about
>> the decomposition of a list of schemas into a list of tables when
>> using the parallel mode?
>
> I did that in attached v6, and added suitable regression tests.

The two tests for "reindexdb -j2" can be combined into a single call,
checking for both commands to be generated in the same output.  The
second command triggering a reindex on two schemas cannot be used to
check for the generation of both s1.t1 and s2.t2 as the ordering may
not be guaranteed.  The commands arrays also looked inconsistent with
the rest.  Attached is an updated patch with some format modifications
and the fixes for the tests.
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: Fabien COELHO
Date:
Subject: Re: make \d pg_toast.foo show its indices ; and, \d toast show itsmain table ; and \d relkind=I show its partitions
Next
From: Fabien COELHO
Date:
Subject: Re: pgbench - add pseudo-random permutation function