Home > mailing lists

Re: Add parallelism and glibc dependent only options to reindexdb - Mailing list pgsql-hackers

From	Michael Paquier
Subject	Re: Add parallelism and glibc dependent only options to reindexdb
Date	July 11, 2019 16:34:17
Msg-id	20190711133417.GK4500@paquier.xyz Whole thread Raw
In response to	Re: Add parallelism and glibc dependent only options to reindexdb (Julien Rouhaud <rjuju123@gmail.com>)
Responses	Re: Add parallelism and glibc dependent only options to reindexdb (Julien Rouhaud <rjuju123@gmail.com>)
List	pgsql-hackers

Tree view

On Thu, Jul 11, 2019 at 11:48:20AM +0200, Julien Rouhaud wrote:
> On Thu, Jul 11, 2019 at 6:04 AM Michael Paquier <michael@paquier.xyz> wrote:
>> Get* would be my choice, because we look at the set of parallel slots,
>> and get an idle one.
>
> That's what the former GetIdleSlot (that I renamed to get_idle_slot as
> it's not static) is doing.  ConsumeIdleSlot() actually get an idle
> slot and mark it as being used.  That's probably some leakage of
> internal implementation, but having a GetIdleParallelSlot (or
> ParallelSlotGetIdle) *and* a get_idle_slot sounds like a bad idea, and
> I don't have a better idea on how to rename get_idle_slot.  If you
> really prefer Get* and you're fine with a static get_idle_slot, that's
> fine by me.

Do we actually need get_idle_slot?  ConsumeIdleSlot is its only
caller.

>> and we'd
>> need to be careful of the same where pg_catalog is listed.  Actually,
>> your patch breaks if we do a parallel run with pg_catalog and another
>> schema, no?
>
> It can definitely cause problems if you ask for pg_catalog and other
> schema, same as if you ask a parallel reindex of some catalog tables
> (possibly with other tables).  There's a --system switch for that
> need, so I don't know if documenting the limitation to avoid some
> extra code to deal with it is a good enough solution?

vacuumdb --full still has limitations in this area and we had some
reports on the matter about this behavior being annoying.  Its
documentation also mentions that mixing catalog relations with  --full
can cause deadlocks.

Documenting it may be fine at the end, but my take is that it would be
nice to make sure that we don't have deadlocks if we can avoid them
easily.  It is also a matter of balance.  If for example the patch
gets 3 times bigger in size because of that we may have an argument
for not doing it and keep the code simple.  What do people think about
that?  I would be nice to get more opinions here.

And while scanning the code...

+ * getQuerySucess
Typo here.

- * Pump the conn till it's dry of results; return false if any are errors.
This comment could be improved on the way, like "Go through all the
connections and make sure to consume any remaining results.  If any
error is found, false is returned after processing all the parallel
slots."
--
Michael

Attachment

signature.asc

pgsql-hackers by date:

From: Thomas Munro
Date: 11 July 2019, 15:49:22
Subject: Re: [RFC] [PATCH] Flexible "partition pruning" hook

From: Michael Paquier
Date: 11 July 2019, 16:40:14
Subject: Re: complier warnings from ecpg tests

Re: Add parallelism and glibc dependent only options to reindexdb - Mailing list pgsql-hackers

Attachment

Previous

Next