Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation) - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation) |
Date | |
Msg-id | CA+TgmoZTfC7WeYNfXHKXjwZGMd7zKYEbf93J+OQJrdQeRH2u+g@mail.gmail.com |
In response to | Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation) (Peter Geoghegan <pg@bowt.ie>) |
Responses |
Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation) |
List | pgsql-hackers |
On Wed, Jan 17, 2018 at 7:00 PM, Peter Geoghegan <pg@bowt.ie> wrote:
> There seems to be some yak shaving involved in getting the barrier
> abstraction to do exactly what is required, as Thomas went into at the
> time. How should that prerequisite work be structured? For example,
> should a patch be spun off for that part?
>
> I may not be the most qualified person for this job, since Thomas
> considered two alternative approaches (to making the static barrier
> abstraction forget about never-launched participants) without ever
> settling on one of them.

I had forgotten about the previous discussion. The sketch in my
previous email supposed that we would use dynamic barriers since the
whole point, after all, is to handle the fact that we don't know how
many participants will really show up. Thomas's idea seems to be that
the leader will initialize the barrier based on the anticipated number
of participants and then tell it to forget about the participants that
don't materialize. Of course, that would require that the leader
somehow figure out how many participants didn't show up so that it can
deduct them from the counter in the barrier. And how is it going to do
that?

It's true that the leader will know the value of nworkers_launched,
but as the comment in LaunchParallelWorkers() says: "The caller must
be able to tolerate ending up with fewer workers than expected, so
there is no need to throw an error here if registration fails. It
wouldn't help much anyway, because registering the worker in no way
guarantees that it will start up and initialize successfully." So it
seems to me that a much better plan than having the leader try to
figure out how many workers failed to launch would be to just keep a
count of how many workers did in fact launch. The count can be stored
in shared memory, and each worker that comes along can increment it.
Then we don't have to worry about whether we accurately detect failure
to launch. We can argue about whether it's possible to detect all
cases of failure to launch unerringly, but what's for sure is that if
a worker increments a counter in shared memory, it launched.

Now, where should this counter be located? There are of course
multiple possibilities, but in my sketch it goes in
some_barrier_variable->nparticipants, i.e. we just use a dynamic
barrier. So my position (at least until Thomas or Andres shows up and
tells me why I'm wrong) is that you can use the Barrier API just as it
is without any yak-shaving, just by following the sketch I set out
before. The additional API I proposed in that sketch isn't really
required, although it might be more efficient. But it doesn't really
matter: if that comes along later, it will be trivial to adjust the
code to take advantage of it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
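
To make the counting idea above concrete, here is a minimal standalone sketch. It is not PostgreSQL code and does not use the Barrier API: threads stand in for parallel workers, a stack variable stands in for shared memory, and the names SharedSortState, nlaunched, worker_main and count_launched are all hypothetical. The only point it tries to show is that the leader never has to work out which workers failed to start; it just reads a counter that each worker bumps once it is really running.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define MAX_WORKERS 8

/* Hypothetical stand-in for state kept in shared memory. */
typedef struct SharedSortState
{
    atomic_int  nlaunched;      /* workers that actually started running */
} SharedSortState;

/* A worker that really starts announces itself by bumping the counter. */
static void *
worker_main(void *arg)
{
    SharedSortState *shared = arg;

    atomic_fetch_add(&shared->nlaunched, 1);
    /* ... the worker's share of the sort would happen here ... */
    return NULL;
}

/*
 * Plan for nplanned workers, of which only nstarted ever start (the rest
 * model workers that were requested but never materialized).  Returns the
 * number of workers observed through the shared counter.
 */
static int
count_launched(int nplanned, int nstarted)
{
    SharedSortState shared;
    pthread_t   threads[MAX_WORKERS];
    int         ncreated = 0;
    int         i;

    assert(nstarted >= 0 && nstarted <= nplanned && nplanned <= MAX_WORKERS);
    atomic_init(&shared.nlaunched, 0);

    /* Workers beyond nstarted silently never show up. */
    for (i = 0; i < nstarted; i++)
    {
        if (pthread_create(&threads[ncreated], NULL, worker_main, &shared) == 0)
            ncreated++;
    }

    /* Joining stands in for the leader waiting for its workers to finish. */
    for (i = 0; i < ncreated; i++)
        pthread_join(threads[i], NULL);

    /* No failure detection needed: the counter holds who really launched. */
    return atomic_load(&shared.nlaunched);
}
```

With this sketch, count_launched(4, 2) should return 2 as long as thread creation succeeds: the two workers that never started simply leave no trace in the counter, so nothing has to be deducted afterwards. In the approach described in the email, the same role would be played by the participant count of a dynamic barrier rather than by a separate counter.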