Thread: pgsql: Handle lack of DSM slots in parallel btree build.

pgsql: Handle lack of DSM slots in parallel btree build.

From
Thomas Munro
Date:
Handle lack of DSM slots in parallel btree build.

If no DSM slots are available, a ParallelContext can still be
created, but its seg pointer is NULL.  Teach parallel btree build
to cope with that by falling back to a regular non-parallel build,
to avoid crashing with a segmentation fault.

Back-patch to 11, where parallel CREATE INDEX landed.

Reported-by: Nicola Contu
Reviewed-by: Peter Geoghegan
Discussion: https://postgr.es/m/CA%2BhUKGJgJEBnkuODBVomyK3MWFvDBbMVj%3Dgdt6DnRPU-5sQ6UQ%40mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/74618e77b43cfce670b4725d5b9a300a2afd12d1

Modified Files
--------------
src/backend/access/nbtree/nbtsort.c | 9 +++++++++
1 file changed, 9 insertions(+)


Re: pgsql: Handle lack of DSM slots in parallel btree build.

From
Peter Geoghegan
Date:
On Thu, Jan 30, 2020 at 2:34 PM Thomas Munro <tmunro@postgresql.org> wrote:
> Handle lack of DSM slots in parallel btree build.
>
> If no DSM slots are available, a ParallelContext can still be
> created, but its seg pointer is NULL.  Teach parallel btree build
> to cope with that by falling back to a regular non-parallel build,
> to avoid crashing with a segmentation fault.

Uh, this seems to have completely disabled parallel index builds on
the master branch.

-- 
Peter Geoghegan



Re: pgsql: Handle lack of DSM slots in parallel btree build.

From
Thomas Munro
Date:
Hi Peter,

On Tue, Feb 4, 2020 at 9:13 AM Peter Geoghegan <pg@bowt.ie> wrote:
> On Thu, Jan 30, 2020 at 2:34 PM Thomas Munro <tmunro@postgresql.org> wrote:
> > Handle lack of DSM slots in parallel btree build.
> >
> > If no DSM slots are available, a ParallelContext can still be
> > created, but its seg pointer is NULL.  Teach parallel btree build
> > to cope with that by falling back to a regular non-parallel build,
> > to avoid crashing with a segmentation fault.
>
> Uh, this seems to have completely disabled parallel index builds on
> the master branch.

Oops.  The check needs to move down below InitializeParallelDSM(), and
release any extra resources that might have been acquired (snapshot?).
I will do some testing in a few hours and post a fix.



Re: pgsql: Handle lack of DSM slots in parallel btree build.

From
Thomas Munro
Date:
On Tue, Feb 4, 2020 at 10:03 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> On Tue, Feb 4, 2020 at 9:13 AM Peter Geoghegan <pg@bowt.ie> wrote:
> > Uh, this seems to have completely disabled parallel index builds on
> > the master branch.
>
> Oops.  The check needs to move down below InitializeParallelDSM(), and
> release any extra resources that might have been acquired (snapshot?).
> I will do some testing in a few hours and post a fix.

Here is take 2.  I tested CI and CIC, and verified that workers are
started or not depending on dsm_create() failure, using the attached
fault injector patch.  (Fuzzing the regression tests repeatedly with
that applied also revealed similar problems elsewhere in the tree,
which I'll write about in a new thread.)

Attachment

Re: pgsql: Handle lack of DSM slots in parallel btree build.

From
Peter Geoghegan
Date:
On Tue, Feb 4, 2020 at 2:18 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> Here is take 2.  I tested CI and CIC, and verified that workers are
> started or not depending on dsm_create() failure, using the attached
> fault injector patch.  (Fuzzing the regression tests repeatedly with
> that applied also revealed similar problems elsewhere in the tree,
> which I'll write about in a new thread.)

Can we reuse _bt_end_parallel() this time around? It would be easy to
add a "bool wait" argument to _bt_end_parallel(). All existing callers
would pass true, but when not using parallelism due to the new DSM
segment check we'd pass false.

-- 
Peter Geoghegan



Re: pgsql: Handle lack of DSM slots in parallel btree build.

From
Peter Geoghegan
Date:
On Tue, Feb 4, 2020 at 9:42 AM Peter Geoghegan <pg@bowt.ie> wrote:
> Can we reuse _bt_end_parallel() this time around? It would be easy to
> add a "bool wait" argument to _bt_end_parallel(). All existing callers
> would pass true, but when not using parallelism due to the new DSM
> segment check we'd pass false.

Hmm. I see why you didn't do it that way -- we don't quite have the
variables set up in the way that _bt_end_parallel() expects them.

This looks good, then.

-- 
Peter Geoghegan



Re: pgsql: Handle lack of DSM slots in parallel btree build.

From
Thomas Munro
Date:
On Wed, Feb 5, 2020 at 6:47 AM Peter Geoghegan <pg@bowt.ie> wrote:
> On Tue, Feb 4, 2020 at 9:42 AM Peter Geoghegan <pg@bowt.ie> wrote:
> > Can we reuse _bt_end_parallel() this time around? It would be easy to
> > add a "bool wait" argument to _bt_end_parallel(). All existing callers
> > would pass true, but when not using parallelism due to the new DSM
> > segment check we'd pass false.
>
> Hmm. I see why you didn't do it that way -- we don't quite have the
> variables set up in the way that _bt_end_parallel() expects them.
>
> This looks good, then.

Thanks.  Pushed.



Re: pgsql: Handle lack of DSM slots in parallel btree build.

From
Peter Geoghegan
Date:
On Tue, Feb 4, 2020 at 3:38 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> Thanks.  Pushed.

Thanks.

-- 
Peter Geoghegan