Re: pg_restore crash when there is a failure before all child process is created - Mailing list pgsql-hackers

From vignesh C
Subject Re: pg_restore crash when there is a failure before all child process is created
Date
Msg-id CALDaNm3-EDA2QzesP2Ltiu8=WVRf5hjoNMBAJZttywoWv-Aw5w@mail.gmail.com
In response to Re: pg_restore crash when there is a failure before all child process is created  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Fri, Jan 31, 2020 at 1:09 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> vignesh C <vignesh21@gmail.com> writes:
> > On Wed, Jan 29, 2020 at 6:54 PM Ahsan Hadi <ahsan.hadi@gmail.com> wrote:
> >> Can you share a test case or steps that you are using to reproduce
> >> this issue? Are you reproducing this using a debugger?
>
> > I could reproduce it with the following steps:
> > Set up a cluster.
> > Create a few tables.
> > Take a dump in directory format using pg_dump.
> > Restore the dump generated above using pg_restore with a very high
> > value for the --jobs option, around 600.
>
> I agree this is quite broken.  Another way to observe the crash is
> to make the fork() call randomly fail, as per booby-trap-fork.patch
> below (not intended for commit, obviously).
>
> I don't especially like the proposed patch, though, as it introduces
> a great deal of confusion into what ParallelState.numWorkers means.
> I think it's better to leave that as being the allocated array size,
> and instead clean up all the fuzzy thinking about whether workers
> are actually running or not.  Like 0001-fix-worker-status.patch below.
>
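
The idea described above (keep the worker count as the allocated array size, and track per slot whether a worker was actually started) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the contents of 0001-fix-worker-status.patch: the names WorkerSlot, SlotStatus, fake_spawn, start_workers and shutdown_workers are hypothetical, and fake_spawn stands in for a fork() call that fails partway through startup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-slot status; the names are illustrative only. */
typedef enum
{
    SLOT_NOT_STARTED,   /* slot allocated, no worker was ever created */
    SLOT_RUNNING,       /* worker was created successfully */
    SLOT_TERMINATED     /* worker was created and has been shut down */
} SlotStatus;

typedef struct
{
    SlotStatus status;
    int        pid;     /* meaningful only once status != SLOT_NOT_STARTED */
} WorkerSlot;

/* Stand-in for fork(): succeeds for the first fail_after calls, then fails. */
static int
fake_spawn(int index, int fail_after)
{
    return (index < fail_after) ? 1000 + index : -1;
}

/*
 * Initialize every slot first, then start workers one by one.
 * Returns the number of workers actually started; num_workers itself
 * keeps meaning "size of the array".
 */
static int
start_workers(WorkerSlot *slots, int num_workers, int fail_after)
{
    int i;

    assert(slots != NULL && num_workers >= 0);

    for (i = 0; i < num_workers; i++)
    {
        slots[i].status = SLOT_NOT_STARTED;
        slots[i].pid = -1;
    }

    for (i = 0; i < num_workers; i++)
    {
        int pid = fake_spawn(i, fail_after);

        if (pid < 0)
            return i;   /* later slots stay SLOT_NOT_STARTED */

        slots[i].pid = pid;
        slots[i].status = SLOT_RUNNING;
    }
    return num_workers;
}

/*
 * Cleanup walks the whole array but only touches slots whose worker
 * was really started. Returns the number of workers shut down.
 */
static int
shutdown_workers(WorkerSlot *slots, int num_workers)
{
    int i;
    int stopped = 0;

    for (i = 0; i < num_workers; i++)
    {
        if (slots[i].status != SLOT_RUNNING)
            continue;   /* never started or already terminated: skip */
        slots[i].status = SLOT_TERMINATED;
        stopped++;
    }
    return stopped;
}
```

With this layout, a failure after three of eight workers leaves slots 3 through 7 marked as not started, so the cleanup path has an explicit way to skip them instead of treating every array entry as a live worker.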

The patch looks fine to me, and the test case above no longer fails with it applied.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com


