On Sat, Mar 28, 2020 at 12:34 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> On Fri, Mar 27, 2020 at 11:50:30AM +0530, Amit Kapila wrote:
> > > > The crash scenario I'm trying to avoid would be like statement_timeout or other
> > > > asynchronous event occurring between two non-atomic operations.
> > > >
> > > +if (errinfo->phase==VACUUM_ERRCB_PHASE_VACUUM_INDEX && errinfo->indname==NULL)
> > > +{
> > > +kill(getpid(), SIGINT);
> > > +pg_sleep(1); // that's needed since signals are delivered asynchronously
> > > +}
> > > I'm not sure if those are possible outside of "induced" errors. Maybe the
> > > function is essentially atomic due to no CHECK_FOR_INTERRUPTS or similar?
> >
> > Yes, this is exactly the point. I think unless you have
> > CHECK_FOR_INTERRUPTS in that function, the problems you are
> > thinking of won't happen.
>
> Hm, but I caused a crash *without* adding CHECK_FOR_INTERRUPTS, just
> kill+sleep. The kill() could come from running pg_cancel_backend(). And the
> sleep() just encourages a context switch, which can happen at any time.
>
pg_sleep() internally calls CHECK_FOR_INTERRUPTS(), so it would have
processed the cancel signal at that point, whether it was sent by your
kill() or via pg_cancel_backend(). Can you retry your scenario after
temporarily removing CHECK_FOR_INTERRUPTS() from pg_sleep() or, maybe
better, by using an OS sleep call instead?
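
To illustrate what I mean, here is a small standalone sketch. It is not
PostgreSQL code: interrupt_pending, check_for_interrupts() and
cancel_taken_during_sleep() are hypothetical stand-ins I made up for the
pending-interrupt flag, CHECK_FOR_INTERRUPTS() and the two flavours of
sleep. The assumption it models is that the signal handler only sets a
flag, and the cancel is acted upon only where the code explicitly checks
that flag.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stand-in for the backend's "interrupt pending" flag. */
static volatile sig_atomic_t interrupt_pending = 0;

/* The handler does nothing except record that a cancel was requested. */
static void
handle_sigint(int signo)
{
    (void) signo;
    interrupt_pending = 1;
}

/*
 * Hypothetical stand-in for CHECK_FOR_INTERRUPTS(): returns true if a
 * pending cancel would be acted upon at this point, and clears the flag.
 */
static bool
check_for_interrupts(void)
{
    if (!interrupt_pending)
        return false;
    interrupt_pending = 0;
    return true;
}

static void
install_handler(void)
{
    struct sigaction sa;
    int         rc;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handle_sigint;
    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIGINT, &sa, NULL);
    assert(rc == 0);
    (void) rc;
}

/*
 * Send ourselves SIGINT and then sleep.  If sleep_checks_interrupts is
 * true, the sleep behaves like one that checks for interrupts; otherwise
 * it behaves like a bare OS sleep.  Returns true if the cancel was acted
 * upon inside the sleep.
 */
static bool
cancel_taken_during_sleep(bool sleep_checks_interrupts)
{
    struct timespec ts = {0, 1000000};  /* 1 ms */
    bool        taken = false;

    install_handler();
    interrupt_pending = 0;

    kill(getpid(), SIGINT);     /* handler runs and sets the flag */
    nanosleep(&ts, NULL);       /* may return early; nothing is acted upon */

    if (sleep_checks_interrupts)
        taken = check_for_interrupts();

    return taken;
}
```

With cancel_taken_during_sleep(true) the cancel is taken inside the
sleep, which is what I believe happened in your test. With
cancel_taken_during_sleep(false) the flag simply stays set until some
later check, so the sleep itself cannot abort the function midway.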
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com