Re: Parallel INSERT (INTO ... SELECT ...) - Mailing list pgsql-hackers

From Greg Nancarrow
Subject Re: Parallel INSERT (INTO ... SELECT ...)
Date
Msg-id CAJcOf-dfxut1wLjNmF56EUC5GjbgQELOseAgLRCxHzwM-ONJ3g@mail.gmail.com
Whole thread Raw
In response to Re: Parallel INSERT (INTO ... SELECT ...)  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Parallel INSERT (INTO ... SELECT ...)  (Amit Kapila <amit.kapila16@gmail.com>)
Re: Parallel INSERT (INTO ... SELECT ...)  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Mon, Jan 18, 2021 at 8:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > 1)
> >
> > >Here, it seems we are accessing the relation descriptor without any
> > >lock on the table which is dangerous considering the table can be
> > >modified in a parallel session. Is there a reason why you think this
> > >is safe? Did you check anywhere else such a coding pattern?
> >
> > Yes, there's a very good reason and I certainly have checked for the
> > same coding pattern elsewhere, and not just randomly decided that
> > locking can be ignored.
> > The table has ALREADY been locked (by the caller) during the
> > parse/analyze phase.
> >
>
> Fair enough. I suggest adding a comment saying the same so that the
> reader doesn't get confused about the same.
>

OK, I'll add a comment.

> > (This is not the case for a partition, in which case the patch code
> > uses AccessShareLock, as you will see).
>
> Okay, but I see you release the lock on partition rel immediately
> after checking parallel-safety. What if a user added some
> parallel-unsafe constraint (via Alter Table) after that check?
>
> >

I'm not sure. But there would be a similar concern for current
Parallel SELECT functionality, right?
My recollection is that ALTER TABLE obtains an exclusive lock on the
table which it retains until the end of the transaction, so that will
result in blocking at certain points, during parallel-checks and
execution, but there may still be a window.

> > 4)
> >
> > >domain_max_parallel_hazard_for_modify()
> > >{
> > >..
> > >+ if (isnull)
> > >+ {
> > >+ /*
> > >+ * This shouldn't ever happen, but if it does, log a WARNING
> > >+ * and return UNSAFE, rather than erroring out.
> > >+ */
> > >+ elog(WARNING, "null conbin for constraint %u", con->oid);
> > >+ context->max_hazard = PROPARALLEL_UNSAFE;
> > >+ break;
> > >+ }
> > >..
> > >}
> > >index_expr_max_parallel_hazard_for_modify()
> > >{
> > >..
> > >+ if (index_expr_item == NULL) /* shouldn't happen */
> > >+ {
> > >+ index_close(index_rel, lockmode);
> > >+ context->max_hazard = PROPARALLEL_UNSAFE;
> > >+ return context->max_hazard;
> > >+ }
> > >..
> > >}
> >
> > >It is not clear why the above two are shouldn't happen cases and if so
> > >why we want to treat them as unsafe. Ideally, there should be an
> > >Assert if these can't happen but it is difficult to decide without
> > >knowing why you have considered them unsafe?
> >
> > The checks being done here for "should never happen" cases are THE
> > SAME as other parts of the Postgres code.
> > For example, search Postgres code for "null conbin" and you'll find 6
> > other places in the Postgres code which actually ERROR out if conbin
> > (binary representation of the constraint) in a pg_constraint tuple is
> > found to be null.
> > The cases that you point out in the patch used to also error out in
> > the same way, but Vignesh suggested changing them to just return
> > parallel-unsafe instead of erroring-out, which I agree with.
> >
>
> You have not raised a WARNING for the second case.

The same checks in current Postgres code also don't raise a WARNING
for that case, so I'm just being consistent with existing Postgres
code (which itself isn't consistent for those two cases).

>But in the first
> place what is the reasoning for making this different from other parts
> of code? If we don't have a solid reason then I suggest keeping these
> checks and errors the same as in other parts of the code.
>

The checks are the same as done in existing Postgres source - but
instead of failing with an ERROR (i.e. whole backend dies), in the
middle of parallel-safety-checking, it has been changed to regard the
operation as parallel-unsafe, so that it will try to execute in
non-parallel mode, where it will most likely fail too when those
corrupted attributes are accessed - but it will fail in the way that
it currently does in Postgres, should that very rare condition ever
happen. This was suggested by Vignesh, and I agree with him. So in
effect, it's just allowing it to use the existing error paths in the
code, rather than introducing new ERROR points.

Regards,
Greg Nancarrow
Fujitsu Australia



pgsql-hackers by date:

Previous
From: 曾文旌
Date:
Subject: Re: Proposal: Global Index
Next
From: Daniel Gustafsson
Date:
Subject: Re: Allow matching whole DN from a client certificate