Thread: Why does create_gather_merge_plan need make_sort?

Why does create_gather_merge_plan need make_sort?

From
James Coleman
Date:
While looking at another issue I noticed that create_gather_merge_plan
calls make_sort if the subplan isn't sufficiently sorted. In all of
the cases I've seen where a gather merge path (not plan) is created
the input path is expected to be properly sorted, so I was wondering
if anyone happened to know what case is being handled by the make_sort
call. Removing it doesn't seem to break any tests.

Thanks,
James



Re: Why does create_gather_merge_plan need make_sort?

From
Tomas Vondra
Date:
On 11/20/20 11:24 PM, James Coleman wrote:
> While looking at another issue I noticed that create_gather_merge_plan
> calls make_sort if the subplan isn't sufficiently sorted. In all of
> the cases I've seen where a gather merge path (not plan) is created
> the input path is expected to be properly sorted, so I was wondering
> if anyone happened to know what case is being handled by the make_sort
> call. Removing it doesn't seem to break any tests.
> 

Yeah, I think you're right this is dead code, essentially. We're only
ever calling create_gather_merge_path() with pathkeys matching the
subpath. And it's like that on REL_12_STABLE too, i.e. before the
incremental sort was introduced.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Why does create_gather_merge_plan need make_sort?

From
Tom Lane
Date:
Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
> On 11/20/20 11:24 PM, James Coleman wrote:
>> While looking at another issue I noticed that create_gather_merge_plan
>> calls make_sort if the subplan isn't sufficiently sorted. In all of
>> the cases I've seen where a gather merge path (not plan) is created
>> the input path is expected to be properly sorted, so I was wondering
>> if anyone happened to know what case is being handled by the make_sort
>> call. Removing it doesn't seem to break any tests.

> Yeah, I think you're right this is dead code, essentially. We're only
> ever calling create_gather_merge_path() with pathkeys matching the
> subpath. And it's like that on REL_12_STABLE too, i.e. before the
> incremental sort was introduced.

It's probably there by analogy to the other callers of
prepare_sort_from_pathkeys, which all do at least a conditional
make_sort().  I'd be inclined to leave it there; it's cheap insurance
against somebody weakening the existing behavior.

            regards, tom lane



Re: Why does create_gather_merge_plan need make_sort?

From
Tomas Vondra
Date:
On 11/22/20 10:31 PM, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
>> On 11/20/20 11:24 PM, James Coleman wrote:
>>> While looking at another issue I noticed that create_gather_merge_plan
>>> calls make_sort if the subplan isn't sufficiently sorted. In all of
>>> the cases I've seen where a gather merge path (not plan) is created
>>> the input path is expected to be properly sorted, so I was wondering
>>> if anyone happened to know what case is being handled by the make_sort
>>> call. Removing it doesn't seem to break any tests.
> 
>> Yeah, I think you're right this is dead code, essentially. We're only
>> ever calling create_gather_merge_path() with pathkeys matching the
>> subpath. And it's like that on REL_12_STABLE too, i.e. before the
>> incremental sort was introduced.
> 
> It's probably there by analogy to the other callers of
> prepare_sort_from_pathkeys, which all do at least a conditional
> make_sort().  I'd be inclined to leave it there; it's cheap insurance
> against somebody weakening the existing behavior.
> 

But how do we know it's safe to actually do the sort there, e.g. in
light of the volatility/parallel-safety issues discussed in other threads?

I agree the check may be useful, but maybe we should just do elog(ERROR)
instead.



regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Why does create_gather_merge_plan need make_sort?

From
James Coleman
Date:
On Sun, Nov 22, 2020 at 5:07 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> On 11/22/20 10:31 PM, Tom Lane wrote:
> > Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
> >> On 11/20/20 11:24 PM, James Coleman wrote:
> >>> While looking at another issue I noticed that create_gather_merge_plan
> >>> calls make_sort if the subplan isn't sufficiently sorted. In all of
> >>> the cases I've seen where a gather merge path (not plan) is created
> >>> the input path is expected to be properly sorted, so I was wondering
> >>> if anyone happened to know what case is being handled by the make_sort
> >>> call. Removing it doesn't seem to break any tests.
> >
> >> Yeah, I think you're right this is dead code, essentially. We're only
> >> ever calling create_gather_merge_path() with pathkeys matching the
> >> subpath. And it's like that on REL_12_STABLE too, i.e. before the
> >> incremental sort was introduced.
> >
> > It's probably there by analogy to the other callers of
> > prepare_sort_from_pathkeys, which all do at least a conditional
> > make_sort().  I'd be inclined to leave it there; it's cheap insurance
> > against somebody weakening the existing behavior.
> >
>
> But how do we know it's safe to actually do the sort there, e.g. in
> light of the volatility/parallel-safety issues discussed in other threads?
>
> I agree the check may be useful, but maybe we should just do elog(ERROR)
> instead.

That was my thought. I'm not a big fan of maintaining a "this might be
useful" path particularly when there isn't any case in the code or
tests at all that exercises it. In other words, not only is it not
useful [yet], but also we don't actually have any signal to know that
it works (or keeps working) -- whether through tests or production
use.

So I'm +1 on turning it into an ERROR log instead, since it seems to
me that encountering this case would almost certainly represent a bug
in path generation.

James



Re: Why does create_gather_merge_plan need make_sort?

From
James Coleman
Date:
On Mon, Nov 23, 2020 at 8:19 AM James Coleman <jtc331@gmail.com> wrote:
>
> On Sun, Nov 22, 2020 at 5:07 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
> >
> > On 11/22/20 10:31 PM, Tom Lane wrote:
> > > Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
> > >> On 11/20/20 11:24 PM, James Coleman wrote:
> > >>> While looking at another issue I noticed that create_gather_merge_plan
> > >>> calls make_sort if the subplan isn't sufficiently sorted. In all of
> > >>> the cases I've seen where a gather merge path (not plan) is created
> > >>> the input path is expected to be properly sorted, so I was wondering
> > >>> if anyone happened to know what case is being handled by the make_sort
> > >>> call. Removing it doesn't seem to break any tests.
> > >
> > >> Yeah, I think you're right this is dead code, essentially. We're only
> > >> ever calling create_gather_merge_path() with pathkeys matching the
> > >> subpath. And it's like that on REL_12_STABLE too, i.e. before the
> > >> incremental sort was introduced.
> > >
> > > It's probably there by analogy to the other callers of
> > > prepare_sort_from_pathkeys, which all do at least a conditional
> > > make_sort().  I'd be inclined to leave it there; it's cheap insurance
> > > against somebody weakening the existing behavior.
> > >
> >
> > But how do we know it's safe to actually do the sort there, e.g. in
> > light of the volatility/parallel-safety issues discussed in other threads?
> >
> > I agree the check may be useful, but maybe we should just do elog(ERROR)
> > instead.
>
> That was my thought. I'm not a big fan of maintaining a "this might be
> useful" path particularly when there isn't any case in the code or
> tests at all that exercises it. In other words, not only is it not
> useful [yet], but also we don't actually have any signal to know that
> it works (or keeps working) -- whether through tests or production
> use.
>
> So I'm +1 on turning it into an ERROR log instead, since it seems to
> me that encountering this case would almost certainly represent a bug
> in path generation.

Here's a patch to do that.

James

Attachment

Re: Why does create_gather_merge_plan need make_sort?

From
Tomas Vondra
Date:

On 11/29/20 3:44 PM, James Coleman wrote:
> On Mon, Nov 23, 2020 at 8:19 AM James Coleman <jtc331@gmail.com> wrote:
>>
>> ..
>>
>> So I'm +1 on turning it into an ERROR log instead, since it seems to
>> me that encountering this case would almost certainly represent a bug
>> in path generation.
> 
> Here's a patch to do that.
>

Thanks. Isn't the comment incomplete? Should it mention just parallel
safety or also volatility?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Why does create_gather_merge_plan need make_sort?

From
James Coleman
Date:
On Sun, Nov 29, 2020 at 4:10 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
>
>
> On 11/29/20 3:44 PM, James Coleman wrote:
> > On Mon, Nov 23, 2020 at 8:19 AM James Coleman <jtc331@gmail.com> wrote:
> >>
> >> ..
> >>
> >> So I'm +1 on turning it into an ERROR log instead, since it seems to
> >> me that encountering this case would almost certainly represent a bug
> >> in path generation.
> >
> > Here's a patch to do that.
> >
>
> Thanks. Isn't the comment incomplete? Should it mention just parallel
> safety or also volatility?

Volatility makes it parallel unsafe, and I'm not sure I want to get
into listing every possible issue here, but in the interest of making
it more obviously representative of the kinds of issues we might
encounter, I've tweaked it to mention volatility also, as well as
refer to the issues as "examples" of such concerns.

James

Attachment

Re: Why does create_gather_merge_plan need make_sort?

From
Tomas Vondra
Date:
For the record, this got committed as 6bc27698324 in December, along 
with the other incremental sort fixes and improvements. I forgot to mark 
it as committed in the app, so I've done that now.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company