Thread: Why does create_gather_merge_plan need make_sort?
While looking at another issue I noticed that create_gather_merge_plan calls make_sort if the subplan isn't sufficiently sorted. In all of the cases I've seen where a gather merge path (not plan) is created the input path is expected to be properly sorted, so I was wondering if anyone happened to know what case is being handled by the make_sort call. Removing it doesn't seem to break any tests. Thanks, James
On 11/20/20 11:24 PM, James Coleman wrote: > While looking at another issue I noticed that create_gather_merge_plan > calls make_sort if the subplan isn't sufficiently sorted. In all of > the cases I've seen where a gather merge path (not plan) is created > the input path is expected to be properly sorted, so I was wondering > if anyone happened to know what case is being handled by the make_sort > call. Removing it doesn't seem to break any tests. > Yeah, I think you're right this is dead code, essentially. We're only ever calling create_gather_merge_path() with pathkeys matching the subpath. And it's like that on REL_12_STABLE too, i.e. before the incremental sort was introduced. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Tomas Vondra <tomas.vondra@enterprisedb.com> writes: > On 11/20/20 11:24 PM, James Coleman wrote: >> While looking at another issue I noticed that create_gather_merge_plan >> calls make_sort if the subplan isn't sufficiently sorted. In all of >> the cases I've seen where a gather merge path (not plan) is created >> the input path is expected to be properly sorted, so I was wondering >> if anyone happened to know what case is being handled by the make_sort >> call. Removing it doesn't seem to break any tests. > Yeah, I think you're right this is dead code, essentially. We're only > ever calling create_gather_merge_path() with pathkeys matching the > subpath. And it's like that on REL_12_STABLE too, i.e. before the > incremental sort was introduced. It's probably there by analogy to the other callers of prepare_sort_from_pathkeys, which all do at least a conditional make_sort(). I'd be inclined to leave it there; it's cheap insurance against somebody weakening the existing behavior. regards, tom lane
On 11/22/20 10:31 PM, Tom Lane wrote: > Tomas Vondra <tomas.vondra@enterprisedb.com> writes: >> On 11/20/20 11:24 PM, James Coleman wrote: >>> While looking at another issue I noticed that create_gather_merge_plan >>> calls make_sort if the subplan isn't sufficiently sorted. In all of >>> the cases I've seen where a gather merge path (not plan) is created >>> the input path is expected to be properly sorted, so I was wondering >>> if anyone happened to know what case is being handled by the make_sort >>> call. Removing it doesn't seem to break any tests. > >> Yeah, I think you're right this is dead code, essentially. We're only >> ever calling create_gather_merge_path() with pathkeys matching the >> subpath. And it's like that on REL_12_STABLE too, i.e. before the >> incremental sort was introduced. > > It's probably there by analogy to the other callers of > prepare_sort_from_pathkeys, which all do at least a conditional > make_sort(). I'd be inclined to leave it there; it's cheap insurance > against somebody weakening the existing behavior. > But how do we know it's safe to actually do the sort there, e.g. in light of the volatility/parallel-safety issues discussed in other threads? I agree the check may be useful, but maybe we should just do elog(ERROR) instead. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Sun, Nov 22, 2020 at 5:07 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote: > > On 11/22/20 10:31 PM, Tom Lane wrote: > > Tomas Vondra <tomas.vondra@enterprisedb.com> writes: > >> On 11/20/20 11:24 PM, James Coleman wrote: > >>> While looking at another issue I noticed that create_gather_merge_plan > >>> calls make_sort if the subplan isn't sufficiently sorted. In all of > >>> the cases I've seen where a gather merge path (not plan) is created > >>> the input path is expected to be properly sorted, so I was wondering > >>> if anyone happened to know what case is being handled by the make_sort > >>> call. Removing it doesn't seem to break any tests. > > > >> Yeah, I think you're right this is dead code, essentially. We're only > >> ever calling create_gather_merge_path() with pathkeys matching the > >> subpath. And it's like that on REL_12_STABLE too, i.e. before the > >> incremental sort was introduced. > > > > It's probably there by analogy to the other callers of > > prepare_sort_from_pathkeys, which all do at least a conditional > > make_sort(). I'd be inclined to leave it there; it's cheap insurance > > against somebody weakening the existing behavior. > > > > But how do we know it's safe to actually do the sort there, e.g. in > light of the volatility/parallel-safety issues discussed in other threads? > > I agree the check may be useful, but maybe we should just do elog(ERROR) > instead. That was my thought. I'm not a big fan of maintaining a "this might be useful" path particularly when there isn't any case in the code or tests at all that exercises it. In other words, not only is it not useful [yet], but also we don't actually have any signal to know that it works (or keeps working) -- whether through tests or production use. So I'm +1 on turning it into an ERROR log instead, since it seems to me that encountering this case would almost certainly represent a bug in path generation. James
On Mon, Nov 23, 2020 at 8:19 AM James Coleman <jtc331@gmail.com> wrote: > > On Sun, Nov 22, 2020 at 5:07 PM Tomas Vondra > <tomas.vondra@enterprisedb.com> wrote: > > > > On 11/22/20 10:31 PM, Tom Lane wrote: > > > Tomas Vondra <tomas.vondra@enterprisedb.com> writes: > > >> On 11/20/20 11:24 PM, James Coleman wrote: > > >>> While looking at another issue I noticed that create_gather_merge_plan > > >>> calls make_sort if the subplan isn't sufficiently sorted. In all of > > >>> the cases I've seen where a gather merge path (not plan) is created > > >>> the input path is expected to be properly sorted, so I was wondering > > >>> if anyone happened to know what case is being handled by the make_sort > > >>> call. Removing it doesn't seem to break any tests. > > > > > >> Yeah, I think you're right this is dead code, essentially. We're only > > >> ever calling create_gather_merge_path() with pathkeys matching the > > >> subpath. And it's like that on REL_12_STABLE too, i.e. before the > > >> incremental sort was introduced. > > > > > > It's probably there by analogy to the other callers of > > > prepare_sort_from_pathkeys, which all do at least a conditional > > > make_sort(). I'd be inclined to leave it there; it's cheap insurance > > > against somebody weakening the existing behavior. > > > > > > > But how do we know it's safe to actually do the sort there, e.g. in > > light of the volatility/parallel-safety issues discussed in other threads? > > > > I agree the check may be useful, but maybe we should just do elog(ERROR) > > instead. > > That was my thought. I'm not a big fan of maintaining a "this might be > useful" path particularly when there isn't any case in the code or > tests at all that exercises it. In other words, not only is it not > useful [yet], but also we don't actually have any signal to know that > it works (or keeps working) -- whether through tests or production > use. > > So I'm +1 on turning it into an ERROR log instead, since it seems to > me that encountering this case would almost certainly represent a bug > in path generation. Here's a patch to do that. James
Attachment
On 11/29/20 3:44 PM, James Coleman wrote: > On Mon, Nov 23, 2020 at 8:19 AM James Coleman <jtc331@gmail.com> wrote: >> >> .. >> >> So I'm +1 on turning it into an ERROR log instead, since it seems to >> me that encountering this case would almost certainly represent a bug >> in path generation. > > Here's a patch to do that. > Thanks. Isn't the comment incomplete? Should it mention just parallel safety or also volatility? regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Sun, Nov 29, 2020 at 4:10 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote: > > > > On 11/29/20 3:44 PM, James Coleman wrote: > > On Mon, Nov 23, 2020 at 8:19 AM James Coleman <jtc331@gmail.com> wrote: > >> > >> .. > >> > >> So I'm +1 on turning it into an ERROR log instead, since it seems to > >> me that encountering this case would almost certainly represent a bug > >> in path generation. > > > > Here's a patch to do that. > > > > Thanks. Isn't the comment incomplete? Should it mention just parallel > safety or also volatility? Volatility makes it parallel unsafe, and I'm not sure I want to get into listing every possible issue here, but in the interest of making it more obviously representative of the kinds of issues we might encounter, I've tweaked it to mention volatility also, as well as refer to the issues as "examples" of such concerns. James
Attachment
For the record, this got committed as 6bc27698324 in December, along with the other incremental sort fixes and improvements. I forgot to mark it as committed in the app, so I've done that now. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company