Re: parallel.c is not marked as test covered - Mailing list pgsql-hackers
| From | David Rowley |
|---|---|
| Subject | Re: parallel.c is not marked as test covered |
| Date | |
| Msg-id | CAKJS1f9N7kKrCz4OoVB0GRxn8bAOdEBv9OXOOzUo9aeSRkvHHA@mail.gmail.com Whole thread Raw |
| In response to | Re: parallel.c is not marked as test covered (Robert Haas <robertmhaas@gmail.com>) |
| List | pgsql-hackers |
On 12 May 2016 at 07:04, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, May 11, 2016 at 1:57 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> I don't immediately understand what's going wrong here. It looks to
>> me like make_group_input_target() already called, and that worked OK,
>> but now make_partialgroup_input_target() is failing using more-or-less
>> the same logic. Presumably that's because make_group_input_target()
>> was called on final_target as returned by create_pathtarget(root,
>> tlist), but make_partialgroup_input_target() is being called on
>> grouping_target, which I'm guessing came from
>> make_window_input_target, which somehow lacks sortgroupref labeling.
>> But I don't immediately see how that would happen, so there's
>> obviously something I'm missing here.
>
> So, it turns out you can reproduce this bug pretty easily without
> force_parallel_mode, like this:
>
> alter table int4_tbl set (parallel_degree = 4);
> SELECT SUM(COUNT(f1)) OVER () FROM int4_tbl WHERE f1=42;
>
> Or you can just a query that involves a window function on a table
> large enough for parallelism to be considered:
>
> SELECT SUM(COUNT(aid)) OVER () FROM pgbench_accounts;
>
> The crash goes way if the target list involves at least one plain
> column that uses no aggregate or window function, because then the
> PathTarget list has a sortgrouprefs array. Trivial fix patch
> attached, although some more review from you (Tom Lane, PathTarget
> inventor and planner whiz) and David Rowley (author of this function)
> would be appreciated in case there are deeper issues here.
The problem is make_group_input_target() only calls add_column_to_pathtarget() (which allocates this array) when there's a GROUP BY, otherwise it just appends to the non_group_col list. Since your query has no GROUP BY it means that add_column_to_pathtarget() is never called with a non-zero sortgroupref.
It looks like Tom has intended that PathTarget->sortgrouprefs can be NULL going by both the comment /* corresponding sort/group refnos, or 0 */, and the coding inside add_column_to_pathtarget(), which does not allocate the array if given a 0 sortgroupref.
It looks like make_sort_input_target(), make_window_input_target() and make_group_input_target() all get away without this check because they're all using final_target, which was built by make_pathtarget_from_tlist() which *always* allocates the sortgrouprefs array, even if it's left filled with zeros.
It might be better if this was all consistent. Perhaps it would be worth modifying make_pathtarget_from_tlist() to only allocate the sortgrouprefs array if there's any non-zero tle->ressortgroupref, then modify the other make_*_input_target() functions to handle a NULL array, similar to the fix that's in your patch. This saves an allocation which is likely much more expensive than the NULL check later. Alternatively add_column_to_pathtarget() could be modified to allocate the array even if sortgroupref is zero.
I think consistency is good here, as if this had been consistent this would not be a bug.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
> On Wed, May 11, 2016 at 1:57 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> I don't immediately understand what's going wrong here. It looks to
>> me like make_group_input_target() already called, and that worked OK,
>> but now make_partialgroup_input_target() is failing using more-or-less
>> the same logic. Presumably that's because make_group_input_target()
>> was called on final_target as returned by create_pathtarget(root,
>> tlist), but make_partialgroup_input_target() is being called on
>> grouping_target, which I'm guessing came from
>> make_window_input_target, which somehow lacks sortgroupref labeling.
>> But I don't immediately see how that would happen, so there's
>> obviously something I'm missing here.
>
> So, it turns out you can reproduce this bug pretty easily without
> force_parallel_mode, like this:
>
> alter table int4_tbl set (parallel_degree = 4);
> SELECT SUM(COUNT(f1)) OVER () FROM int4_tbl WHERE f1=42;
>
> Or you can just a query that involves a window function on a table
> large enough for parallelism to be considered:
>
> SELECT SUM(COUNT(aid)) OVER () FROM pgbench_accounts;
>
> The crash goes way if the target list involves at least one plain
> column that uses no aggregate or window function, because then the
> PathTarget list has a sortgrouprefs array. Trivial fix patch
> attached, although some more review from you (Tom Lane, PathTarget
> inventor and planner whiz) and David Rowley (author of this function)
> would be appreciated in case there are deeper issues here.
The problem is make_group_input_target() only calls add_column_to_pathtarget() (which allocates this array) when there's a GROUP BY, otherwise it just appends to the non_group_col list. Since your query has no GROUP BY it means that add_column_to_pathtarget() is never called with a non-zero sortgroupref.
It looks like Tom has intended that PathTarget->sortgrouprefs can be NULL going by both the comment /* corresponding sort/group refnos, or 0 */, and the coding inside add_column_to_pathtarget(), which does not allocate the array if given a 0 sortgroupref.
It looks like make_sort_input_target(), make_window_input_target() and make_group_input_target() all get away without this check because they're all using final_target, which was built by make_pathtarget_from_tlist() which *always* allocates the sortgrouprefs array, even if it's left filled with zeros.
It might be better if this was all consistent. Perhaps it would be worth modifying make_pathtarget_from_tlist() to only allocate the sortgrouprefs array if there's any non-zero tle->ressortgroupref, then modify the other make_*_input_target() functions to handle a NULL array, similar to the fix that's in your patch. This saves an allocation which is likely much more expensive than the NULL check later. Alternatively add_column_to_pathtarget() could be modified to allocate the array even if sortgroupref is zero.
I think consistency is good here, as if this had been consistent this would not be a bug.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
pgsql-hackers by date: