Re: [HACKERS] why not parallel seq scan for slow functions - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [HACKERS] why not parallel seq scan for slow functions
Date
Msg-id CAA4eK1J+S4W+W9oaToNHA8x8xh_Kv9wQgGbPfjo=-HtnsPQO9Q@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] why not parallel seq scan for slow functions  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] why not parallel seq scan for slow functions  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Mon, Nov 6, 2017 at 3:51 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sun, Nov 5, 2017 at 12:57 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> Thanks for the confirmation.  Find rebased patch attached.
>
> This looks like it's on the right track to me.  I hope Tom will look
> into it, but if he doesn't I may try to get it committed myself.
>
> -    if (rel->reloptkind == RELOPT_BASEREL)
> -        generate_gather_paths(root, rel);
> +    if (rel->reloptkind == RELOPT_BASEREL &&
> +        root->simple_rel_array_size > 2 &&
> +        !root->append_rel_list)
>
> This test doesn't look correct to me.  Actually, it doesn't look
> anywhere close to correct to me.  So, one of us is very confused...
> not sure whether it's you or me.
>

It is quite possible that I haven't got it right, but it shouldn't be
completely bogus as it stands the regression tests and some manual
verification.  Can you explain what is your concern about this test?

>      simple_gather_path = (Path *)
>          create_gather_path(root, rel, cheapest_partial_path, rel->reltarget,
>                             NULL, NULL);
> +
> +    /* Add projection step if needed */
> +    if (target && simple_gather_path->pathtarget != target)
> +        simple_gather_path = apply_projection_to_path(root, rel,
> simple_gather_path, target);
>
> Instead of using apply_projection_to_path, why not pass the correct
> reltarget to create_gather_path?
>

We need to push it to gather's subpath as is done in
apply_projection_to_path and then we have to cost it accordingly.  I
think if we don't use apply_projection_to_path  then we might end up
with much of the code similar to it in generate_gather_paths.  In
fact, I have tried something similar to what you are suggesting in the
first version of patch [1] and it didn't turn out to be clean.  Also,
I think we already do something similar in create_ordered_paths.


> +        /* Set or update cheapest_total_path and related fields */
> +        set_cheapest(current_rel);
>
> I wonder if it's really OK to call set_cheapest() a second time for
> the same relation...
>

I think if we want we can avoid it by checking whether we have
generated any gather path for the relation (basically, check if it has
partial path list).  Another idea could be that we consider the
generation of gather/gathermerge for top-level scan/join relation as a
separate step and generate a new kind of upper rel for it which will
be a mostly dummy but will have paths for gather/gathermerge.

[1] - https://www.postgresql.org/message-id/CAA4eK1JUvL9WS9z%3D5hjSuSMNCo3TdBxFa0pA%3DE__E%3Dp6iUffUQ%40mail.gmail.com

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pgsql-hackers by date:

Previous
From: Ashutosh Bapat
Date:
Subject: Re: [HACKERS] [PATCH] Overestimated filter cost and its mitigation
Next
From: Alexander Korotkov
Date:
Subject: Re: [HACKERS] Display number of heap accesses for index scans