Re: [HACKERS] why not parallel seq scan for slow functions - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [HACKERS] why not parallel seq scan for slow functions
Msg-id CAA4eK1Jr1CAv4A-rH_B-9GN-VUeOsFAkPRzU=UXxnSghjZug2Q@mail.gmail.com
In response to Re: [HACKERS] why not parallel seq scan for slow functions  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] why not parallel seq scan for slow functions  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Tue, Mar 20, 2018 at 1:23 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sat, Mar 17, 2018 at 1:16 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> Test-1
>> ----------
>> DO $$
>> DECLARE count integer;
>> BEGIN
>> For count In 1..1000000 Loop
>> Execute 'explain Select ten from tenk1';
>> END LOOP;
>> END;
>> $$;
>>
>> In the above block, I am explaining the simple statement which will
>> have just one path, so there will be one additional path projection
>> and removal cycle for this statement.  I have just executed the above
>> block in psql by having \timing option 'on' and the average timing for
>> ten runs on HEAD is  21292.388 ms, with patches (0001.* ~ 0003) is
>> 22405.2466 ms and with patches (0001.* ~ 0005.*) is 22537.1362.  These
>> results indicate that there is approximately 5~6% of the increase in
>> planning time.
>
> Ugh.  I'm able to reproduce this, more or less -- with master, this
> test took 42089.484 ms, 41935.849 ms, 42519.336 ms on my laptop, but
> with 0001-0003 applied, 43925.959 ms, 43619.004 ms, 43648.426 ms.
> However I have a feeling there's more going on here, because the
> following patch on top of 0001-0003 made the time go back down to
> 42353.548, 41797.757 ms, 41891.194 ms.
>
..
>
> It seems pretty obvious that creating an extra projection path that is
> just thrown away can't "really" be making this faster, so there's
> evidently some other effect here involving how the code is laid out,
> or CPU cache effects, or, uh, something.
>

Yeah, that kind of thing can sometimes change performance characteristics,
but I think what is going on here is that create_projection_plan is
causing the lower node to build a physical tlist, which takes some
additional time.  I tried the change below on top of the patch series,
and it brings the performance back for me.

@@ -1580,7 +1580,7 @@ create_projection_plan(PlannerInfo *root, ProjectionPath *best_path, int flags)
        List       *tlist;

        /* Since we intend to project, we don't need to constrain child tlist */
-       subplan = create_plan_recurse(root, best_path->subpath, 0);
+       subplan = create_plan_recurse(root, best_path->subpath, flags);
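
To make the suspected mechanism concrete, here is a toy model, not the
actual planner code.  It assumes that a scan node only considers building a
physical tlist when the caller passes neither CP_EXACT_TLIST nor
CP_SMALL_TLIST; the flag values, the toy_* functions and the counter are
all hypothetical stand-ins for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag bits modeled on createplan.c; the numeric values are assumptions. */
#define CP_EXACT_TLIST  0x0001  /* plan must return exactly the path's tlist */
#define CP_SMALL_TLIST  0x0002  /* prefer a narrow tlist */
#define CP_LABEL_TLIST  0x0004  /* tlist must carry sort/group labels */

/* Hypothetical counter standing in for the time spent building physical tlists. */
static int physical_tlists_built = 0;

/*
 * Simplified assumption: a physical tlist is only considered when the
 * caller demanded neither an exact nor a small tlist.
 */
static bool
toy_use_physical_tlist(int flags)
{
    assert((flags & ~(CP_EXACT_TLIST | CP_SMALL_TLIST | CP_LABEL_TLIST)) == 0);
    return (flags & (CP_EXACT_TLIST | CP_SMALL_TLIST)) == 0;
}

/* Toy scan-level plan creation: counts the extra work when it happens. */
static void
toy_create_scan_plan(int flags)
{
    if (toy_use_physical_tlist(flags))
        physical_tlists_built++;
}

/*
 * Toy projection-level plan creation.  With pass_flags_down = false the
 * child is called with 0, as before the change; with true the caller's
 * flags are handed through, as in the change above.
 */
static void
toy_create_projection_plan(int flags, bool pass_flags_down)
{
    toy_create_scan_plan(pass_flags_down ? flags : 0);
}
```

In this model, calling toy_create_projection_plan(CP_EXACT_TLIST, false)
increments the counter, while passing true leaves it unchanged, which is
the kind of difference that would show up as extra planning time in a
tight EXPLAIN loop.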

Another point I noticed in the
0001-Teach-create_projection_plan-to-omit-projection-wher patch:

-create_projection_plan(PlannerInfo *root, ProjectionPath *best_path)
+create_projection_plan(PlannerInfo *root, ProjectionPath *best_path, int flags)
{
..
+       else if ((flags & CP_LABEL_TLIST) != 0)
+       {
+               tlist = copyObject(subplan->targetlist);
+               apply_pathtarget_labeling_to_tlist(tlist, best_path->path.pathtarget);
+       }
+       else
+               return subplan;
..
}

Before returning subplan, don't we need to copy the cost estimates
from best_path, as is done a few lines later in the same function?
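
As a rough sketch of what that could look like, using simplified
stand-in structs rather than the real Plan and ProjectionPath types (the
ToyPath/ToyPlan types and toy_* functions are hypothetical, and the
single width field stands in for the path target's width):

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the cost fields of a path. */
typedef struct ToyPath
{
    double startup_cost;
    double total_cost;
    double rows;
    int    width;
} ToyPath;

/* Simplified stand-in for the cost fields of a plan node. */
typedef struct ToyPlan
{
    double startup_cost;
    double total_cost;
    double plan_rows;
    int    plan_width;
} ToyPlan;

/* Label the plan with the estimates carried by the path. */
static void
toy_label_plan_with_path_costs(ToyPlan *plan, const ToyPath *path)
{
    assert(plan != NULL && path != NULL);
    plan->startup_cost = path->startup_cost;
    plan->total_cost = path->total_cost;
    plan->plan_rows = path->rows;
    plan->plan_width = path->width;
}

/*
 * Toy version of the early-return branch: the subplan is returned as is,
 * but only after it has been relabeled with the projection path's costs.
 */
static ToyPlan *
toy_projection_early_return(ToyPlan *subplan, const ToyPath *best_path)
{
    toy_label_plan_with_path_costs(subplan, best_path);
    return subplan;
}
```

Without the relabeling step, the returned node would keep the costs of
the subpath rather than those of the projection path, which is the
concern raised here.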

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

