Re: Our trial to TPC-DS but optimizer made unreasonable plan - Mailing list pgsql-hackers

From Kouhei Kaigai
Subject Re: Our trial to TPC-DS but optimizer made unreasonable plan
Date
Msg-id 9A28C8860F777E439AA12E8AEA7694F801134C49@BPXM15GP.gisp.nec.co.jp
Whole thread Raw
Responses Re: Our trial to TPC-DS but optimizer made unreasonable plan  (Qingqing Zhou <zhouqq.postgres@gmail.com>)
List pgsql-hackers
> On Mon, Aug 17, 2015 at 6:40 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> > Here is one other thing I could learn from TPC-DS benchmark.
> >
> > The attached query is Q4 of TPC-DS, and its result was towards SF=100.
> > It took long time to compete (about 30min), please see the attached
> > EXPLAIN ANALYZE output.
> >
> Look at this:
> 
>  ->  CTE Scan on year_total t_s_firstyear  (cost=0.00..13120715.27
> rows=3976 width=52) (actual time=0.020..5425.980 rows=1816438 loops=1)
>                            Filter: ((year_total > '0'::numeric) AND
> (sale_type = 's'::text) AND (dyear = 2001))
>                            Rows Removed by Filter: 19879897
>  ->  CTE Scan on year_total t_s_secyear  (cost=0.00..11927922.98
> rows=11928 width=164) (actual time=0.007..45.249 rows=46636 loops=1)
>                            Filter: ((sale_type = 's'::text) AND (dyear = 2002))
>                            Rows Removed by Filter: 185596
> 
> CTE expansion shall help here as we can push the filer down. I did a
> quick patch to demonstrate the idea, following Tom's proposal
> (38448.1430519406@sss.pgh.pa.us). I see obvious performance boost:
> 
> Turn off NLJ:
>  original: Planning time: 4.391 ms
>  Execution time: 77113.721 ms
>  patched: Planning time: 8.429 ms
>  Execution time: 18572.663 ms
> 
> + work_mem to 1G
>  original: Planning time: 4.487 ms
>  Execution time: 29249.466 ms
>  patched: Planning time: 11.148 ms
>  Execution time: 7309.586 ms
> 
> Attached please find the WIP patch and also the ANALYZE results.
> Notes: the patch may not directly apply to head as some network issue
> here so my Linux box can't talk to git server.
>
Thanks for your patch.
Let me test and report the result in my environment.

BTW, did you register the patch on the upcoming commit-fest?
I think it may be a helpful feature, if we can add alternative
subquery-path towards cte-scan on set_cte_pathlist() and choose
them according to the cost estimation.

Best regards,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Bug? ExecChooseHashTableSize() got assertion failed with crazy number of rows
Next
From: David Rowley
Date:
Subject: Re: Bug? ExecChooseHashTableSize() got assertion failed with crazy number of rows