On Mon, Aug 17, 2015 at 6:40 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> Here is one other thing I could learn from TPC-DS benchmark.
>
> The attached query is Q4 of TPC-DS, and its result was towards SF=100.
> It took long time to compete (about 30min), please see the attached
> EXPLAIN ANALYZE output.
>
Look at this:
-> CTE Scan on year_total t_s_firstyear (cost=0.00..13120715.27
rows=3976 width=52) (actual time=0.020..5425.980 rows=1816438 loops=1)
Filter: ((year_total > '0'::numeric) AND
(sale_type = 's'::text) AND (dyear = 2001))
Rows Removed by Filter: 19879897
-> CTE Scan on year_total t_s_secyear (cost=0.00..11927922.98
rows=11928 width=164) (actual time=0.007..45.249 rows=46636 loops=1)
Filter: ((sale_type = 's'::text) AND (dyear = 2002))
Rows Removed by Filter: 185596
CTE expansion shall help here as we can push the filer down. I did a
quick patch to demonstrate the idea, following Tom's proposal
(38448.1430519406@sss.pgh.pa.us). I see obvious performance boost:
Turn off NLJ:
original: Planning time: 4.391 ms
Execution time: 77113.721 ms
patched: Planning time: 8.429 ms
Execution time: 18572.663 ms
+ work_mem to 1G
original: Planning time: 4.487 ms
Execution time: 29249.466 ms
patched: Planning time: 11.148 ms
Execution time: 7309.586 ms
Attached please find the WIP patch and also the ANALYZE results.
Notes: the patch may not directly apply to head as some network issue
here so my Linux box can't talk to git server.
Regards,
Qingqing