I've started scaling dbt3 up to the 10GB scale factor against CVS and
the fast COPY patch:
http://www.testing.osdl.org/projects/dbt3testing/results/dev4-010/53/
I'm sure there are some better database parameters I should use so
please let me know what to try. ;) What I've found interesting is the
difference in the time it takes for Q9 to run in the power test than the
throughput test:
Power Test 00:56:27
Throughput Stream 1 00:38:13
Throughput Stream 2 00:41:33
Throughput Stream 3 00:20:16
Throughput Stream 4 00:18:11
Diffing the query plans between the Power Test and the individual
streams in the Throughput Test, I only see a Materialize and Seq Scan on
the nation table versus just a Seq Scan on the nation table between two
of the plans. But this doesn't appear to account for the execution time
difference as the query with the same plan executes just as fast during
the Throughput Test. Here are the plans in full:
EXPLAIN ANALYZE in the Power Test for Q9:
EXPLAIN ANALYZE select nation, o_year, sum(amount) as sum_profit from ( select n_name as nation, extract(year from
o_orderdate)as o_year, l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount from part, supplier,
lineitem,partsupp, orders, nation where s_suppkey = l_suppkey and ps_suppkey = l_suppkey and ps_partkey = l_partkey and
p_partkey= l_partkey and o_orderkey = l_orderkey and s_nationkey = n_nationkey and p_name like '%light%' ) as profit
groupby nation, o_year order by nation, o_year desc;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------GroupAggregate
(cost=182648.08..182648.22 rows=5 width=49) (actual time=3375680.779..3387407.186 rows=175 loops=1) -> Sort
(cost=182648.08..182648.09rows=5 width=49) (actual time=3375639.390..3379116.547 rows=3254907 loops=1) Sort Key:
nation.n_name,date_part('year'::text, (orders.o_orderdate)::timestamp without time zone) -> Nested Loop
(cost=1.27..182648.02rows=5 width=49) (actual time=43.412..3276365.695 rows=3254907 loops=1) -> Nested
Loop (cost=1.27..182632.87 rows=5 width=49) (actual time=14.798..2510769.669 rows=3254907 loops=1)
JoinFilter: ("outer".s_nationkey = "inner".n_nationkey) -> Nested Loop (cost=0.00..182628.78
rows=5width=24) (actual time=14.685..2415773.175 rows=3254907 loops=1) -> Nested Loop
(cost=0.00..182613.65rows=5 width=28) (actual time=14.654..2377308.988 rows=3254907 loops=1)
-> Nested Loop (cost=0.00..105392.76 rows=12804 width=16) (actual time=0.071..10991.718 rows=433752 loops=1)
-> Seq Scan on part (cost=0.00..73877.55 rows=3200 width=4) (actual
time=0.023..3698.415rows=108438 loops=1) Filter: ((p_name)::text ~~
'%light%'::text) -> Index Scan using i_ps_partkey on partsupp (cost=0.00..8.00
rows=148width=12) (actual time=0.045..0.059 rows=4 loops=108438) Index Cond:
("outer".p_partkey= partsupp.ps_partkey) -> Index Scan using i_l_suppkey_partkey on
lineitem (cost=0.00..6.02 rows=1 width=24) (actual time=0.891..5.441 rows=8 loops=433752)
Index Cond: (("outer".ps_partkey = lineitem.l_partkey) AND ("outer".ps_suppkey = lineitem.l_suppkey))
-> Index Scan using pk_supplier on supplier (cost=0.00..3.01 rows=1 width=8) (actual time=0.008..0.009
rows=1loops=3254907) Index Cond: (supplier.s_suppkey = "outer".l_suppkey)
-> Materialize (cost=1.27..1.52 rows=25 width=33) (actual time=0.000..0.011 rows=25 loops=3254907)
-> Seq Scan on nation (cost=0.00..1.25 rows=25 width=33) (actual time=0.005..0.047 rows=25 loops=1)
-> Index Scan using pk_orders on orders (cost=0.00..3.01 rows=1 width=8) (actual time=0.229..0.230 rows=1
loops=3254907) Index Cond: (orders.o_orderkey = "outer".l_orderkey)Total runtime: 3387491.286 ms
(22 rows)
Throughput Stream 1 Q9:
EXPLAIN ANALYZE select nation, o_year, sum(amount) as sum_profit from ( select n_name as nation, extract(year from
o_orderdate)as o_year, l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount from part, supplier,
lineitem,partsupp, orders, nation where s_suppkey = l_suppkey and ps_suppkey = l_suppkey and ps_partkey = l_partkey and
p_partkey= l_partkey and o_orderkey = l_orderkey and s_nationkey = n_nationkey and p_name like '%ivory%' ) as profit
groupby nation, o_year order by nation, o_year desc;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------GroupAggregate
(cost=182648.08..182648.22 rows=5 width=49) (actual time=2280062.422..2293138.084 rows=175 loops=1) -> Sort
(cost=182648.08..182648.09rows=5 width=49) (actual time=2280017.683..2283991.830 rows=3401050 loops=1) Sort Key:
nation.n_name,date_part('year'::text, (orders.o_orderdate)::timestamp without time zone) -> Nested Loop
(cost=1.27..182648.02rows=5 width=49) (actual time=7.639..2171994.940 rows=3401050 loops=1) -> Nested
Loop (cost=1.27..182632.87 rows=5 width=49) (actual time=7.572..1383633.029 rows=3401050 loops=1)
JoinFilter: ("outer".s_nationkey = "inner".n_nationkey) -> Nested Loop (cost=0.00..182628.78
rows=5width=24) (actual time=7.450..1266801.542 rows=3401050 loops=1) -> Nested Loop
(cost=0.00..182613.65rows=5 width=28) (actual time=7.391..1219339.594 rows=3401050 loops=1)
-> Nested Loop (cost=0.00..105392.76 rows=12804 width=16) (actual time=7.262..18789.674 rows=451992 loops=1)
-> Seq Scan on part (cost=0.00..73877.55 rows=3200 width=4) (actual time=0.026..7964.747
rows=112998loops=1) Filter: ((p_name)::text ~~ '%ivory%'::text)
-> Index Scan using i_ps_partkey on partsupp (cost=0.00..8.00 rows=148 width=12) (actual
time=0.075..0.086rows=4 loops=112998) Index Cond: ("outer".p_partkey =
partsupp.ps_partkey) -> Index Scan using i_l_suppkey_partkey on lineitem
(cost=0.00..6.02rows=1 width=24) (actual time=0.564..2.641 rows=8 loops=451992)
IndexCond: (("outer".ps_partkey = lineitem.l_partkey) AND ("outer".ps_suppkey = lineitem.l_suppkey))
-> Index Scan using pk_supplier on supplier (cost=0.00..3.01 rows=1 width=8) (actual time=0.009..0.010 rows=1
loops=3401050) Index Cond: (supplier.s_suppkey = "outer".l_suppkey) ->
Materialize (cost=1.27..1.52 rows=25 width=33) (actual time=0.001..0.014 rows=25 loops=3401050)
-> Seq Scan on nation (cost=0.00..1.25 rows=25 width=33) (actual time=0.017..0.062 rows=25 loops=1) ->
IndexScan using pk_orders on orders (cost=0.00..3.01 rows=1 width=8) (actual time=0.225..0.226 rows=1 loops=3401050)
Index Cond: (orders.o_orderkey = "outer".l_orderkey)Total runtime: 2293355.531 ms
(22 rows)
Throughput Stream 2 Q9:
EXPLAIN ANALYZE select nation, o_year, sum(amount) as sum_profit from ( select n_name as nation, extract(year from
o_orderdate)as o_year, l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount from part, supplier,
lineitem,partsupp, orders, nation where s_suppkey = l_suppkey and ps_suppkey = l_suppkey and ps_partkey = l_partkey and
p_partkey= l_partkey and o_orderkey = l_orderkey and s_nationkey = n_nationkey and p_name like '%goldenrod%' ) as
profitgroup by nation, o_year order by nation, o_year desc;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------GroupAggregate
(cost=74055.04..74055.07 rows=1 width=49) (actual time=2476208.097..2493189.788 rows=175 loops=1) -> Sort
(cost=74055.04..74055.05rows=1 width=49) (actual time=2476149.653..2482032.976 rows=3538739 loops=1) Sort Key:
nation.n_name,date_part('year'::text, (orders.o_orderdate)::timestamp without time zone) -> Nested Loop
(cost=0.00..74055.03rows=1 width=49) (actual time=50.669..2362896.628 rows=3538739 loops=1) -> Nested
Loop (cost=0.00..74052.00 rows=1 width=49) (actual time=36.915..1110819.931 rows=3538739 loops=1)
JoinFilter: ("outer".s_nationkey = "inner".n_nationkey) -> Nested Loop (cost=0.00..74050.44 rows=1
width=24)(actual time=36.534..827220.019 rows=3538739 loops=1) -> Nested Loop
(cost=0.00..74047.41rows=1 width=28) (actual time=31.045..739895.872 rows=3538739 loops=1)
-> Nested Loop (cost=0.00..73926.79 rows=20 width=16) (actual time=15.880..25654.447 rows=471212 loops=1)
-> Seq Scan on part (cost=0.00..73877.55 rows=5 width=4) (actual time=0.008..6843.928
rows=117803loops=1) Filter: ((p_name)::text ~~ '%goldenrod%'::text)
-> Index Scan using i_ps_partkey on partsupp (cost=0.00..8.00 rows=148 width=12) (actual
time=0.133..0.149rows=4 loops=117803) Index Cond: ("outer".p_partkey =
partsupp.ps_partkey) -> Index Scan using i_l_suppkey_partkey on lineitem
(cost=0.00..6.02rows=1 width=24) (actual time=0.293..1.498 rows=8 loops=471212)
IndexCond: (("outer".ps_partkey = lineitem.l_partkey) AND ("outer".ps_suppkey = lineitem.l_suppkey))
-> Index Scan using pk_supplier on supplier (cost=0.00..3.01 rows=1 width=8) (actual time=0.019..0.021 rows=1
loops=3538739) Index Cond: (supplier.s_suppkey = "outer".l_suppkey) ->
Seq Scan on nation (cost=0.00..1.25 rows=25 width=33) (actual time=0.005..0.050 rows=25 loops=3538739) ->
Index Scan using pk_orders on orders (cost=0.00..3.01 rows=1 width=8) (actual time=0.346..0.347 rows=1 loops=3538739)
Index Cond: (orders.o_orderkey = "outer".l_orderkey)Total runtime: 2493372.155 ms
(21 rows)
Throughput Stream 3 Q9:
EXPLAIN ANALYZE select nation, o_year, sum(amount) as sum_profit from ( select n_name as nation, extract(year from
o_orderdate)as o_year, l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount from part, supplier,
lineitem,partsupp, orders, nation where s_suppkey = l_suppkey and ps_suppkey = l_suppkey and ps_partkey = l_partkey and
p_partkey= l_partkey and o_orderkey = l_orderkey and s_nationkey = n_nationkey and p_name like '%firebrick%' ) as
profitgroup by nation, o_year order by nation, o_year desc;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------GroupAggregate
(cost=74055.04..74055.07 rows=1 width=49) (actual time=1202351.569..1216140.153 rows=175 loops=1) -> Sort
(cost=74055.04..74055.05rows=1 width=49) (actual time=1202303.507..1206470.228 rows=3692230 loops=1) Sort Key:
nation.n_name,date_part('year'::text, (orders.o_orderdate)::timestamp without time zone) -> Nested Loop
(cost=0.00..74055.03rows=1 width=49) (actual time=8.594..1086270.124 rows=3692230 loops=1) -> Nested Loop
(cost=0.00..74052.00 rows=1 width=49) (actual time=4.970..772224.811 rows=3692230 loops=1) Join
Filter:("outer".s_nationkey = "inner".n_nationkey) -> Nested Loop (cost=0.00..74050.44 rows=1
width=24)(actual time=4.915..476254.721 rows=3692230 loops=1) -> Nested Loop
(cost=0.00..74047.41rows=1 width=28) (actual time=4.873..388805.585 rows=3692230 loops=1)
-> Nested Loop (cost=0.00..73926.79 rows=20 width=16) (actual time=3.056..13289.660 rows=490268 loops=1)
-> Seq Scan on part (cost=0.00..73877.55 rows=5 width=4) (actual time=0.026..3358.087
rows=122567loops=1) Filter: ((p_name)::text ~~ '%firebrick%'::text)
-> Index Scan using i_ps_partkey on partsupp (cost=0.00..8.00 rows=148 width=12) (actual
time=0.060..0.071rows=4 loops=122567) Index Cond: ("outer".p_partkey =
partsupp.ps_partkey) -> Index Scan using i_l_suppkey_partkey on lineitem
(cost=0.00..6.02rows=1 width=24) (actual time=0.130..0.749 rows=8 loops=490268)
IndexCond: (("outer".ps_partkey = lineitem.l_partkey) AND ("outer".ps_suppkey = lineitem.l_suppkey))
-> Index Scan using pk_supplier on supplier (cost=0.00..3.01 rows=1 width=8) (actual time=0.018..0.020 rows=1
loops=3692230) Index Cond: (supplier.s_suppkey = "outer".l_suppkey) ->
Seq Scan on nation (cost=0.00..1.25 rows=25 width=33) (actual time=0.005..0.050 rows=25 loops=3692230) ->
Index Scan using pk_orders on orders (cost=0.00..3.01 rows=1 width=8) (actual time=0.077..0.079 rows=1 loops=3692230)
Index Cond: (orders.o_orderkey = "outer".l_orderkey)Total runtime: 1216237.044 ms
(21 rows)
Throughput Stream 4 Q9:
EXPLAIN ANALYZE select nation, o_year, sum(amount) as sum_profit from ( select n_name as nation, extract(year from
o_orderdate)as o_year, l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount from part, supplier,
lineitem,partsupp, orders, nation where s_suppkey = l_suppkey and ps_suppkey = l_suppkey and ps_partkey = l_partkey and
p_partkey= l_partkey and o_orderkey = l_orderkey and s_nationkey = n_nationkey and p_name like '%cyan%' ) as profit
groupby nation, o_year order by nation, o_year desc;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------GroupAggregate
(cost=617707.12..617707.84 rows=26 width=49) (actual time=1076603.089..1091550.720 rows=175 loops=1) -> Sort
(cost=617707.12..617707.19rows=26 width=49) (actual time=1076549.073..1081134.008 rows=3877975 loops=1) Sort
Key:nation.n_name, date_part('year'::text, (orders.o_orderdate)::timestamp without time zone) -> Nested Loop
(cost=1.27..617706.51rows=26 width=49) (actual time=25.395..954764.745 rows=3877975 loops=1) -> Nested
Loop (cost=1.27..617627.70 rows=26 width=49) (actual time=17.100..615326.489 rows=3877975 loops=1)
JoinFilter: ("outer".s_nationkey = "inner".n_nationkey) -> Nested Loop (cost=0.00..617611.80
rows=26width=24) (actual time=17.076..448739.290 rows=3877975 loops=1) -> Nested Loop
(cost=0.00..617530.10rows=27 width=28) (actual time=17.029..355106.680 rows=3877975 loops=1)
-> Nested Loop (cost=0.00..231443.74 rows=64017 width=16) (actual time=10.339..19179.080 rows=515500 loops=1)
-> Seq Scan on part (cost=0.00..73877.55 rows=15999 width=4) (actual
time=0.033..3776.024rows=128875 loops=1) Filter: ((p_name)::text ~~
'%cyan%'::text) -> Index Scan using i_ps_partkey on partsupp (cost=0.00..8.00
rows=148width=12) (actual time=0.095..0.109 rows=4 loops=128875) Index Cond:
("outer".p_partkey= partsupp.ps_partkey) -> Index Scan using i_l_suppkey_partkey on
lineitem (cost=0.00..6.02 rows=1 width=24) (actual time=0.142..0.635 rows=8 loops=515500)
Index Cond: (("outer".ps_partkey = lineitem.l_partkey) AND ("outer".ps_suppkey = lineitem.l_suppkey))
-> Index Scan using pk_supplier on supplier (cost=0.00..3.01 rows=1 width=8) (actual time=0.019..0.020
rows=1loops=3877975) Index Cond: (supplier.s_suppkey = "outer".l_suppkey)
-> Materialize (cost=1.27..1.52 rows=25 width=33) (actual time=0.001..0.018 rows=25 loops=3877975)
-> Seq Scan on nation (cost=0.00..1.25 rows=25 width=33) (actual time=0.005..0.038 rows=25 loops=1)
-> Index Scan using pk_orders on orders (cost=0.00..3.01 rows=1 width=8) (actual time=0.080..0.081 rows=1
loops=3877975) Index Cond: (orders.o_orderkey = "outer".l_orderkey)Total runtime: 1091647.945 ms
(22 rows)
--
Mark