hi,
avoid subselect: create a temp table and use join...
CREATE TEMP TABLE tmp AS SELECT lineitem.orderkey FROM lineitem WHERE lineitem.orderkey=orders.orderkey GROUP
BY lineitem.orderkey HAVING sum(lineitem.quantity)>300;
CREATE INDEX tmp_idx ON tmp (orderkey);
SELECT customer.name, customer.custkey, orders.orderkey, orders.orderdate, orders.totalprice,
sum(lineitem.quantity)FROMcustomer, orders, lineitem, tmpWHERE orders.orderkey=tmp.orderkey AND
customer.custkey=orders.custkeyAND orders.orderkey=lineitem.orderkeyGROUP BY customer.name, customer.custkey,
orders.orderkey,orders.orderdate, orders.totalpriceORDER BY orders.totalprice DESC, orders.orderdate;
may be the index is not necessary...
kuba
> I'm going to keep on asking about another query:
>
> SELECT
> customer.name,
> customer.custkey,
> orders.orderkey,
> orders.orderdate,
> orders.totalprice,
> sum(lineitem.quantity)
> FROM
> customer,
> orders,
> lineitem
> WHERE
> exists(
> SELECT
> lineitem.orderkey
> FROM
> lineitem
> WHERE
> lineitem.orderkey=orders.orderkey
> GROUP BY
> lineitem.orderkey HAVING
> sum(lineitem.quantity)>300
> )
> AND customer.custkey=orders.custkey
> AND orders.orderkey=lineitem.orderkey
> GROUP BY
> customer.name,
> customer.custkey,
> orders.orderkey,
> orders.orderdate,
> orders.totalprice
>
> ORDER BY
> orders.totalprice DESC,
> orders.orderdate;
>
> NOTICE: QUERY PLAN:
>
> Sort (cost=26923941.97..26923941.97 rows=300061 width=66)
> -> Aggregate (cost=26851634.86..26896644.05 rows=300061 width=66)
> -> Group (cost=26851634.86..26889142.52 rows=3000612 width=66)
> -> Sort (cost=26851634.86..26851634.86 rows=3000612
> width=66)
> -> Hash Join (cost=26107574.81..26457309.10
> rows=3000612 width=66)
> -> Seq Scan on lineitem (cost=0.00..222208.25
> rows=6001225 width=8)
> -> Hash (cost=26105699.81..26105699.81
> rows=750000 width=58)
> -> Hash Join (cost=7431.00..26105699.81
> rows=750000 width=58)
> -> Seq Scan on orders
> (cost=0.00..26083268.81 rows=750000 width=25)
> SubPlan
> -> Aggregate
> (cost=0.00..17.35 rows=1 width=8)
> -> Group
> (cost=0.00..17.34 rows=5 width=8)
> -> Index Scan
> using lineitem_pkey on lineitem (cost=0.00..17.33 rows=5 width=8)
> -> Hash (cost=7056.00..7056.00
> rows=150000 width=33)
> -> Seq Scan on customer
> (cost=0.00..7056.00 rows=150000 width=33)
>
> again:
> orders 1500000 tuples
> lineitem 6000000 tuples there are 1 to 7 lineitems per orderkey
> Customer 150000 tuples
>
> select attname,n_distinct,correlation from pg_stats where
> tablename='lineitem';
> attname | n_distinct | correlation
> ---------------+------------+-------------
> orderkey | -0.199847 | 1
> partkey | 196448 | 0.0223377
> suppkey | 9658 | -0.00822751
> linenumber | 7 | 0.17274
> quantity | 50 | 0.0150153
> extendedprice | 25651 | -0.00790245
> discount | 11 | 0.103761
> tax | 9 | 0.0993771
> returnflag | 3 | 0.391434
> linestatus | 2 | 0.509791
> shipdate | 2440 | 0.0072777
> commitdate | 2497 | 0.00698162
> receiptdate | 2416 | 0.00726686
> shipinstruct | 4 | 0.241511
> shipmode | 7 | 0.138432
> comment | 275488 | 0.0188006
> (16 rows)
>
> select attname,n_distinct,correlation from pg_stats where
> tablename='orders';
> attname | n_distinct | correlation
> ---------------+------------+-------------
> orderkey | -1 | -0.999925
> custkey | 76309 | 0.00590596
> orderstatus | 3 | 0.451991
> totalprice | -1 | -0.00768806
> orderdate | 2431 | -0.0211354
> orderpriority | 5 | 0.182489
> clerk | 1009 | 0.00546939
> shippriority | 1 | 1
> comment | -0.750125 | -0.0123887
>
> Customer
> attname | n_distinct | correlation
> ------------+------------+-------------
> custkey | -1 | 1
> name | -1 | 1
> address | -1 | -0.00510274
> nationkey | 25 | 0.0170533
> phone | -1 | -0.0227816
> acctbal | -0.83444 | -0.00220958
> mktsegment | 5 | 0.205013
> comment | -1 | 0.0327827
>
> This query takes 12 minutes to run and returns about 50 customers.
> lineitem.quantity takes values from 1 to 50, so 300 per orderkey is very
> restrictive
>
> May someone help on improving performance?
> Again thanks in advance
> Regards
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html
>