Re: [HACKERS] please help on query - Mailing list pgsql-sql
From | Luis Alberto Amigo Navarro |
---|---|
Subject | Re: [HACKERS] please help on query |
Date | |
Msg-id | 019201c229b9$61acfa50$cab990c1@atc.unican.es Whole thread Raw |
In response to | Re: [HACKERS] please help on query (Jakub Ouhrabka <jakub.ouhrabka@comgate.cz>) |
Responses |
Re: [HACKERS] please help on query
(Masaru Sugawara <rk73@sea.plala.or.jp>)
|
List | pgsql-sql |
Lineitem is being modified on run time, so creating a temp table don't solves my problem The time of creating this table is the same of performing the subselect (or so I think), it could be done creating a new table, and a new trigger, but there are already triggers to calculate lineitem.extendedprice=part.retailprice*lineitem.quantity*(1+taxes)*(1-disco unt) and to calculate orderstatus in order with linestatus and to calculate orders.totalprice as sum(extendedprice) where lineitem.orderkey=new.orderkey. A new trigger in order to insert orderkey if sum(quantity) where orderkey=new.orderkey might be excessive. Any other idea? Thanks And Regards ----- Original Message ----- From: "Jakub Ouhrabka" <jakub.ouhrabka@comgate.cz> To: "Luis Alberto Amigo Navarro" <lamigo@atc.unican.es> Cc: "Manfred Koizar" <mkoi-pg@aon.at>; <pgsql-sql@postgresql.org> Sent: Friday, July 12, 2002 1:50 PM Subject: Re: [SQL] [HACKERS] please help on query > hi, > > avoid subselect: create a temp table and use join... > > CREATE TEMP TABLE tmp AS > SELECT > lineitem.orderkey > FROM > lineitem > WHERE > lineitem.orderkey=orders.orderkey > GROUP BY > lineitem.orderkey HAVING > sum(lineitem.quantity)>300; > > CREATE INDEX tmp_idx ON tmp (orderkey); > > SELECT > customer.name, > customer.custkey, > orders.orderkey, > orders.orderdate, > orders.totalprice, > sum(lineitem.quantity) > FROM > customer, > orders, > lineitem, > tmp > WHERE > orders.orderkey=tmp.orderkey > AND customer.custkey=orders.custkey > AND orders.orderkey=lineitem.orderkey > GROUP BY > customer.name, > customer.custkey, > orders.orderkey, > orders.orderdate, > orders.totalprice > ORDER BY > orders.totalprice DESC, > orders.orderdate; > > > may be the index is not necessary... > > kuba > > > > I'm going to keep on asking about another query: > > > > SELECT > > customer.name, > > customer.custkey, > > orders.orderkey, > > orders.orderdate, > > orders.totalprice, > > sum(lineitem.quantity) > > FROM > > customer, > > orders, > > lineitem > > WHERE > > exists( > > SELECT > > lineitem.orderkey > > FROM > > lineitem > > WHERE > > lineitem.orderkey=orders.orderkey > > GROUP BY > > lineitem.orderkey HAVING > > sum(lineitem.quantity)>300 > > ) > > AND customer.custkey=orders.custkey > > AND orders.orderkey=lineitem.orderkey > > GROUP BY > > customer.name, > > customer.custkey, > > orders.orderkey, > > orders.orderdate, > > orders.totalprice > > > > ORDER BY > > orders.totalprice DESC, > > orders.orderdate; > > > > NOTICE: QUERY PLAN: > > > > Sort (cost=26923941.97..26923941.97 rows=300061 width=66) > > -> Aggregate (cost=26851634.86..26896644.05 rows=300061 width=66) > > -> Group (cost=26851634.86..26889142.52 rows=3000612 width=66) > > -> Sort (cost=26851634.86..26851634.86 rows=3000612 > > width=66) > > -> Hash Join (cost=26107574.81..26457309.10 > > rows=3000612 width=66) > > -> Seq Scan on lineitem (cost=0.00..222208.25 > > rows=6001225 width=8) > > -> Hash (cost=26105699.81..26105699.81 > > rows=750000 width=58) > > -> Hash Join (cost=7431.00..26105699.81 > > rows=750000 width=58) > > -> Seq Scan on orders > > (cost=0.00..26083268.81 rows=750000 width=25) > > SubPlan > > -> Aggregate > > (cost=0.00..17.35 rows=1 width=8) > > -> Group > > (cost=0.00..17.34 rows=5 width=8) > > -> Index Scan > > using lineitem_pkey on lineitem (cost=0.00..17.33 rows=5 width=8) > > -> Hash (cost=7056.00..7056.00 > > rows=150000 width=33) > > -> Seq Scan on customer > > (cost=0.00..7056.00 rows=150000 width=33) > > > > again: > > orders 1500000 tuples > > lineitem 6000000 tuples there are 1 to 7 lineitems per orderkey > > Customer 150000 tuples > > > > select attname,n_distinct,correlation from pg_stats where > > tablename='lineitem'; > > attname | n_distinct | correlation > > ---------------+------------+------------- > > orderkey | -0.199847 | 1 > > partkey | 196448 | 0.0223377 > > suppkey | 9658 | -0.00822751 > > linenumber | 7 | 0.17274 > > quantity | 50 | 0.0150153 > > extendedprice | 25651 | -0.00790245 > > discount | 11 | 0.103761 > > tax | 9 | 0.0993771 > > returnflag | 3 | 0.391434 > > linestatus | 2 | 0.509791 > > shipdate | 2440 | 0.0072777 > > commitdate | 2497 | 0.00698162 > > receiptdate | 2416 | 0.00726686 > > shipinstruct | 4 | 0.241511 > > shipmode | 7 | 0.138432 > > comment | 275488 | 0.0188006 > > (16 rows) > > > > select attname,n_distinct,correlation from pg_stats where > > tablename='orders'; > > attname | n_distinct | correlation > > ---------------+------------+------------- > > orderkey | -1 | -0.999925 > > custkey | 76309 | 0.00590596 > > orderstatus | 3 | 0.451991 > > totalprice | -1 | -0.00768806 > > orderdate | 2431 | -0.0211354 > > orderpriority | 5 | 0.182489 > > clerk | 1009 | 0.00546939 > > shippriority | 1 | 1 > > comment | -0.750125 | -0.0123887 > > > > Customer > > attname | n_distinct | correlation > > ------------+------------+------------- > > custkey | -1 | 1 > > name | -1 | 1 > > address | -1 | -0.00510274 > > nationkey | 25 | 0.0170533 > > phone | -1 | -0.0227816 > > acctbal | -0.83444 | -0.00220958 > > mktsegment | 5 | 0.205013 > > comment | -1 | 0.0327827 > > > > This query takes 12 minutes to run and returns about 50 customers. > > lineitem.quantity takes values from 1 to 50, so 300 per orderkey is very > > restrictive > > > > May someone help on improving performance? > > Again thanks in advance > > Regards > > > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 5: Have you checked our extensive FAQ? > > > > http://www.postgresql.org/users-lounge/docs/faq.html > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) > >