Thread: Four table join with million records - performance improvement?
Hi i have four sample tables ename, esal, edoj and esum All of them have 1000000 records. Im running the following query : select ename.eid, name, sal, doj, summary from ename,esal,edoj,esum where ename.eid=esal.eid and ename.eid=edoj.eid and ename.eid=esum.eid. Its a join of all four tables which returns all 1 million records. The eid field in ename is a Primary Key and the eid in all other tables are Foreign Keys. I have created an index for all Foreign Keys. This query takes around 16 MINUTES to complete. Can this time be reduced? Thanks Vijay ---------------------------------------------------------------- EXPLAIN OUTPUT QUERY PLAN Merge Join (cost=647497.97..163152572.97 rows=25000025000000 width=80) Merge Cond: ("outer".eid = "inner".eid) -> Merge Join (cost=356059.69..75361059.69 rows=5000000000 width=44) Merge Cond: ("outer".eid = "inner".eid) -> Sort (cost=150295.84..152795.84 rows=1000000 width=8) Sort Key: edoj.eid -> Seq Scan on edoj (cost=0.00..15568.00 rows=1000000 width=8) -> Sort (cost=205763.84..208263.84 rows=1000000 width=36) Sort Key: esum.eid -> Seq Scan on esum (cost=0.00..31976.00 rows=1000000 width=36) -> Sort (cost=291438.28..293938.29 rows=1000002 width=48) Sort Key: ename.eid -> Hash Join (cost=26683.01..107880.23 rows=1000002 width=48) Hash Cond: ("outer".eid = "inner".eid) -> Seq Scan on esal (cost=0.00..21613.01 rows=1000001 width=12) -> Hash (cost=16370.01..16370.01 rows=1000001 width=36) -> Seq Scan on ename (cost=0.00..16370.01 rows=1000001 width=36) 17 row(s) Total runtime: 181.021 ms ---------------------------------------------------------------- EXPLAIN ANALYZE OUTPUT QUERY PLAN Merge Join (cost=647497.97..163152572.97 rows=25000025000000 width=80) (actual time=505418.965..584981.013 rows=1000000 loops=1) Merge Cond: ("outer".eid = "inner".eid) -> Merge Join (cost=356059.69..75361059.69 rows=5000000000 width=44) (actual time=110394.376..138177.569 rows=1000000 loops=1) Merge Cond: ("outer".eid = "inner".eid) -> Sort (cost=150295.84..152795.84 rows=1000000 width=8) (actual time=27587.622..31077.077 rows=1000000 loops=1) Sort Key: edoj.eid -> Seq Scan on edoj (cost=0.00..15568.00 rows=1000000 width=8) (actual time=144.000..10445.145 rows=1000000 loops=1) -> Sort (cost=205763.84..208263.84 rows=1000000 width=36) (actual time=82806.646..90322.943 rows=1000000 loops=1) Sort Key: esum.eid -> Seq Scan on esum (cost=0.00..31976.00 rows=1000000 width=36) (actual time=20.312..29030.247 rows=1000000 loops=1) -> Sort (cost=291438.28..293938.29 rows=1000002 width=48) (actual time=395024.482..426870.491 rows=1000001 loops=1) Sort Key: ename.eid -> Hash Join (cost=26683.01..107880.23 rows=1000002 width=48) (actual time=29234.472..198064.105 rows=1000001 loops=1) Hash Cond: ("outer".eid = "inner".eid) -> Seq Scan on esal (cost=0.00..21613.01 rows=1000001 width=12) (actual time=32.257..23999.163 rows=1000001 loops=1) -> Hash (cost=16370.01..16370.01 rows=1000001 width=36) (actual time=19362.095..19362.095 rows=0 loops=1) -> Seq Scan on ename (cost=0.00..16370.01 rows=1000001 width=36) (actual time=26.744..13878.410 rows=1000001 loops=1) Total runtime: 586226.831 ms 18 row(s) Total runtime: 586,435.978 ms ----------------------------------------------------------------
Vijay Moses <vijay.moses@gmail.com> writes: > Hi i have four sample tables ename, esal, edoj and esum > All of them have 1000000 records. Im running the following > query : select ename.eid, name, sal, doj, summary from > ename,esal,edoj,esum where ename.eid=esal.eid and ename.eid=edoj.eid > and ename.eid=esum.eid. Its a join of all four tables which returns > all 1 million records. The eid field in ename is a Primary Key and the > eid in all other tables are Foreign Keys. I have created an index for > all Foreign Keys. This query takes around 16 MINUTES to complete. Can > this time be reduced? The indexes will be completely useless for that sort of query; the reasonable choices are sort/merge or hashjoin. For either one, your best way to speed it up is to increase sort_mem. regards, tom lane