Re: Improving Inner Join Performance - Mailing list pgsql-performance
From | Andy |
---|---|
Subject | Re: Improving Inner Join Performance |
Date | |
Msg-id | 00c001c612aa$f39ed1d0$0b00a8c0@forge Whole thread Raw |
In response to | Improving Inner Join Performance ("Andy" <frum@ar-sd.net>) |
Responses |
Re: Improving Inner Join Performance
|
List | pgsql-performance |
At the moment: o.id_status>3 can have values only 4 and 6. The 6 is around 90% from the whole table. This is why seq scan is made. Now, depending on the user input the query can have more where fields. For example: SELECT count(*) FROM orders o INNER JOIN report r ON r.id_order=o.id WHERE o.id_status > 3 AND r.id_zufriden=7 Aggregate (cost=7317.15..7317.15 rows=1 width=0) (actual time=213.418..213.419 rows=1 loops=1) -> Hash Join (cost=3139.00..7310.80 rows=2540 width=0) (actual time=57.554..212.215 rows=1308 loops=1) Hash Cond: ("outer".id = "inner".id_order) -> Seq Scan on orders o (cost=0.00..3785.31 rows=72216 width=4) (actual time=0.014..103.292 rows=72121 loops=1) Filter: (id_status > 3) -> Hash (cost=3132.51..3132.51 rows=2597 width=4) (actual time=57.392..57.392 rows=0 loops=1) -> Seq Scan on report r (cost=0.00..3132.51 rows=2597 width=4) (actual time=0.019..56.220 rows=1308 loops=1) Filter: (id_zufriden = 7) Total runtime: 213.514 ms These examples can go on and on. If I run this query SELECT count(*) FROM orders o INNER JOIN report r ON r.id_order=o.id WHERE o.id_status>3 under normal system load the average response time is between 1.3 > 2.5 seconds. Sometimes even more. If I run it rapidly a few times then it respondes faster(that is normal I supose). The ideea of this query is to count all the possible results that the user can have. I use this to build pages of results. Andy. ----- Original Message ----- From: "Pandurangan R S" <pandurangan.r.s@gmail.com> To: "Andy" <frum@ar-sd.net> Cc: <pgsql-performance@postgresql.org> Sent: Friday, January 06, 2006 11:56 AM Subject: Re: [PERFORM] Improving Inner Join Performance > If the users puts in some other search fields on the where then the query > runs faster but > in this format sometimes it takes a lot lot of > time(sometimes even 2,3 seconds). Can you eloborate under what conditions which query is slower? On 1/5/06, Andy <frum@ar-sd.net> wrote: > > Hi to all, > > I have the following query: > > SELECT count(*) FROM orders o > INNER JOIN report r ON r.id_order=o.id > WHERE o.id_status>3 > > Explaing analyze: > Aggregate (cost=8941.82..8941.82 rows=1 width=0) (actual > time=1003.297..1003.298 rows=1 loops=1) > -> Hash Join (cost=3946.28..8881.72 rows=24041 width=0) (actual > time=211.985..951.545 rows=72121 loops=1) > Hash Cond: ("outer".id_order = "inner".id) > -> Seq Scan on report r (cost=0.00..2952.21 rows=72121 width=4) > (actual time=0.005..73.869 rows=72121 loops=1) > -> Hash (cost=3787.57..3787.57 rows=24682 width=4) (actual > time=211.855..211.855 rows=0 loops=1) > -> Seq Scan on orders o (cost=0.00..3787.57 rows=24682 > width=4) (actual time=0.047..147.170 rows=72121 loops=1) > Filter: (id_status > 3) > Total runtime: 1003.671 ms > > > I could use it in the following format, because I have to the moment only > the 4,6 values for the id_status. > > SELECT count(*) FROM orders o > INNER JOIN report r ON r.id_order=o.id > WHERE o.id_status IN (4,6) > > Explain analyze: > Aggregate (cost=5430.04..5430.04 rows=1 width=0) (actual > time=1472.877..1472.877 rows=1 loops=1) > -> Hash Join (cost=2108.22..5428.23 rows=720 width=0) (actual > time=342.080..1419.775 rows=72121 loops=1) > Hash Cond: ("outer".id_order = "inner".id) > -> Seq Scan on report r (cost=0.00..2952.21 rows=72121 width=4) > (actual time=0.036..106.217 rows=72121 loops=1) > -> Hash (cost=2106.37..2106.37 rows=739 width=4) (actual > time=342.011..342.011 rows=0 loops=1) > -> Index Scan using orders_id_status_idx, > orders_id_status_idx on orders o (cost=0.00..2106.37 rows=739 width=4) > (actual time=0.131..268.397 rows=72121 loops=1) > Index Cond: ((id_status = 4) OR (id_status = 6)) > Total runtime: 1474.356 ms > > How can I improve this query's performace?? The ideea is to count all the > values that I have in the database for the following conditions. If the > users puts in some other search fields on the where then the query runs > faster but in this format sometimes it takes a lot lot of time(sometimes > even 2,3 seconds). > > Can this be tuned somehow??? > > Regards, > Andy. > >
pgsql-performance by date: