Re: Improving Inner Join Performance - Mailing list pgsql-performance

From Andy
Subject Re: Improving Inner Join Performance
Date
Msg-id 00c001c612aa$f39ed1d0$0b00a8c0@forge
Whole thread Raw
In response to Improving Inner Join Performance  ("Andy" <frum@ar-sd.net>)
Responses Re: Improving Inner Join Performance
List pgsql-performance
At the moment: o.id_status>3 can have values only 4 and 6. The 6 is around
90% from the whole table. This is why seq scan is made.

Now, depending on the user input the query can have more where fields. For
example:
SELECT count(*) FROM orders o
       INNER JOIN report r ON r.id_order=o.id
       WHERE o.id_status > 3 AND r.id_zufriden=7

Aggregate  (cost=7317.15..7317.15 rows=1 width=0) (actual
time=213.418..213.419 rows=1 loops=1)
  ->  Hash Join  (cost=3139.00..7310.80 rows=2540 width=0) (actual
time=57.554..212.215 rows=1308 loops=1)
        Hash Cond: ("outer".id = "inner".id_order)
        ->  Seq Scan on orders o  (cost=0.00..3785.31 rows=72216 width=4)
(actual time=0.014..103.292 rows=72121 loops=1)
              Filter: (id_status > 3)
        ->  Hash  (cost=3132.51..3132.51 rows=2597 width=4) (actual
time=57.392..57.392 rows=0 loops=1)
              ->  Seq Scan on report r  (cost=0.00..3132.51 rows=2597
width=4) (actual time=0.019..56.220 rows=1308 loops=1)
                    Filter: (id_zufriden = 7)
Total runtime: 213.514 ms

These examples can go on and on.

If I run this query
SELECT count(*) FROM orders o
INNER JOIN report r ON r.id_order=o.id
WHERE o.id_status>3
under normal system load the average response time is between 1.3 > 2.5
seconds. Sometimes even more. If I run it rapidly a few times then it
respondes faster(that is normal I supose).

The ideea of this query is to count all the possible results that the user
can have. I use this to build pages of results.

Andy.

----- Original Message -----
From: "Pandurangan R S" <pandurangan.r.s@gmail.com>
To: "Andy" <frum@ar-sd.net>
Cc: <pgsql-performance@postgresql.org>
Sent: Friday, January 06, 2006 11:56 AM
Subject: Re: [PERFORM] Improving Inner Join Performance


> If the users puts in some other search fields on the where then the query
> runs faster but > in this format sometimes it takes a lot lot of
> time(sometimes even 2,3 seconds).

Can you eloborate under what conditions which query is slower?

On 1/5/06, Andy <frum@ar-sd.net> wrote:
>
> Hi to all,
>
> I have the following query:
>
> SELECT count(*) FROM orders o
>       INNER JOIN report r ON r.id_order=o.id
>       WHERE o.id_status>3
>
> Explaing analyze:
> Aggregate  (cost=8941.82..8941.82 rows=1 width=0) (actual
> time=1003.297..1003.298 rows=1 loops=1)
>   ->  Hash Join  (cost=3946.28..8881.72 rows=24041 width=0) (actual
> time=211.985..951.545 rows=72121 loops=1)
>         Hash Cond: ("outer".id_order = "inner".id)
>         ->  Seq Scan on report r  (cost=0.00..2952.21 rows=72121 width=4)
> (actual time=0.005..73.869 rows=72121 loops=1)
>         ->  Hash  (cost=3787.57..3787.57 rows=24682 width=4) (actual
> time=211.855..211.855 rows=0 loops=1)
>               ->  Seq Scan on orders o  (cost=0.00..3787.57 rows=24682
> width=4) (actual time=0.047..147.170 rows=72121 loops=1)
>                     Filter: (id_status > 3)
> Total runtime: 1003.671 ms
>
>
> I could use it in the following format, because I have to the moment only
> the 4,6 values for the id_status.
>
> SELECT count(*) FROM orders o
>       INNER JOIN report r ON r.id_order=o.id
>       WHERE o.id_status IN (4,6)
>
> Explain analyze:
> Aggregate  (cost=5430.04..5430.04 rows=1 width=0) (actual
> time=1472.877..1472.877 rows=1 loops=1)
>   ->  Hash Join  (cost=2108.22..5428.23 rows=720 width=0) (actual
> time=342.080..1419.775 rows=72121 loops=1)
>         Hash Cond: ("outer".id_order = "inner".id)
>         ->  Seq Scan on report r  (cost=0.00..2952.21 rows=72121 width=4)
> (actual time=0.036..106.217 rows=72121 loops=1)
>         ->  Hash  (cost=2106.37..2106.37 rows=739 width=4) (actual
> time=342.011..342.011 rows=0 loops=1)
>               ->  Index Scan using orders_id_status_idx,
> orders_id_status_idx on orders o  (cost=0.00..2106.37 rows=739 width=4)
> (actual time=0.131..268.397 rows=72121 loops=1)
>                     Index Cond: ((id_status = 4) OR (id_status = 6))
> Total runtime: 1474.356 ms
>
> How can I improve this query's performace?? The ideea is to count all the
> values that I have in the database for the following conditions. If the
> users puts in some other search fields on the where then the query runs
> faster but in this format sometimes it takes a lot lot of time(sometimes
> even 2,3 seconds).
>
> Can this be tuned somehow???
>
> Regards,
> Andy.
>
>



pgsql-performance by date:

Previous
From: Pandurangan R S
Date:
Subject: Re: Improving Inner Join Performance
Next
From: Ian Westmacott
Date:
Subject: Re: improving write performance for logging