Re: another query optimization question - Mailing list pgsql-performance

From David Teran
Subject Re: another query optimization question
Date
Msg-id F8B3002A-5350-11D8-9392-000A95A6F0DC@cluster9.com
Whole thread Raw
In response to Re: another query optimization question  (Stephan Szabo <sszabo@megazone.bigpanda.com>)
Responses Re: another query optimization question
List pgsql-performance
Hi,


On 30.01.2004, at 19:10, Stephan Szabo wrote:

>
> On Fri, 30 Jan 2004, David Teran wrote:
>
>> select
>> sum(job_property_difference(t0.int_value, t1.int_value)) as rank
>>    from
>>    job_property t0,
>>    job_property t1
>>    where
>>    t0.id_job_profile = 911
>>    and t0.id_job_attribute = t1.id_job_attribute
>>    and t1.id_job_profile in (select id_job_profile from unemployed)
>>    and t1.id_job_profile <> 911;
>>
>> results in a query plan result:
>
> Can we see explain analyze output for the query, it'll give more
> information about actual time and row counts than plain explain.
>


sure, here it is comes. What we need to achieve is: we have different
job_profiles, each profile has multiple values. For one given profile
we need the ' sum of the distance of every value in the given profile
and every other profile'. The result is usually grouped by the profile
id but to make the query easier i removed this, it does not cost too
much time and it turned out that this query here uses most of the time.

thanks, David




            QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
--------------------------------
  Aggregate  (cost=2689349.81..2689349.81 rows=1 width=8) (actual
time=100487.423..100487.423 rows=1 loops=1)
    ->  Merge Join  (cost=2451266.53..2655338.83 rows=13604393 width=8)
(actual time=82899.466..-2371037.726 rows=2091599 loops=1)
          Merge Cond: ("outer".id_job_attribute =
"inner".id_job_attribute)
          ->  Sort  (cost=97.43..100.63 rows=1281 width=8) (actual
time=3.937..4.031 rows=163 loops=1)
                Sort Key: t0.id_job_attribute
                ->  Index Scan using
job_property__id_job_profile__fk_index on job_property t0
(cost=0.00..31.31 rows=1281 width=8) (actual time=1.343..3.766 rows=163
loops=1)
                      Index Cond: (id_job_profile = 911)
          ->  Sort  (cost=2451169.10..2483246.47 rows=12830947 width=8)
(actual time=82891.076..-529619.213 rows=4187378 loops=1)
                Sort Key: t1.id_job_attribute
                ->  Hash IN Join  (cost=507.32..439065.37 rows=12830947
width=8) (actual time=61.943..1874640.807 rows=4187378 loops=1)
                      Hash Cond: ("outer".id_job_profile =
"inner".id_job_profile)
                      ->  Seq Scan on job_property t1
(cost=0.00..246093.84 rows=12830947 width=12) (actual
time=0.136..19101.796 rows=8482533 loops=1)
                            Filter: (id_job_profile <> 911)
                      ->  Hash  (cost=467.46..467.46 rows=15946 width=4)
(actual time=61.313..61.313 rows=0 loops=1)
                            ->  Seq Scan on unemployed
(cost=0.00..467.46 rows=15946 width=4) (actual time=0.157..50.842
rows=15960 loops=1)
  Total runtime: 103769.592 ms



pgsql-performance by date:

Previous
From: Stephan Szabo
Date:
Subject: Re: another query optimization question
Next
From: Tom Lane
Date:
Subject: Re: another query optimization question