Re: Repeated Query is much slower in PostgreSQL8.2.4 than DB2 9.1 - Mailing list pgsql-performance

From Mike Ivanov
Subject Re: Repeated Query is much slower in PostgreSQL8.2.4 than DB2 9.1
Date
Msg-id 4A5FD319.2080408@activestate.com
Whole thread Raw
In response to Re: Repeated Query is much slower in PostgreSQL8.2.4 than DB2 9.1  (ning <mailxiening@gmail.com>)
List pgsql-performance
Interesting. It's quite a hairy plan even though all the branches are
cut off by conditions ("never executed") so the query yields 0 rows.

0.018 is not a bad timing for that.

However, if you run this query with different parameters, the result
could be quite sad.

There are some deeply nested loops with joins filtered by inner seq
scans; this can be extremely expensive. Also, note that Left Merge Join
with 16243 rows being reduced into just 1.

With a database like DB2, the results you had are quite predictable:
slow first time execution (because of the ineffective query) and then
fast consequent responses because the tiny resultset produced by the
query can be stored in the memory.

Now, with Postgres the picture is different: all this complex stuff has
to be executed each time the query is sent.

I would rather rewrite the query without inner selects, using straight
joins instead. Also, try to filter things before joining, not after.
Correct me if I'm wrong, but in this particular case this seems pretty
much possible.

Cheers,
Mike


ning wrote:
> Hi Mike,
>
> Thank you for your explanation.
> The "explain analyze" command used is as follows, several integers are
> bound to '?'.
> -----
> SELECT oid_,void,nameId,tag,intval,lowerbound,upperbound,crossfeeddir,feeddir,units,opqval,bigval,strval
> FROM (SELECT attributeOf,void,nameId,tag,intval,lowerbound,upperbound,crossfeeddir,feeddir,units,opqval,bigval,strval
> FROM DenormAttributePerf WHERE attributeof IN (SELECT oid_ FROM
> JobView WHERE JobView.JobId=? and JobView.assignedTo_=?) AND nameId in
> (?)) x RIGHT OUTER JOIN (SELECT oid_ FROM JobView WHERE
> JobView.JobId=? and JobView.assignedTo_=?) y ON attributeof = oid_ FOR
> READ ONLY
> -----
>
> The result of the command is
> -----
>
>      QUERY PLAN
>
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Nested Loop Left Join  (cost=575.60..1273.15 rows=81 width=568)
> (actual time=0.018..0.018 rows=0 loops=1)
>    Join Filter: (x.attributeof = j1.oid_)
>    ->  Index Scan using job_tc1 on job j1  (cost=0.00..8.27 rows=1
> width=4) (actual time=0.016..0.016 rows=0 loops=1)
>          Index Cond: ((assignedto_ = 888) AND (jobid = 0))
>    ->  Merge Left Join  (cost=575.60..899.41 rows=16243 width=564)
> (never executed)
>          Merge Cond: (v.void = b.void)
>          ->  Merge Left Join  (cost=470.77..504.87 rows=2152
> width=556) (never executed)
>                Merge Cond: (v.void = res.void)
>                ->  Sort  (cost=373.61..374.39 rows=310 width=544)
> (never executed)
>                      Sort Key: v.void
>                      ->  Hash Left Join  (cost=112.07..360.78 rows=310
> width=544) (never executed)
>                            Hash Cond: (v.void = i.void)
>                            ->  Hash Left Join  (cost=65.40..303.17
> rows=38 width=540) (never executed)
>                                  Hash Cond: (v.void = r.void)
>                                  ->  Hash Left Join
> (cost=21.42..257.86 rows=5 width=532) (never executed)
>                                        Hash Cond: (v.void = s.void)
>                                        ->  Nested Loop Left Join
> (cost=8.27..244.65 rows=5 width=16) (never executed)
>                                              Join Filter: (v.containedin = a.id)
>                                              ->  Nested Loop
> (cost=8.27..16.57 rows=1 width=12) (never executed)
>                                                    ->  HashAggregate
> (cost=8.27..8.28 rows=1 width=4) (never executed)
>                                                          ->  Index
> Scan using job_tc1 on job j1  (cost=0.00..8.27 rows=1 width=4) (never
> executed)
>                                                                Index
> Cond: ((assignedto_ = 888) AND (jobid = 0))
>                                                    ->  Index Scan
> using attribute_tc1 on attribute a  (cost=0.00..8.27 rows=1 width=12)
> (never executed)
>                                                          Index Cond:
> ((a.attributeof = j1.oid_) AND (a.nameid = 6))
>                                              ->  Append
> (cost=0.00..137.60 rows=7239 width=12) (never executed)
>                                                    ->  Index Scan
> using attribute_value_i on attribute_value v  (cost=0.00..5.30 rows=9
> width=12) (never executed)
>                                                          Index Cond:
> (v.containedin = a.id)
>                                                    ->  Seq Scan on
> string_value v  (cost=0.00..11.40 rows=140 width=12) (never executed)
>                                                    ->  Seq Scan on
> integer_value v  (cost=0.00..26.30 rows=1630 width=12) (never
> executed)
>                                                    ->  Seq Scan on
> bigint_value v  (cost=0.00..25.10 rows=1510 width=12) (never executed)
>                                                    ->  Seq Scan on
> rangeofint_value v  (cost=0.00..25.10 rows=1510 width=12) (never
> executed)
>                                                    ->  Seq Scan on
> resolution_value v  (cost=0.00..24.00 rows=1400 width=12) (never
> executed)
>                                                    ->  Seq Scan on
> opaque_value v  (cost=0.00..20.40 rows=1040 width=12) (never executed)
>                                        ->  Hash  (cost=11.40..11.40
> rows=140 width=520) (never executed)
>                                              ->  Seq Scan on
> string_value s  (cost=0.00..11.40 rows=140 width=520) (never executed)
>                                  ->  Hash  (cost=25.10..25.10
> rows=1510 width=12) (never executed)
>                                        ->  Seq Scan on
> rangeofint_value r  (cost=0.00..25.10 rows=1510 width=12) (never
> executed)
>                            ->  Hash  (cost=26.30..26.30 rows=1630
> width=8) (never executed)
>                                  ->  Seq Scan on integer_value i
> (cost=0.00..26.30 rows=1630 width=8) (never executed)
>                ->  Sort  (cost=97.16..100.66 rows=1400 width=16)
> (never executed)
>                      Sort Key: res.void
>                      ->  Seq Scan on resolution_value res
> (cost=0.00..24.00 rows=1400 width=16) (never executed)
>          ->  Sort  (cost=104.83..108.61 rows=1510 width=12) (never executed)
>                Sort Key: b.void
>                ->  Seq Scan on bigint_value b  (cost=0.00..25.10
> rows=1510 width=12) (never executed)
>  Total runtime: 0.479 ms
> (46 rows)
> -----
>
> Best regards,
> Ning
>
> On Thu, Jul 16, 2009 at 7:37 AM, Mike Ivanov<mikei@activestate.com> wrote:
>
>> ning wrote:
>>
>>> The log is really long,
>>>
>> Which usually signals a problem with the query.
>>
>>
>>> but I compared the result of "explain analyze"
>>> for first and later executions, except for 3 "time=XXX" numbers, they
>>> are identical.
>>>
>>>
>> They are supposed to be identical unless something is really badly broken.
>>
>>
>>> I agree with you that PostgreSQL is doing different level of caching,
>>> I just wonder if there is any way to speed up PostgreSQL in this
>>> scenario,
>>>
>> This is what EXPLAIN ANALYZE for. Could you post the results please?
>>
>> Cheers,
>> Mike
>>
>>
>>


pgsql-performance by date:

Previous
From: Greg Stark
Date:
Subject: Re: cluster index on a table
Next
From: Scott Carey
Date:
Subject: Re: cluster index on a table