Re: Optimization idea - Mailing list pgsql-performance

From Vlad Arkhipov
Subject Re: Optimization idea
Date
Msg-id 4BD10805.9060004@dc.baikal.ru
Whole thread Raw
In response to Re: Optimization idea  (Greg Smith <greg@2ndquadrant.com>)
Responses Re: Optimization idea
List pgsql-performance
Greg Smith пишет:
> Vlad Arkhipov wrote:
>> Please do this small optimization if it is possible. It seem that the
>> optimizer have the all information to create a fast plan but it does
>> not do that.
>
> This isn't strictly an optimization problem; it's an issue with
> statistics the optimizer has to work with, the ones ANALYZE computes.
> You noticed this yourself:
>
I don't think this is just an issue with statistics, because the same
problem arises when I try executing a query like this:

explain analyze
select *
from t2
where t2.t in (select 2 union select 3 union select 4) /* It works well
if there is only one row in the subquery */

"Hash Semi Join  (cost=0.17..2474.10 rows=60060 width=16) (actual
time=0.032..103.034 rows=400 loops=1)"
"  Hash Cond: (t2.t = (2))"
"  ->  Seq Scan on t2  (cost=0.00..1543.00 rows=100100 width=16) (actual
time=0.007..47.856 rows=100100 loops=1)"
"  ->  Hash  (cost=0.13..0.13 rows=3 width=4) (actual time=0.019..0.019
rows=3 loops=1)"
"        ->  HashAggregate  (cost=0.07..0.10 rows=3 width=0) (actual
time=0.013..0.015 rows=3 loops=1)"
"              ->  Append  (cost=0.00..0.06 rows=3 width=0) (actual
time=0.001..0.007 rows=3 loops=1)"
"                    ->  Result  (cost=0.00..0.01 rows=1 width=0)
(actual time=0.001..0.001 rows=1 loops=1)"
"                    ->  Result  (cost=0.00..0.01 rows=1 width=0)
(actual time=0.000..0.000 rows=1 loops=1)"
"                    ->  Result  (cost=0.00..0.01 rows=1 width=0)
(actual time=0.000..0.000 rows=1 loops=1)"
"Total runtime: 103.244 ms"

vs

explain analyze
select *
from t2
where t2.t in (2, 3, 4)

"Bitmap Heap Scan on t2  (cost=15.53..527.91 rows=357 width=16) (actual
time=0.068..0.255 rows=400 loops=1)"
"  Recheck Cond: (t = ANY ('{2,3,4}'::bigint[]))"
"  ->  Bitmap Index Scan on t_idx  (cost=0.00..15.44 rows=357 width=0)
(actual time=0.056..0.056 rows=400 loops=1)"
"        Index Cond: (t = ANY ('{2,3,4}'::bigint[]))"
"Total runtime: 0.445 ms"

I also tried setting columns' statistics to 10000, nothing happened.
PostgreSQL version is 8.4.2. It sounds good that there is no such issue
on PostgreSQL 9.0, i'll try it on the weekend.

pgsql-performance by date:

Previous
From: Eliot Gable
Date:
Subject: Re: Replacing Cursors with Temporary Tables
Next
From: Vlad Arkhipov
Date:
Subject: Re: Optimization idea