Re-2: BUG #7495: chosen wrong index - Mailing list pgsql-bugs

From psql@elbrief.de
Subject Re-2: BUG #7495: chosen wrong index
Date
Msg-id 1345112023.CaCf3400.11949@debian2
Whole thread Raw
In response to Re: BUG #7495: chosen wrong index  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
List pgsql-bugs
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote:
> insert into bla ( a , b )
>   select a , floor(random() * 1000000)   1
>   from generate_series( 1 , 1000000 ) as a ( a ) ;
>
> On my machine, with that data, all of the queries run fast.

Yes, this runs by me fast too. But here is no relation
between a and b. By my data psql must scan mostly all data
to find the rows.

This is only an example. In my live environment i have a table with
a boolean where the boolean is usually true. The boolean is
false on new or changed entrys and if i select the false-rows
order by primary key i get slow querys. The table has a lot of
million rows and a very small amount of rows with false. The
boolean indicates that this row has an entry in a second table
so i can select the rows without a join which is expensive too.

BTW: it would be nice to have an index wich works on
select * from a left join b on b.id = a.id where b.id is null.

explain select * from bla where b > 990000 order by a limit 10 ;
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Limit  (cost=0.00..30.75 rows=10 width=8)
   ->  Index Scan using bla_a on bla  (cost=0.00..30747.29 rows=10000 width=8)
         Filter: (b > 990000)

drop index bla_a ;

explain select * from bla where b > 990000 order by a limit 10 ;
                                    QUERY PLAN
-----------------------------------------------------------------------------------
 Limit  (cost=633.50..633.52 rows=10 width=8)
   ->  Sort  (cost=633.50..658.50 rows=10000 width=8)
         Sort Key: a
         ->  Index Scan using bla_b on bla  (cost=0.00..417.40 rows=10000 width=8)
               Index Cond: (b > 990000)

The first explain reduce by the limit the cost by the faktor 10/10000.
This is good for data with no relation between a and b. But in my
example it is about 1000 times higher.

BTW: in the first explain this is not an Index Scan, it is an
sequential  scan on an index. It would be nice to show this in
the explain.

Perhaps it would be good to reduce the cost for the limit on
an sequential scan on an index not linear.


Best regards,
Andreas

pgsql-bugs by date:

Previous
From: Dave Page
Date:
Subject: Re: BUG #7485: 9.2 beta3 libxml2 can't be loaded on Windows
Next
From: psql@elbrief.de
Date:
Subject: Re-2: BUG #7495: chosen wrong index