Thread: Re: Indexes not used
"D. Duccini" <duccini@backpack.com> writes:
> # select count(*) from radusage;
>  count
> --------
>  573042
> (1 row)

In that case 5757 is definitely a default estimate (0.01 is the default
selectivity IIRC).

> what is the analyze? i've run vacuum several times

VACUUM ANALYZE

			regards, tom lane
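The arithmetic behind that remark can be sketched in a few lines of Python. The 0.01 constant matches the 7.0-era default equality selectivity mentioned above; the helper function and its name are illustrative, not PostgreSQL internals:

```python
# Rough sketch of how the planner falls back to a default row estimate
# when it has no statistics for a column.  The constant models the
# 7.0-era default selectivity for "col = const"; it is an assumption
# for illustration, not the planner's actual code.

DEFAULT_EQ_SELECTIVITY = 0.01

def estimated_rows(reltuples, selectivity=DEFAULT_EQ_SELECTIVITY):
    """Planner's guess at matching rows: total rows times selectivity."""
    return round(reltuples * selectivity)

# With 573042 rows and no statistics, the guess lands near 5730 --
# the same ballpark as the estimate the planner reported, no matter
# how many rows actually match 'someuser'.
print(estimated_rows(573042))  # -> 5730
```

The point is that without ANALYZE statistics the estimate is a fixed fraction of the table size, which is why it barely moves between queries.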
> > what is the analyze? i've run vacuum several times
>
> VACUUM ANALYZE

i've done this as well, but it gives me no output other than "VACUUM" --
or does it work silently in the background?

i've even dropped the index and recreated it, but that had no impact on
its use.

-----------------------------------------------------------------------------
david@backpack.com            BackPack Software, Inc.        www.backpack.com
+1 651.645.7550 voice        "Life is an Adventure.
+1 651.645.9798 fax             Don't forget your BackPack!"
-----------------------------------------------------------------------------
"D. Duccini" <duccini@backpack.com> writes:
>> VACUUM ANALYZE

> i've done this as well, but it gives me no output other than "VACUUM"

That's what it's supposed to do.

> or does it work silently in the background?

If you want noise, try VACUUM VERBOSE ANALYZE.

Anyway: don't the EXPLAIN numbers change once you've done VACUUM ANALYZE?
How many rows are there matching 'someuser', anyhow?

It might be useful to see the planner's statistics, too -- try

	select attname, attdisbursion, s.*
	from pg_statistic s, pg_attribute a, pg_class c
	where starelid = c.oid and attrelid = c.oid
	  and staattnum = attnum and relname = 'FOO';

(substitute the name of the interesting table for FOO)

			regards, tom lane
Just as an example, here's the query plan of the *SAME* query before and
after a VACUUM ANALYZE.

Notice the way the two plans are *COMPLETELY* different.  l_portal_statuses
and b_portal_statuses only have *3 rows* right now, but there was no way
for the planner to know that.  Anyway, this should be evidence that a
periodic VACUUM ANALYZE is a Good Thing(tm).

(BTW, the site that this database drives is now significantly more
responsive.)

Before:
----------------------------------------
Merge Join  (cost=97.62..170.37 rows=1000 width=110)
  ->  Index Scan using l_portal_statuses_pkey on l_portal_statuses lps  (cost=0.00..59.00 rows=1000 width=16)
  ->  Sort  (cost=97.62..97.62 rows=100 width=94)
        ->  Merge Join  (cost=22.67..94.30 rows=100 width=94)
              ->  Index Scan using b_portal_statuses_pkey on b_portal_statuses bps  (cost=0.00..59.00 rows=1000 width=16)
              ->  Sort  (cost=22.67..22.67 rows=10 width=78)
                    ->  Seq Scan on contracts c  (cost=0.00..22.50 rows=10 width=78)

After:
----------------------------------------
Nested Loop  (cost=0.00..3.47 rows=1 width=110)
  ->  Nested Loop  (cost=0.00..2.40 rows=1 width=94)
        ->  Seq Scan on contracts c  (cost=0.00..1.34 rows=1 width=78)
        ->  Seq Scan on b_portal_statuses bps  (cost=0.00..1.03 rows=3 width=16)
  ->  Seq Scan on l_portal_statuses lps  (cost=0.00..1.03 rows=3 width=16)

-- Dave
Maybe I'm not getting something here... but how is a sequential scan EVER
faster than a B-tree index lookup on a table with over 500,000 records?

Certainly I could split out the data and do some "roll-up" ops on the
information in there; it just seems odd that 6.5.x used the indices and
was blazing fast.  Now in 7.0.3 it's like they are not even considered --
at least on this particular table.  On other tables they seem to be
working.

On Thu, 15 Mar 2001, David Olbersen wrote:

> Just as an example, here's the query plan of the *SAME* query before and
> after a VACUUM ANALYZE.
>
> Notice the way the two plans are *COMPLETELY* different.  l_portal_statuses
> and b_portal_statuses only have *3 rows* right now, but there was no way
> for the planner to know that.  Anyway, this should be evidence that a
> periodic VACUUM ANALYZE is a Good Thing(tm).
"D. Duccini" <duccini@backpack.com> writes:
> Maybe I'm not getting something here... but how is a sequential scan EVER
> faster than a B-tree index lookup on a table with over 500,000 records?

If the system needs to fetch more than a small percentage of the records,
then a seqscan *will* be faster.  An indexscan touches the heap in random
order, paying a random page read for each matching row, while a seqscan
reads the whole table once, sequentially.  The issue you are dealing with
seems to be misestimation of the retrieval percentage for this particular
query, causing the planner to guess wrong about which kind of plan to use.

			regards, tom lane
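The "small percentage" point can be made concrete with a toy cost model. The constants and formulas below are simplified assumptions for illustration, not PostgreSQL's actual cost equations:

```python
# Toy cost model: a seqscan reads every page once sequentially; an
# indexscan (worst case, no clustering) pays a random page read per
# matching row.  All constants are illustrative assumptions.

SEQ_PAGE_COST = 1.0      # cost to read one page sequentially
RANDOM_PAGE_COST = 4.0   # cost to read one page at random

def seqscan_cost(pages):
    return pages * SEQ_PAGE_COST

def indexscan_cost(matching_rows):
    # one random heap fetch per matching row; index page reads ignored
    return matching_rows * RANDOM_PAGE_COST

# 500,000 rows at 100 rows per page = 5,000 pages.
pages, rows = 5_000, 500_000
for pct in (0.1, 1, 5, 10):
    matches = int(rows * pct / 100)
    winner = ("indexscan" if indexscan_cost(matches) < seqscan_cost(pages)
              else "seqscan")
    print(f"{pct:>4}% of rows -> {winner}")
```

Under these assumptions the indexscan only wins below roughly a quarter of a percent of the table; even at 1% the seqscan is already several times cheaper, which is why a misestimated retrieval percentage flips the plan.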
> If the system needs to fetch more than a small percentage of the
> records, then seqscan *will* be faster.  The issue you are dealing
> with seems to be misestimation of the retrieval percentage for this
> particular query, causing the planner to guess wrong about which
> kind of plan to use.

no worries... i'll try building a subset of the data and see if there is
some "threshold" value.

or... maybe it's time i actually contributed some code to the project :)

i built an OO database engine a few years ago (in Objective-C) that used
a modified N-tree approach to indices that massively accelerated the
retrieval of a lot of "highly similar" data items.

-duck
perhaps the db gawds can explain this....

# \d radusage
          Table "radusage"
 Attribute |    Type     | Modifier
-----------+-------------+----------
 datetime  | timestamp   | not null
 account   | varchar(64) |
 usage     | integer     |
 sent      | integer     |
 recv      | integer     |
 ip        | bigint      |
 host      | bigint      |
 port      | smallint    |
Indices: idxradaccount,
         idxraddate,
         idxradoid

# \d idxradaccount
Index "idxradaccount"
 Attribute |    Type
-----------+-------------
 account   | varchar(64)
btree

# \d idxraddate
Index "idxraddate"
 Attribute |   Type
-----------+-----------
 datetime  | timestamp
btree

# explain select * from radusage where account = 'someuser';
NOTICE:  QUERY PLAN:

Seq Scan on radusage  (cost=0.00..13870.80 rows=5674 width=50)

and if i add in datetime (without effectively changing the semantic
meaning of the search):

# explain select * from radusage where account = 'someuser' and datetime > '1900-01-01';
NOTICE:  QUERY PLAN:

Index Scan using idxradaccount on radusage  (cost=0.00..15295.37 rows=5668 width=50)

the first case doesn't use the index, the second does use what would seem
to be the correct index.

isn't that wacky????

-duck
"D. Duccini" <duccini@backpack.com> writes:
> # explain select * from radusage where account = 'someuser';
> NOTICE:  QUERY PLAN:
> Seq Scan on radusage  (cost=0.00..13870.80 rows=5674 width=50)

> and if i add in datetime (without effectively changing the semantic
> meaning of the search)

> # explain select * from radusage where account = 'someuser' and datetime > '1900-01-01';
> NOTICE:  QUERY PLAN:
> Index Scan using idxradaccount on radusage  (cost=0.00..15295.37 rows=5668 width=50)

You could get more information by looking at the estimated cost of the
other alternative in each case (do SET ENABLE_SEQSCAN = OFF or
SET ENABLE_INDEXSCAN = OFF, respectively, to force the planner to choose
the other alternative).  I bet you'll find that the estimated costs are
pretty close together.

What's probably happening here is that the small extra cost estimated for
evaluating the "datetime > '1900-01-01'" condition at each row is pushing
the cost of the seqscan up to more than the cost of the indexscan.  That
extra cost gets charged for every row in the table in the seqscan case,
but only for those rows pulled from the index in the indexscan case, so
adding extra WHERE conditions favors the indexscan case.  Not by a lot,
but evidently by enough in this example.

			regards, tom lane
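That explanation can be sketched numerically using the costs from the EXPLAIN output above. The per-operator cost constant below is the long-standing cpu_operator_cost default, and treating the reported indexscan cost as already including the qual evaluation is an assumption for illustration:

```python
# Sketch of the arithmetic: an extra qual is charged once per row a node
# examines, so the seqscan pays it for all 573042 rows while the
# indexscan pays it only for the rows the index returns.  The 0.0025
# constant and the assumption that the reported indexscan cost already
# includes the qual are illustrative, not a reproduction of the planner.

CPU_OPERATOR_COST = 0.0025

table_rows = 573_042
seqscan_no_qual = 13_870.80        # EXPLAIN cost of the plain seqscan
indexscan_with_qual = 15_295.37    # EXPLAIN cost of the chosen indexscan

seqscan_with_qual = seqscan_no_qual + CPU_OPERATOR_COST * table_rows

print(f"seqscan with qual:   {seqscan_with_qual:.2f}")   # about 15303
print(f"indexscan with qual: {indexscan_with_qual:.2f}")
# The seqscan now looks roughly 8 cost units more expensive, so the
# planner flips to the indexscan -- not by a lot, but by enough.
```

Under these assumptions the two estimates end up within about 8 cost units of each other, consistent with the "pretty close together" prediction.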