Re: Use of Functional Indexs and Planner estimates - Mailing list pgsql-performance

From Scott Marlowe
Subject Re: Use of Functional Indexs and Planner estimates
Date
Msg-id 1086706451.27200.50.camel@localhost.localdomain
Whole thread Raw
In response to Use of Functional Indexs and Planner estimates  (Russell Smith <mr-russ@pws.com.au>)
Responses Re: Use of Functional Indexs and Planner estimates  (Greg Stark <gsstark@mit.edu>)
List pgsql-performance
On Tue, 2004-06-08 at 01:24, Russell Smith wrote:
> Dear All,
>
> I have a table with approximately 570k Rows.
>
>
>                   Table "filter.rules"
>   Column  |          Type          |               Modifiers
> ----------+------------------------+----------------------------------------
>  rulename | character varying(16)  | not null default ''::character varying
>  uri      | character varying(200) | not null default ''::character varying
>  redirect | character varying(200) | not null default ''::character varying
>  moddate  | date                   | not null default ('now'::text)::date
>  active   | boolean                | not null default true
>  comment  | character varying(255) | not null default ''::character varying
> Indexes:
>     "rules_pkey" primary key, btree (rulename, uri)
>     "moddate_idx" btree (moddate)
>     "rules_idx" btree (lower((uri)::text))
>
> Statistic on the uri column have been set 1000
> Vacuum full and analyze was run before tests, no alteration to tables since then.
>
> # analyze verbose filter.rules;
> INFO:  analyzing "filter.rules"
> INFO:  "rules": 5228 pages, 300000 rows sampled, 570533 estimated total rows
> ANALYZE
>
> # explain analyze SELECT rulename, redirect from filter.rules WHERE lower(uri)
IN(lower('land.com'),lower('com'),lower('land.com/'),lower('com/'))GROUP BY rulename,redirect; 
>                                                                                            QUERY PLAN
>
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  HashAggregate  (cost=22352.79..22352.79 rows=1 width=12) (actual time=2047.331..2047.332 rows=1 loops=1)
>    ->  Seq Scan on rules  (cost=0.00..22296.32 rows=11294 width=12) (actual time=540.149..2047.308 rows=1 loops=1)
>          Filter: ((lower((uri)::text) = 'land.com'::text) OR (lower((uri)::text) = 'com'::text) OR
(lower((uri)::text)= 'land.com/'::text) OR (lower((uri)::text) = 'com/'::text)) 
>  Total runtime: 2047.420 ms
> (4 rows)
>
> # SET enable_seqscan=off;
>
> # explain analyze SELECT rulename, redirect from filter.rules WHERE lower(uri)
IN(lower('land.com'),lower('com'),lower('land.com/'),lower('com/'))GROUP BY rulename,redirect; 
>                                                                                            QUERY PLAN
>
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  HashAggregate  (cost=38970.68..38970.68 rows=1 width=12) (actual time=0.328..0.328 rows=1 loops=1)
>    ->  Index Scan using rules_idx, rules_idx, rules_idx, rules_idx on rules  (cost=0.00..38914.21 rows=11294
width=12)(actual time=0.210..0.312 rows=1 loops=1) 
>          Index Cond: ((lower((uri)::text) = 'land.com'::text) OR (lower((uri)::text) = 'com'::text) OR
(lower((uri)::text)= 'land.com/'::text) OR (lower((uri)::text) = 'com/'::text)) 
>  Total runtime: 0.700 ms
> (4 rows)
>
> Could anybody offer explanations of why the planner does such a terrible job of estimated the number of rows for this
query,with the stats set so high. 
> Tests were also done with stats set to 100, and 1.  The results are exactly the same.  Which I would have assumed.

Simple, the planner is choosing a sequential scan when it should be
choosing an index scan.  This is usually because random_page_cost is set
too high, at the default of 4.  Try settings between 1.2 and 2.x or so
to see how that helps.  Be sure and test with various queries of your
own to be sure you've got about the right setting.


pgsql-performance by date:

Previous
From: Josué Maldonado
Date:
Subject: Re: Join slow on "large" tables
Next
From: Litao Wu
Date:
Subject: reindex and copy - deadlock?