Thread: again on index usage (7.1.3)

again on index usage (7.1.3)

From
Daniel Kalchev
Date:
I had one more frustrating exprience with the 7.1.3 optimizer handling 
index/scan selection.

Here is the schema

RADIUS=# \d attrib                 Table "attrib"Attribute |      Type      |      Modifier       
-----------+----------------+---------------------user_name | character(32)  | not null default ''attr      |
character(32) | not null default ''value     | character(128) | op        | character(2)   | 
 
Index: uattr

RADIUS=# \d uattr      Index "uattr"Attribute |     Type      
-----------+---------------user_name | character(32)attr      | character(32)op        | character(2)
btree


(this is for use by gnu-radius).

RADIUS=# select count(*) from attrib;count  
--------396117
(1 row)

RADIUS=# select count(distinct user_name) from attrib;count 
-------62713
(1 row)


each username has more or less the same number of attributes.

SELECT * FROM attrib WHERE user_name = 'xyz';

always results in sequential scan.

As you can see, there is sufficient number of different user_name values - why 
the sequential scan?

Needless to say that turning off sequential scans results is measurably faster 
index scan.

Daniel



Re: again on index usage (7.1.3)

From
Stephan Szabo
Date:
On Tue, 12 Feb 2002, Daniel Kalchev wrote:

> I had one more frustrating exprience with the 7.1.3 optimizer handling
> index/scan selection.
>
> Here is the schema
>
> RADIUS=# \d attrib
>                   Table "attrib"
>  Attribute |      Type      |      Modifier
> -----------+----------------+---------------------
>  user_name | character(32)  | not null default ''
>  attr      | character(32)  | not null default ''
>  value     | character(128) |
>  op        | character(2)   |
> Index: uattr
>
> RADIUS=# \d uattr
>        Index "uattr"
>  Attribute |     Type
> -----------+---------------
>  user_name | character(32)
>  attr      | character(32)
>  op        | character(2)
> btree
>
>
> (this is for use by gnu-radius).
>
> RADIUS=# select count(*) from attrib;
>  count
> --------
>  396117
> (1 row)
>
> RADIUS=# select count(distinct user_name) from attrib;
>  count
> -------
>  62713
> (1 row)
>
>
> each username has more or less the same number of attributes.
>
> SELECT * FROM attrib WHERE user_name = 'xyz';
>
> always results in sequential scan.
>
> As you can see, there is sufficient number of different user_name values - why
> the sequential scan?
>
> Needless to say that turning off sequential scans results is measurably faster
> index scan.

Let's start with the standard set of things.  Have you vacuum analyzed,
what does explain show for the query, is there one value that is more
common than all others?




Re: again on index usage (7.1.3)

From
Daniel Kalchev
Date:
>>>Stephan Szabo said:> Do you have a single value that is much more common than the rest (say> approximately 170000 of
therows?)  It's estimating almost 18000 matching> rows, but I'm guessing that that's not a reasonable estimate.
 

Not likely. There are no more than 30-40 attributes per user (this is another 
story, but I discuss there the PostgreSQL aspect, not RADIUS :). Entries with 
most rows have up to 35 rows.

However, there is indeed an user_name entry with 179225 values! Somehow on 
these rows user_name is ''.

Time to change my standard answer...

Daniel



Re: again on index usage (7.1.3)

From
Daniel Kalchev
Date:
>>>Stephan Szabo said:> > Let's start with the standard set of things.  Have you vacuum analyzed,> what does explain
showfor the query, is there one value that is more> common than all others?> 
 

My most recent 'standard' answer these days is "it worked well before VACUUM 
ANALYZE" :-)

RADIUS=# explain
select * from attrib where user_name = 'Paacons'RADIUS-# ;
NOTICE:  QUERY PLAN:

Seq Scan on attrib  (cost=0.00..16978.46 rows=17922 width=48)

EXPLAIN

is what explain says by default.

RADIUS=# set enable_seqscan='off';
SET VARIABLE

RADIUS=# explain
select * from attrib where user_name = 'Paacons';
NOTICE:  QUERY PLAN:

Index Scan using uattr on attrib  (cost=0.00..32861.00 rows=17922 width=48)

EXPLAIN

Daniel



Re: again on index usage (7.1.3)

From
Stephan Szabo
Date:
On Tue, 12 Feb 2002, Daniel Kalchev wrote:

> >>>Stephan Szabo said:
>  >
>  > Let's start with the standard set of things.  Have you vacuum analyzed,
>  > what does explain show for the query, is there one value that is more
>  > common than all others?
>  >
>
> My most recent 'standard' answer these days is "it worked well before VACUUM
> ANALYZE" :-)

Do you have a single value that is much more common than the rest (say
approximately 170000 of the rows?)  It's estimating almost 18000 matching
rows, but I'm guessing that that's not a reasonable estimate.

>
> RADIUS=# explain
> select * from attrib where user_name = 'Paacons'RADIUS-# ;
> NOTICE:  QUERY PLAN:
>
> Seq Scan on attrib  (cost=0.00..16978.46 rows=17922 width=48)
>
> EXPLAIN
>
> is what explain says by default.
>
> RADIUS=# set enable_seqscan='off';
> SET VARIABLE
>
> RADIUS=# explain
> select * from attrib where user_name = 'Paacons';
> NOTICE:  QUERY PLAN:
>
> Index Scan using uattr on attrib  (cost=0.00..32861.00 rows=17922 width=48)
>
> EXPLAIN
>
> Daniel
>



Re: again on index usage (7.1.3)

From
Tom Lane
Date:
Daniel Kalchev <daniel@digsys.bg> writes:
> However, there is indeed an user_name entry with 179225 values! Somehow on 
> these rows user_name is ''.

> Time to change my standard answer...

No, time to update to 7.2.  7.2 doesn't get fooled by single values that
are vastly more common than anything else.
        regards, tom lane