Thread: again on index usage (7.1.3)
I had one more frustrating exprience with the 7.1.3 optimizer handling index/scan selection. Here is the schema RADIUS=# \d attrib Table "attrib"Attribute | Type | Modifier -----------+----------------+---------------------user_name | character(32) | not null default ''attr | character(32) | not null default ''value | character(128) | op | character(2) | Index: uattr RADIUS=# \d uattr Index "uattr"Attribute | Type -----------+---------------user_name | character(32)attr | character(32)op | character(2) btree (this is for use by gnu-radius). RADIUS=# select count(*) from attrib;count --------396117 (1 row) RADIUS=# select count(distinct user_name) from attrib;count -------62713 (1 row) each username has more or less the same number of attributes. SELECT * FROM attrib WHERE user_name = 'xyz'; always results in sequential scan. As you can see, there is sufficient number of different user_name values - why the sequential scan? Needless to say that turning off sequential scans results is measurably faster index scan. Daniel
On Tue, 12 Feb 2002, Daniel Kalchev wrote: > I had one more frustrating exprience with the 7.1.3 optimizer handling > index/scan selection. > > Here is the schema > > RADIUS=# \d attrib > Table "attrib" > Attribute | Type | Modifier > -----------+----------------+--------------------- > user_name | character(32) | not null default '' > attr | character(32) | not null default '' > value | character(128) | > op | character(2) | > Index: uattr > > RADIUS=# \d uattr > Index "uattr" > Attribute | Type > -----------+--------------- > user_name | character(32) > attr | character(32) > op | character(2) > btree > > > (this is for use by gnu-radius). > > RADIUS=# select count(*) from attrib; > count > -------- > 396117 > (1 row) > > RADIUS=# select count(distinct user_name) from attrib; > count > ------- > 62713 > (1 row) > > > each username has more or less the same number of attributes. > > SELECT * FROM attrib WHERE user_name = 'xyz'; > > always results in sequential scan. > > As you can see, there is sufficient number of different user_name values - why > the sequential scan? > > Needless to say that turning off sequential scans results is measurably faster > index scan. Let's start with the standard set of things. Have you vacuum analyzed, what does explain show for the query, is there one value that is more common than all others?
>>>Stephan Szabo said:> Do you have a single value that is much more common than the rest (say> approximately 170000 of therows?) It's estimating almost 18000 matching> rows, but I'm guessing that that's not a reasonable estimate. Not likely. There are no more than 30-40 attributes per user (this is another story, but I discuss there the PostgreSQL aspect, not RADIUS :). Entries with most rows have up to 35 rows. However, there is indeed an user_name entry with 179225 values! Somehow on these rows user_name is ''. Time to change my standard answer... Daniel
>>>Stephan Szabo said:> > Let's start with the standard set of things. Have you vacuum analyzed,> what does explain showfor the query, is there one value that is more> common than all others?> My most recent 'standard' answer these days is "it worked well before VACUUM ANALYZE" :-) RADIUS=# explain select * from attrib where user_name = 'Paacons'RADIUS-# ; NOTICE: QUERY PLAN: Seq Scan on attrib (cost=0.00..16978.46 rows=17922 width=48) EXPLAIN is what explain says by default. RADIUS=# set enable_seqscan='off'; SET VARIABLE RADIUS=# explain select * from attrib where user_name = 'Paacons'; NOTICE: QUERY PLAN: Index Scan using uattr on attrib (cost=0.00..32861.00 rows=17922 width=48) EXPLAIN Daniel
On Tue, 12 Feb 2002, Daniel Kalchev wrote: > >>>Stephan Szabo said: > > > > Let's start with the standard set of things. Have you vacuum analyzed, > > what does explain show for the query, is there one value that is more > > common than all others? > > > > My most recent 'standard' answer these days is "it worked well before VACUUM > ANALYZE" :-) Do you have a single value that is much more common than the rest (say approximately 170000 of the rows?) It's estimating almost 18000 matching rows, but I'm guessing that that's not a reasonable estimate. > > RADIUS=# explain > select * from attrib where user_name = 'Paacons'RADIUS-# ; > NOTICE: QUERY PLAN: > > Seq Scan on attrib (cost=0.00..16978.46 rows=17922 width=48) > > EXPLAIN > > is what explain says by default. > > RADIUS=# set enable_seqscan='off'; > SET VARIABLE > > RADIUS=# explain > select * from attrib where user_name = 'Paacons'; > NOTICE: QUERY PLAN: > > Index Scan using uattr on attrib (cost=0.00..32861.00 rows=17922 width=48) > > EXPLAIN > > Daniel >
Daniel Kalchev <daniel@digsys.bg> writes: > However, there is indeed an user_name entry with 179225 values! Somehow on > these rows user_name is ''. > Time to change my standard answer... No, time to update to 7.2. 7.2 doesn't get fooled by single values that are vastly more common than anything else. regards, tom lane