Thread: Functional Indices
Hi all,

The pg manual, chapter 7:

"For example, a common way to do case-insensitive comparisons is to use
the lower function:

SELECT * FROM test1 WHERE lower(col1) = 'value';

In order for that query to be able to use an index, it has to be defined
on the result of the lower(column) operation:

CREATE INDEX test1_lower_col1_idx ON test1 (lower(col1));"

I have a table like this:

\d titles
                 Table "titles"
 Attribute |          Type          |                   Modifier
-----------+------------------------+-----------------------------------------------
 id        | integer                | not null default nextval('titles_seq'::text)
 issn      | character(9)           | not null
 tag       | integer                | not null
 prefix    | character varying(32)  |
 title     | character varying(640) | not null
Indices: issn,
         prefix,
         tag,

create index lower_title on titles (lower(title));
vacuum analyze;
...
explain select * from titles where lower(title) = 'monde';
Seq Scan on titles  (cost=0.00..39392.10 rows=14145 width=44)

Why does it not use the index?

PGSQL 7.1.1 on Suse Linux 7.1

thx
kavoos writes:

> create index lower_title on titles (lower(title));
> vacuum analyze;
> ...
> explain select * from titles where lower(title) = 'monde';
> Seq Scan on titles  (cost=0.00..39392.10 rows=14145 width=44)
>
> Why does it not use the index?

The planner estimates that this query returns 14145 rows.  If this is more
than a small fraction of the rows in this table then a sequential scan is
better than an index scan.

The questions are, how many rows does this query really return, and how
many rows are in this table.  Btw., this is discussed in the manual under
performance tips.

--
Peter Eisentraut   peter_e@gmx.net   http://funkturm.homeip.net/~peter
On Mon, 21 May 2001, kavoos wrote:

> Hi all,
>
> The pg manual, chapter 7:
>
> "For example, a common way to do case-insensitive comparisons is to use
> the lower function:
>
> SELECT * FROM test1 WHERE lower(col1) = 'value';
>
> In order for that query to be able to use an index, it has to be defined
> on the result of the lower(column) operation:
>
> CREATE INDEX test1_lower_col1_idx ON test1 (lower(col1));"
>
> I have a table like this:
>
> \d titles
>                  Table "titles"
>  Attribute |          Type          |                   Modifier
> -----------+------------------------+-----------------------------------------------
>  id        | integer                | not null default nextval('titles_seq'::text)
>  issn      | character(9)           | not null
>  tag       | integer                | not null
>  prefix    | character varying(32)  |
>  title     | character varying(640) | not null
> Indices: issn,
>          prefix,
>          tag,
>
> create index lower_title on titles (lower(title));
> vacuum analyze;
> ...
> explain select * from titles where lower(title) = 'monde';
> Seq Scan on titles  (cost=0.00..39392.10 rows=14145 width=44)

How many rows are in titles?  It seems to estimate 14000+ rows are going
to match.  If that's true, a sequential scan may be a better plan than the
index.  Or, perhaps, do you have a very common title value that's throwing
off the statistics?
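You can check both numbers with something like:

  select count(*) from titles;
  select count(*) from titles where lower(title) = 'monde';

If the second count is only a handful of rows rather than 14000+, then it's
the estimate that's off, not the choice of plan given that estimate.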
On Tue, May 22, 2001 at 09:43:02PM +0200, Peter Eisentraut wrote:
> kavoos writes:
>
> > create index lower_title on titles (lower(title));
> > vacuum analyze;
> > ...
> > explain select * from titles where lower(title) = 'monde';
> > Seq Scan on titles  (cost=0.00..39392.10 rows=14145 width=44)
> >
> > Why does it not use the index?
>
> The planner estimates that this query returns 14145 rows.  If this is more
> than a small fraction of the rows in this table then a sequential scan is
> better than an index scan.
>
> The questions are, how many rows does this query really return, and how
> many rows are in this table.  Btw., this is discussed in the manual under
> performance tips.

In our database there are several cases where postgres incorrectly chooses
a sequential scan.  The table in question is 250MB (1.2 million rows), so a
sequential scan takes a *long* time.  The disk read speed is 13MB/s, so it
takes around 20 seconds to scan the whole table.

Now, the average number of rows per instance of the primary key is around
470, but it can go as high as 16,000.

But I know something that postgres doesn't.  The data is clustered somewhat
around the id we're comparing on.  There is a "runlength" involved.  Thus,
when doing an index scan, once it has read in the first tuple of a run there
will be more just sitting in the cache at basically zero cost.

I wrote a program to get the distribution of runlengths for each field in
the table.  What it found was that for only 0.6% of rows did that ID appear
alone.  Over 95% of rows are in runs of 10 or more; 80% are in runs of 50 or
more.  This means that Postgres vastly overestimates the cost of an index
scan.  That's just one example.  We have one column where 80% of rows are in
runs of 200 or more.

No, I'm not clustering the data intentionally.  It's just a product of the
way our system processes data.

Currently I work around this by fiddling enable_seqscan in strategic places,
but that's a blunt instrument.  It affects all tables, not just that one.
It occurred to me to edit the values in pg_statistic directly as well.

If anyone's interested, I can post the program and the actual output.

--
Martijn van Oosterhout <kleptog@svana.org>
http://svana.org/kleptog/
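P.S. To be concrete, the "fiddling" is just wrapping the affected queries
like this (sketch only; the table and column names here are placeholders,
not our real schema):

  SET enable_seqscan TO off;
  SELECT * FROM mytable WHERE somekey = 'xxxx';
  SET enable_seqscan TO on;

which forces the index scan for that one query, but also disables seqscans
for anything else that runs in the same session between the two SETs.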
Martijn van Oosterhout <kleptog@svana.org> writes:
> But I know something that postgres doesn't.  The data is clustered somewhat
> around the id we're comparing on.  There is a "runlength" involved.  Thus,
> when doing an index scan, once it has read in the first tuple of a run there
> will be more just sitting in the cache at basically zero cost.

Hm.  There is code in development sources that will notice data
clustering --- actually what it looks at is the correlation between
physical tuple order and logical order of the values.  I'm not sure
whether it would do the right thing if you have runs of identical keys
but no overall ordering of the keys, however.

> Currently I work around this by fiddling enable_seqscan in strategic places,
> but that's a blunt instrument.

Yup, it sure is.  Can you propose a statistical measurement that would
cope with this situation?

			regards, tom lane
On Wed, May 23, 2001 at 10:48:35AM -0400, Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
> > But I know something that postgres doesn't.  The data is clustered somewhat
> > around the id we're comparing on.  There is a "runlength" involved.  Thus,
> > when doing an index scan, once it has read in the first tuple of a run there
> > will be more just sitting in the cache at basically zero cost.
>
> Hm.  There is code in development sources that will notice data
> clustering --- actually what it looks at is the correlation between
> physical tuple order and logical order of the values.  I'm not sure
> whether it would do the right thing if you have runs of identical keys
> but no overall ordering of the keys, however.

That wouldn't help much in this case, as outside of the actual grouping the
keys are not sorted against each other.

> > Currently I work around this by fiddling enable_seqscan in strategic places,
> > but that's a blunt instrument.
>
> Yup, it sure is.  Can you propose a statistical measurement that would
> cope with this situation?

I was thinking "average runlength".  If this were 10, for example, when it
came to calculating the cost of the index scan, it would divide the
per-tuple cost by 10.

You can go with the simple calculation method, which would count the number
of times the value in a column was different from the previous value, then
divide that into the total number of tuples.  That's not difficult to
implement.  I was actually thinking of editing the ANALYZE code to calculate
this value, but after looking at the code I decided I would need to study
the codebase a bit first.

There is another way which would allow you to calculate a value that you
can tune.  It works something like this (this is for only one column):

  counter = 1
  lastval = tuple[1].value
  foreach remaining tuple
      if tuple.value != lastval      # a new run starts here
          histogram[counter]++       # record the length of the run just ended
          lastval = tuple.value
          counter = 1
      else
          counter++                  # still inside the current run
  histogram[counter]++               # record the final run as well

You then have a histogram of runlengths.  Then, from the highest runlength
to the lowest, do a cumulative sum of the number of tuples compared against
the total.  You can choose where to take the value: at the 50% mark, the
80% mark, or even 95%.

Note that I have absolutely no theory to base this on, only that it seems
silly not to take advantage of any clustering present in the data.

For a perl script to calculate the histogram, go to:

  http://svana.org/kleptog/getcoherency.pl

Use like:  ./getcoherency dbname tablename fieldnames... > output

For example output, go to:

  http://svana.org/kleptog/output.txt

Hope this helps,
--
Martijn van Oosterhout <kleptog@svana.org>
http://svana.org/kleptog/
- Every night you must rediscover the secret to sleep, -
  only to forget it moments later...
Martijn van Oosterhout <kleptog@svana.org> writes:
> I was thinking "average runlength".  If this were 10, for example, when it
> came to calculating the cost of the index scan, it would divide the
> per-tuple cost by 10.

> You can go with the simple calculation method, which would count the number
> of times the value in a column was different from the previous value, then
> divide that into the total number of tuples.  That's not difficult to
> implement.

Unfortunately, it is difficult to implement, in fact impossible, given
the new sampling-based implementation of ANALYZE.  You could only
discover that runs of identical keys exist if you were willing to
examine every row, not just a statistical sample.

Since this seems a rather specialized situation, I'm not eager to pay
that high a price to recognize it ...

			regards, tom lane
On Wed, May 23, 2001 at 01:22:41PM -0400, Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
> > I was thinking "average runlength".  If this were 10, for example, when it
> > came to calculating the cost of the index scan, it would divide the
> > per-tuple cost by 10.
>
> > You can go with the simple calculation method, which would count the number
> > of times the value in a column was different from the previous value, then
> > divide that into the total number of tuples.  That's not difficult to
> > implement.
>
> Unfortunately, it is difficult to implement, in fact impossible, given
> the new sampling-based implementation of ANALYZE.  You could only
> discover that runs of identical keys exist if you were willing to
> examine every row, not just a statistical sample.

Ouch!  You're right.  With such an implementation it's impossible.

> Since this seems a rather specialized situation, I'm not eager to pay
> that high a price to recognize it ...

I'm not sure how common this is (long runs in a foreign key column) and it's
probably not worth it in the general case.  So, is there a column in
pg_statistic where I can twiddle the per-tuple index-scan cost?  If so, my
own program can fill in the value.  In my case, 2 hours spent scanning at
4am is worth 20 seconds per query during the day.

I suppose it's unlikely that there will be a VACUUM ANALYZE EVERYTHING?
Doesn't matter, I guess.  Fiddling enable_seqscan makes everything mostly
right.  We'd get better results with partial indexes anyway, I think.
Maybe I should look at that some more.

Anyway, thanks for listening.

--
Martijn van Oosterhout <kleptog@svana.org>
http://svana.org/kleptog/
Martijn van Oosterhout <kleptog@svana.org> writes:
> I'm not sure how common this is (long runs in a foreign key column) and it's
> probably not worth it in the general case.  So, is there a column in
> pg_statistic where I can twiddle the per-tuple index-scan cost?

You could stick a phony value into the correlation datum.

> I suppose it's unlikely that there will be a VACUUM ANALYZE EVERYTHING?

The current code wants to see sorted samples.  You could feed it a
complete sorted input for moderate-sized tables, but this doesn't sound
like a recipe that scales...

> We'd get better results with partial indexes anyway, I think.

I'd like to see the partial-index support cranked up again, for sure.
But how does that solve your problem?  I don't see the connection.

			regards, tom lane
On Wed, May 23, 2001 at 10:40:38PM -0400, Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
> > I'm not sure how common this is (long runs in a foreign key column) and it's
> > probably not worth it in the general case.  So, is there a column in
> > pg_statistic where I can twiddle the per-tuple index-scan cost?
>
> You could stick a phony value into the correlation datum.

Ah, that would do it.  Would need to experiment.  Is this in 7.1 or 7.2?

> > We'd get better results with partial indexes anyway, I think.
>
> I'd like to see the partial-index support cranked up again, for sure.
> But how does that solve your problem?  I don't see the connection.

Because the queries are of the form: ID1 = 'xxxx' and ID2 is null.  So,
improving the index scan would make it use the index for the first clause
and scan for the second (nulls don't appear in indices, right?).

With a partial index on the "ID2 is null" clause it could scan that index
and look for tuples which match the first.  As indicated in that paper
linked to from the documentation, it also gives a hint to the database
about where most of the queries are likely to be directed.

The first clause matches about 0.04% of rows, the second about 5%.  Most of
the time it's fine, but when you start summarising data, so that you need
to scan many values of ID1, it decides to start using a sequential scan.
Look at the estimates for the sequential and index scans:

Seq Scan on dailycalls  (cost=0.00..46352.20 rows=1630 width=139)

Index Scan using dailycalls_clid on dailycalls  (cost=0.00..6279.75 rows=1630 width=139)

Yet I can tell you that empirical evidence suggests that the index scan is
at least 50 times faster than the sequential scan (10 seconds down to less
than 0.2 seconds).

Does the planner take into account that, since the total size of the
database is less than the total amount of memory, a lot of the database is
likely to be cached?

Talking about statistics, is there anywhere that counts the number of times
an index has been used?  So I can check to see if all my indices are
worthwhile.

Thanks for listening,
--
Martijn van Oosterhout <kleptog@svana.org>
http://svana.org/kleptog/
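P.S. For reference, the partial index I have in mind would be declared
something like this (just a sketch, not tested; the index name is made up
and the columns follow the ID1/ID2 shorthand above):

  CREATE INDEX dailycalls_id2_null ON dailycalls (id1)
      WHERE id2 IS NULL;

so the index would only contain the ~5% of rows where ID2 is null, and the
planner could use it for queries that include the "ID2 is null" clause.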
Martijn van Oosterhout <kleptog@svana.org> writes:
>> You could stick a phony value into the correlation datum.

> Ah, that would do it.  Would need to experiment.  Is this in 7.1 or 7.2?

Current development sources (7.2-to-be).

>> I'd like to see the partial-index support cranked up again, for sure.
>> But how does that solve your problem?  I don't see the connection.

> Because the queries are of the form: ID1 = 'xxxx' and ID2 is null.  So,
> improving the index scan would make it use the index for the first clause
> and scan for the second (nulls don't appear in indices, right?).

btrees do actually store nulls, but no one's gotten around to fixing the
rest of the index machinery to make an IS NULL clause indexable.  I don't
think there's any reason it couldn't work, just needs someone to go through
the code and make it happen.

> With a partial index on the "ID2 is null" clause it could scan that index
> and look for tuples which match the first.

Hm.  The old index-predicate-check code is pretty limited as to what it can
recognize.  Again, I'm not sure that an IS NULL clause would be recognized
as implying the partial-index condition --- at least not without work.  But
hey, if you can fix the partial-index code at all, that shouldn't stop you ;-)

Offhand I'd say that teaching the index machinery to allow IS [NOT] NULL to
be indexable would be a better answer to your problem for less work.

A short-term answer would be to add a boolean column that could be used in
place of the NULL test on ID2, and to index (ID1, boolean) in place of what
you're doing now.

> Does the planner take into account that, since the total size of the
> database is less than the total amount of memory, a lot of the database is
> likely to be cached?

It tries.  Whether the cost model for this has anything to do with the real
behavior of your system is another question.

> Talking about statistics, is there anywhere that counts the number of times
> an index has been used?  So I can check to see if all my indices are
> worthwhile.

Not at the moment, but I think Jan has some plans for a statistics module
that might be able to do that.

			regards, tom lane
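PS: by the short-term answer I mean something along these lines (a sketch
only; the id2_isnull column and index names are ones I just made up, and
you'd need whatever loads the data to keep the new column maintained):

  ALTER TABLE dailycalls ADD COLUMN id2_isnull boolean;
  UPDATE dailycalls SET id2_isnull =
      CASE WHEN id2 IS NULL THEN TRUE ELSE FALSE END;
  CREATE INDEX dailycalls_id1_id2null ON dailycalls (id1, id2_isnull);

and then write the queries as

  SELECT * FROM dailycalls WHERE id1 = 'xxxx' AND id2_isnull = TRUE;

so that both clauses are ordinary indexable comparisons.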
mordicus wrote:

I forgot: the titles don't have common values, or very few.

thx
Stephan Szabo wrote:

>> explain select * from titles where lower(title) = 'monde';
>> Seq Scan on titles  (cost=0.00..39392.10 rows=14145 width=44)
>
> How many rows are in titles?  It seems to estimate 14000+
> rows are going to match.  If that's true, a sequential scan may
> be a better plan than the index.  Or, perhaps, do you have
> a very common title value that's throwing off the statistics?

Hello,

register=# select count(title) from titles;
  count
---------
 1414473
(1 row)

I have solved the problem by setting enable_seqscan to false, and now it
uses the index, but I don't understand why it chose to do a seq scan.

thanks

Bojnourdi Kaikavous
On Tue, 22 May 2001, mordicus wrote:

> Stephan Szabo wrote:
>
> >> explain select * from titles where lower(title) = 'monde';
> >> Seq Scan on titles  (cost=0.00..39392.10 rows=14145 width=44)
> >
> > How many rows are in titles?  It seems to estimate 14000+
> > rows are going to match.  If that's true, a sequential scan may
> > be a better plan than the index.  Or, perhaps, do you have
> > a very common title value that's throwing off the statistics?
>
> Hello,
>
> register=# select count(title) from titles;
>   count
> ---------
>  1414473
> (1 row)
>
> I have solved the problem by setting enable_seqscan to false, and now it
> uses the index, but I don't understand why it chose to do a seq scan.

It was probably estimating the cost of the non-sequential reads to get
those 14000 rows to be larger than the sequential reads on the table.
Postgres still needs to go to the heap file for things that meet the
criteria in the index in order to see if the row is visible to the
current transaction.

My guess would be that:

  select count(*) from titles where lower(title) = 'monde'

returns something much lower than 14000, is that correct?
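If so, you can see where things go wrong by comparing the real count with
the two plans, something like:

  select count(*) from titles where lower(title) = 'monde';

  set enable_seqscan to off;
  explain select * from titles where lower(title) = 'monde';
  set enable_seqscan to on;

The second explain should show the index scan and its estimated cost.  If
the real row count is far below 14145, it's the estimate that is leading
the planner astray, not the plan choice it makes for the numbers it has.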