Thread: index usage (and foreign keys/triggers)
Hi gurus et al ;)

I have a database with a couple of hundred tables. In this database one
table, "person", represents a user. person.userid is a primary key. To
this key there are about a hundred foreign keys spread out over
approximately equally many tables. When deleting a user I noticed quite a
big difference in time depending on how much data there is in the
foreign-key tables. After some investigation I concluded that for some
users and some tables the indices weren't used when deleting, resulting
in longer run-times.

Here's an example:

select count(*) from login;
 count
-------
 96824

select count(*) from login where userid = 'patrik';
 count
-------
   608

select count(*) from login where userid = 'jennie';
 count
-------
  4211

explain delete from login where userid = 'patrik';
                                    QUERY PLAN
---------------------------------------------------------------------------------
 Index Scan using login_userid_idx on login  (cost=0.00..237.06 rows=61 width=6)
   Index Cond: (userid = 'patrik'::text)

explain delete from login where userid = 'jennie';
                         QUERY PLAN
-----------------------------------------------------------
 Seq Scan on login  (cost=0.00..2045.30 rows=3421 width=6)
   Filter: (userid = 'jennie'::text)

What makes the planner choose a seq scan for 'jennie', but not for
'patrik'? I also tested the following:

delete from login where userid = 'jennie';
DELETE 4211
Time: 508.94 ms

set enable_seqscan = false;

delete from login where userid = 'jennie';
DELETE 4211
Time: 116.92 ms

As you can see the index scan is almost 5 times faster, but still
postgres chooses to seq scan... Am I doing something wrong or is
postgres being stupid on me?

select version();
                              version
-------------------------------------------------------------------
 PostgreSQL 7.3 on i386-portbld-freebsd4.7, compiled by GCC 2.95.4

I tried the same thing on a similar database running on 7.3.2 with the
same results.

Regards,
Patrik Kudo
On Wed, 26 Feb 2003, Patrik Kudo wrote:

> Hi gurus et al ;)
>
> I have a database with a couple of hundred tables. In this database one
> table, "person", represents a user. person.userid is a primary key. To
> this key there are about a hundred foreign keys spread out over
> approximately equally many tables. When deleting a user I noticed quite a
> big difference in time depending on how much data there is in the
> foreign-key tables. After some investigation I concluded that for some
> users and some tables the indices weren't used when deleting, resulting
> in longer run-times.
>
> Here's an example:
>
> select count(*) from login;
>  count
> -------
>  96824
>
> select count(*) from login where userid = 'patrik';
>  count
> -------
>    608
>
> select count(*) from login where userid = 'jennie';
>  count
> -------
>   4211
>
> explain delete from login where userid = 'patrik';
>                                     QUERY PLAN
> ---------------------------------------------------------------------------------
>  Index Scan using login_userid_idx on login  (cost=0.00..237.06 rows=61 width=6)
>    Index Cond: (userid = 'patrik'::text)
>
> explain delete from login where userid = 'jennie';
>                          QUERY PLAN
> -----------------------------------------------------------
>  Seq Scan on login  (cost=0.00..2045.30 rows=3421 width=6)
>    Filter: (userid = 'jennie'::text)
>
> What makes the planner choose a seq scan for 'jennie', but not for
> 'patrik'? I also tested the following:

Well, at 3421 of 96824 it's estimating that the cost is lower. What does
the explain look like with seqscan turned off? (My guess is it's a
slightly higher cost.) It's possible that random_page_cost should
be lower, or perhaps there's some level of clustering in the data that's
not being picked up. You might want to try raising the number of
statistics buckets and re-analyzing, just to see if that helps.
Stephan Szabo wrote:
>> explain delete from login where userid = 'jennie';
>>                          QUERY PLAN
>> -----------------------------------------------------------
>>  Seq Scan on login  (cost=0.00..2045.30 rows=3421 width=6)
>>    Filter: (userid = 'jennie'::text)
>>
>
> Well, at 3421 of 96824 it's estimating that the cost is lower. What does
> the explain look like with seqscan turned off? (My guess is it's a
> slightly higher cost.) It's possible that random_page_cost should

Yepp! You're right. The cost is higher:

set enable_seqscan to off;
explain delete from login where userid = 'jennie';
                                      QUERY PLAN
------------------------------------------------------------------------------------
 Index Scan using login_userid_idx on login  (cost=0.00..3363.71 rows=4131 width=6)
   Index Cond: (userid = 'jennie'::text)

If I lower random_page_cost to about 2, the index is used instead of a
seq scan. Is it reasonable to have such a setting on a production
server? random_page_cost = 2 is good for this particular query, but
could it have a negative effect on other queries?

> be lower, or perhaps there's some level of clustering in the data that's
> not being picked up. You might want to try raising the number of
> statistics buckets and re-analyzing, just to see if that helps.

I'm afraid I'm a bit too new at this kind of tweaking... Do you mean
"default_statistics_target"? In that case I tried to raise it from the
default 10 to as high as 45, but without any other result than vacuum
analyze being slower. Did I understand your suggestion right?

Regards,
Patrik Kudo
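[A minimal sketch of trying the setting at the session level first, so
nothing in postgresql.conf has to change while experimenting. The table,
userid and values come from the messages above; 4 is the default that is
restored afterwards:

  SET random_page_cost = 2;    -- affects only this session
  EXPLAIN DELETE FROM login WHERE userid = 'jennie';
  SET random_page_cost = 4;    -- back to the default

If the cheaper index-scan plan shows up consistently for the queries that
matter, the value can then be made permanent in postgresql.conf, which is
the kind of testing suggested in the reply that follows.]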
On Wed, 26 Feb 2003, Patrik Kudo wrote:

> Stephan Szabo wrote:
> >> explain delete from login where userid = 'jennie';
> >>                          QUERY PLAN
> >> -----------------------------------------------------------
> >>  Seq Scan on login  (cost=0.00..2045.30 rows=3421 width=6)
> >>    Filter: (userid = 'jennie'::text)
> >>
> >
> > Well, at 3421 of 96824 it's estimating that the cost is lower. What does
> > the explain look like with seqscan turned off? (My guess is it's a
> > slightly higher cost.) It's possible that random_page_cost should
>
> Yepp! You're right. The cost is higher:
>
> set enable_seqscan to off;
> explain delete from login where userid = 'jennie';
>                                       QUERY PLAN
> ------------------------------------------------------------------------------------
>  Index Scan using login_userid_idx on login  (cost=0.00..3363.71 rows=4131 width=6)
>    Index Cond: (userid = 'jennie'::text)
>
> If I lower random_page_cost to about 2, the index is used instead of a
> seq scan. Is it reasonable to have such a setting on a production
> server? random_page_cost = 2 is good for this particular query, but
> could it have a negative effect on other queries?

It's possible, since it might make other queries use an index when the
sequential scan is better. It's probably worth doing some testing with
the setting.

> > be lower, or perhaps there's some level of clustering in the data that's
> > not being picked up. You might want to try raising the number of
> > statistics buckets and re-analyzing, just to see if that helps.
>
> I'm afraid I'm a bit too new at this kind of tweaking... Do you mean
> "default_statistics_target"? In that case I tried to raise it from the
> default 10 to as high as 45, but without any other result than vacuum
> analyze being slower. Did I understand your suggestion right?

I'd thought about doing it with ALTER TABLE ALTER COLUMN SET STATISTICS,
but I would think it would probably have worked with the default as
well.

Is it possible that the data has local clustering on the field (many
rows with the same value stuck together) while not being terribly
ordered overall? That's a case that the statistics don't really cover
right now (there have been some discussions of this in the past).
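[A minimal sketch of the per-column variant mentioned just above, using
the thread's table and column; the target of 100 is only an illustrative
value (the default in this era is 10):

  -- collect more detailed statistics for login.userid only
  ALTER TABLE login ALTER COLUMN userid SET STATISTICS 100;
  -- re-sample so the planner actually sees the larger statistics
  ANALYZE login;

Because the setting applies to a single column, the rest of the database
is analyzed exactly as before, avoiding the "vacuum analyze being slower"
side effect of raising default_statistics_target globally.]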
On Wed, 26 Feb 2003, Patrik Kudo wrote:

> Hi gurus et al ;)
>
> I have a database with a couple of hundred tables. In this database one
> table, "person", represents a user. person.userid is a primary key. To
> this key there are about a hundred foreign keys spread out over
> approximately equally many tables. When deleting a user I noticed quite a
> big difference in time depending on how much data there is in the
> foreign-key tables. After some investigation I concluded that for some
> users and some tables the indices weren't used when deleting, resulting
> in longer run-times.
>
> Here's an example:
>
> select count(*) from login;
>  count
> -------
>  96824
>
> select count(*) from login where userid = 'patrik';
>  count
> -------
>    608
>
> select count(*) from login where userid = 'jennie';
>  count
> -------
>   4211
>
> explain delete from login where userid = 'patrik';
>                                     QUERY PLAN
> ---------------------------------------------------------------------------------
>  Index Scan using login_userid_idx on login  (cost=0.00..237.06 rows=61 width=6)
>    Index Cond: (userid = 'patrik'::text)
>
> explain delete from login where userid = 'jennie';
>                          QUERY PLAN
> -----------------------------------------------------------
>  Seq Scan on login  (cost=0.00..2045.30 rows=3421 width=6)
>    Filter: (userid = 'jennie'::text)
>
> What makes the planner choose a seq scan for 'jennie', but not for
> 'patrik'? I also tested the following:
>
> delete from login where userid = 'jennie';
> DELETE 4211
> Time: 508.94 ms
>
> set enable_seqscan = false;
>
> delete from login where userid = 'jennie';
> DELETE 4211
> Time: 116.92 ms
>
> As you can see the index scan is almost 5 times faster, but still
> postgres chooses to seq scan... Am I doing something wrong or is
> postgres being stupid on me?

Postgresql is being smart, just not smart enough.

Imagine that one of your queries was to delete 99.9% of all the tuples.
Would an index scan help then? Of course not, since you're going to
visit nearly every row in the table anyway.

The planner uses several settings to try to figure out the cost of
sequentially scanning a table versus index access, and it doesn't always
get things right.

Take a look at random_page_cost. It defaults to 4, which means that
postgresql will make its decisions on index versus seq scan assuming
that random individual pages cost 4 times as much to fetch as a
sequential scan that just happens to include them.

On most modern machines the difference in cost is very low, what with
disk caching and all. This is especially true for smaller tables that
can fit in memory. Once a table is in buffer memory, along with its
index, the random page cost will be about 1; i.e. a seq scan in memory
and an index op are both quite fast. In fact, it is possible that at
this point a random page access for certain percentages of your table
will cost you LESS than 1 in practice, because linear access in memory
yields little if any gain over random access. The only overhead is
reading the index blocks.

So, try tuning your random page cost down to somewhere between 1.0 and
2.0 for best performance on these kinds of things. Our database at work
runs on a machine with 1.5 gig of RAM, of which 800 megs is cache, and
postgresql has a 256 meg shared buffer. It generally only hits the
drives on about 5% of reads, so random_page_cost for us is set to 1.2
and works well.

Welcome to the wonderful world of performance tuning...
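[A sketch of how the "how often reads hit the drives" figure might be
checked on another server. It assumes the statistics collector is running
with block-level stats enabled (stats_block_level on releases of this
vintage) and relies on the pg_statio_user_tables view:

  SELECT relname,
         heap_blks_read,
         heap_blks_hit,
         round(100.0 * heap_blks_hit
               / (heap_blks_hit + heap_blks_read), 1) AS hit_pct
    FROM pg_statio_user_tables
   WHERE heap_blks_hit + heap_blks_read > 0
   ORDER BY heap_blks_read DESC;

The ratio only covers PostgreSQL's own shared buffers; reads satisfied by
the operating system's cache still count as heap_blks_read, so the true
fraction of physical disk reads is usually even lower.]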
Stephan Szabo wrote:
>> If I lower random_page_cost to about 2, the index is used instead of a
>> seq scan. Is it reasonable to have such a setting on a production
>> server? random_page_cost = 2 is good for this particular query, but
>> could it have a negative effect on other queries?
>
> It's possible, since it might make other queries use an index when the
> sequential scan is better. It's probably worth doing some testing with
> the setting.

Ok. I'll do some testing and see what seems to work best for us.

>>> be lower, or perhaps there's some level of clustering in the data that's
>>> not being picked up. You might want to try raising the number of
>>> statistics buckets and re-analyzing, just to see if that helps.
>>
>> I'm afraid I'm a bit too new at this kind of tweaking... Do you mean
>> "default_statistics_target"? In that case I tried to raise it from the
>> default 10 to as high as 45, but without any other result than vacuum
>> analyze being slower. Did I understand your suggestion right?
>
> I'd thought about doing it with ALTER TABLE ALTER COLUMN SET STATISTICS,
> but I would think it would probably have worked with the default as
> well.

What exactly does this setting do and how does it affect the
planner/optimizer? I couldn't find much about it in the docs.

> Is it possible that the data has local clustering on the field (many
> rows with the same value stuck together) while not being terribly
> ordered overall? That's a case that the statistics don't really cover
> right now (there have been some discussions of this in the past).

How can I find this out? A simple "select * from login" and just browse
the result, or is there any automated way to analyze it?

Thanks,
Patrik Kudo
scott.marlowe wrote:
> Postgresql is being smart, just not smart enough.
>
> Imagine that one of your queries was to delete 99.9% of all the tuples.
> Would an index scan help then? Of course not, since you're going to
> visit nearly every row in the table anyway.
>
> The planner uses several settings to try to figure out the cost of
> sequentially scanning a table versus index access, and it doesn't always
> get things right.
>
> Take a look at random_page_cost. It defaults to 4, which means that
> postgresql will make its decisions on index versus seq scan assuming
> that random individual pages cost 4 times as much to fetch as a
> sequential scan that just happens to include them.
>
> On most modern machines the difference in cost is very low, what with
> disk caching and all. This is especially true for smaller tables that
> can fit in memory. Once a table is in buffer memory, along with its
> index, the random page cost will be about 1; i.e. a seq scan in memory
> and an index op are both quite fast. In fact, it is possible that at
> this point a random page access for certain percentages of your table
> will cost you LESS than 1 in practice, because linear access in memory
> yields little if any gain over random access. The only overhead is
> reading the index blocks.
>
> So, try tuning your random page cost down to somewhere between 1.0 and
> 2.0 for best performance on these kinds of things. Our database at work
> runs on a machine with 1.5 gig of RAM, of which 800 megs is cache, and
> postgresql has a 256 meg shared buffer. It generally only hits the
> drives on about 5% of reads, so random_page_cost for us is set to 1.2
> and works well.

Thanks for a good explanation! However, a setting lower than 2 seems a
bit scary to me. Our databases are quite large due to many large
objects, in some cases around 4 GB, so all the data couldn't possibly be
cached all the time. The most frequently accessed tables, however, are
fewer and smaller and would probably easily fit into the 1-2 GB of RAM
(that's the span we usually have on the servers).
Any top-of-mind suggestions or reflections on tuning strategies? ;)

> Welcome to the wonderful world of performance tuning...

It's a jungle, but I love it! =)

Regards,
Patrik Kudo
On Thu, 27 Feb 2003, Patrik Kudo wrote:

> Stephan Szabo wrote:
> >>> be lower, or perhaps there's some level of clustering in the data that's
> >>> not being picked up. You might want to try raising the number of
> >>> statistics buckets and re-analyzing, just to see if that helps.
> >>
> >> I'm afraid I'm a bit too new at this kind of tweaking... Do you mean
> >> "default_statistics_target"? In that case I tried to raise it from the
> >> default 10 to as high as 45, but without any other result than vacuum
> >> analyze being slower. Did I understand your suggestion right?
> >
> > I'd thought about doing it with ALTER TABLE ALTER COLUMN SET STATISTICS,
> > but I would think it would probably have worked with the default as
> > well.
>
> What exactly does this setting do and how does it affect the
> planner/optimizer? I couldn't find much about it in the docs.

It changes the number of entries that get stored in the statistics
table, but IIRC in this case (potentially more importantly) it also
raises the number of rows that it grabs as its sample.

> > Is it possible that the data has local clustering on the field (many
> > rows with the same value stuck together) while not being terribly
> > ordered overall? That's a case that the statistics don't really cover
> > right now (there have been some discussions of this in the past).
>
> How can I find this out? A simple "select * from login" and just browse
> the result, or is there any automated way to analyze it?

There's probably a reasonable way to automate it, although I don't know
if anyone's done it. Technically, I think what you really want to know
is, for a given value of the search key, how many pages of the table's
heap file such rows can be found on, versus the total number of pages in
the table. You might be able to estimate the number of distinct pages
that live rows are taking with something like:

select ctid, col from table order by col;

and parsing it in perl (AFAIK the first part of the ctid column is the
page). You can get the total number of pages from vacuum verbose's
output, or an estimate (as of the last vacuum, I think) from pg_class.
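[A rough SQL-only variant of the same idea, counting how many distinct
heap pages each userid value lands on. It is written against a newer
PostgreSQL than the one in this thread and assumes a tid can be cast to
text (yielding a "(page,row)" string) and that split_part() is available;
on 7.3 the perl approach above may be the safer route:

  SELECT userid,
         count(*) AS n_rows,
         count(DISTINCT split_part(ctid::text, ',', 1)) AS heap_pages
    FROM login
   GROUP BY userid
   ORDER BY heap_pages DESC;

A value whose rows sit on only a handful of pages is cheap to reach via
the index even when its row count looks large to the planner, which is
exactly the local clustering described above; comparing heap_pages with
the table's total page count (pg_class.relpages) gives the ratio Stephan
describes.]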
On Thu, 27 Feb 2003, Patrik Kudo wrote:

> Thanks for a good explanation! However, a setting lower than 2 seems a
> bit scary to me. Our databases are quite large due to many large
> objects, in some cases around 4 GB, so all the data couldn't possibly be
> cached all the time. The most frequently accessed tables, however, are
> fewer and smaller and would probably easily fit into the 1-2 GB of RAM
> (that's the span we usually have on the servers).
> Any top-of-mind suggestions or reflections on tuning strategies? ;)

Well, my experience has been that accidentally picking an index lookup
when a sequential scan would be better may cost up to twice as much as
the seq scan, due to the slower random access. And usually these are
queries you expect to be slow anyway, like something that selects 90% of
a 10M-row table.

But making the mistake the other way can be much more costly. I've
watched queries that took 30 or more seconds with a seq scan drop to
sub-second with indexes. So don't be afraid of dropping below 2. Just
make sure it makes sense for your database.

Note that you can change random_page_cost on the fly as well, so you
could do something like:

begin;
select * from table_a;
set random_page_cost = 1.2;
select * from table_b where c = 'd';
set random_page_cost = 3.0;
...
"scott.marlowe" <scott.marlowe@ihs.com> writes: > Well, my experience has been that accidentally picking an index lookup > when a sequential scan would be better may cost up to twice as much as the > seq scan, due to the slower random access. And usually these are queries > you expect to be slow anyway, like something that selects 90% of a 10M row > table. My experience is similar. Usually if it picks an index scan when a sequential scan would be better further experimentation shows the two are roughly the same anyways. Whereas when it fails the other way it can be very very bad for an OLTP system like a web server. > But, making the mistake the other way can be much more costly. I've > watched queries that took 30 or more seconds with a seq scan drop to sub > second with indexes. So, don't be afraid about dropping below 2. However, I've just tried something new and it seems to be helping. I tried raising cpu_tuple_cost instead. It now chooses nested loops with index lookups far more aggressively, even when random_page_cost isn't insanely low. I've actually raised cpu_tuple_cost to 0.1. That's a factor of 10 higher. That seems to be what is required to get the ratio of costs between nested loops and merge joins to line up with the ratio of actual times. I wonder if other people see similar behaviour? -- greg