Thread: Recommendations for set statistics
Hello,

I could not find any recommendations for the level of SET STATISTICS and what a specific level actually means. What is the difference between 1, 50 and 100? What is recommended for a table or column?

--
Kind Regards / Viele Grüße

Sebastian Hennebrueder
-----
http://www.laliluna.de/tutorials.html
Tutorials for Java, Struts, JavaServer Faces, JSP, Hibernate, EJB and more.
Sebastian Hennebrueder wrote:
> Hello,
>
> I could not find any recommendations for the level of SET STATISTICS and
> what a specific level actually means.
> What is the difference between 1, 50 and 100? What is recommended for a
> table or column?

The default, I believe, is 10. The higher the number, the more statistics are kept, with a maximum of 1000. The default is a little low for columns used in foreign keys, though frequently it is okay. When problems start, try setting them to 100 or 200. Higher is more accurate, but takes longer to compute, *and* takes longer when planning the optimal query method. It can be worth it, though.

John
=:->
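For reference, the per-column setting described above can be applied like this. The table and column names are made up for illustration, and a fresh ANALYZE is needed before the planner sees the new target:

```sql
-- Raise the statistics target for a hypothetical foreign-key column
ALTER TABLE orders ALTER COLUMN customer_id SET STATISTICS 100;

-- Recompute statistics so the planner uses the larger histogram
ANALYZE orders;
```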
After a long battle with technology, usenet@laliluna.de (Sebastian Hennebrueder), an earthling, wrote:
> I could not find any recommendations for the level of SET STATISTICS
> and what a specific level actually means.
> What is the difference between 1, 50 and 100? What is recommended for
> a table or column?

The numbers represent the number of "bins" used to build the histograms that estimate how the data is distributed. The default is 10 bins, and 300 rows are sampled at ANALYZE time per bin. A setting of 1 would probably be rather bad, having very little ability to express the distribution of the data. 100 bins would be 10x as expensive to store as 10, but would provide a much better picture of the distribution.

It is widely believed that a somewhat larger default than 10 would be a "good thing," as it seems fairly common for 10 to be too small for the statistics to be stable. But nobody has done any formal evaluation as to whether it would make sense to jump from 10 to:

- 15?
- 20?
- 50?
- 100?
- More than that?

If we could show that 90% of the present "wrong results" that come from the default of 10 could be addressed by an increase to 20 bins, and the remainder could be left to individual tuning, then getting rid of 90% of the "query plan errors" would seem worthwhile. I'd hope that a moderate increase (e.g. from 10 to 20), which would be pretty cheap, would help a fair bit, but there is no evidence one way or the other.

Unfortunately, nobody has come up with a decent way of evaluating how much good a change to the default would actually do. If you can discover an evaluation scheme, your results are likely to get an ear.

--
"cbbrowne","@","gmail.com"
http://linuxdatabases.info/info/lsf.html
"In 1555, Nostradamus wrote: 'Come the millennium, month 12, in the home of greatest power, the village idiot will come forth to be acclaimed the leader.'"
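As a sketch of how to see those bins directly: after an ANALYZE, the `pg_stats` view exposes the histogram boundaries the planner works from (the table and column names here are hypothetical):

```sql
-- histogram_bounds lists the bin edges; more bins means finer-grained
-- selectivity estimates for range and equality queries
SELECT attname, n_distinct, histogram_bounds
FROM pg_stats
WHERE tablename = 'orders'
  AND attname = 'customer_id';
```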
Chris,

> It is widely believed that a somewhat larger default than 10 would be
> a "good thing," as it seems to be fairly common for 10 to be too small
> to allow statistics to be stable. But nobody has done any formal
> evaluation as to whether it would make sense to jump from 10 to:
>
> - 15?
> - 20?
> - 50?
> - 100?
> - More than that?

My anecdotal experience is that if more than 10 is required, you generally need to jump to at least 100, and more often 250. On the other end, I've generally not found any difference between 400 and 1000 when it comes to "bad" queries.

I have an unfinished patch in the works which goes through and increases the stats_target for all *indexed* columns to 100 or so. However, I still need to work up a test case to prove its utility.

--
Josh Berkus
Aglio Database Solutions
San Francisco
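Until such a patch exists, something similar can be approximated by hand, generating the ALTER statements from the system catalogs. This is only a sketch (the target of 100 matches the intent above; exact catalog usage may vary between versions):

```sql
-- Emit one ALTER statement per indexed column of an ordinary table;
-- the generated statements can then be reviewed and executed
SELECT 'ALTER TABLE ' || quote_ident(c.relname)
    || ' ALTER COLUMN ' || quote_ident(a.attname)
    || ' SET STATISTICS 100;'
FROM pg_index i
JOIN pg_class c ON c.oid = i.indrelid
JOIN pg_attribute a ON a.attrelid = i.indrelid
                   AND a.attnum = ANY (i.indkey)
WHERE c.relkind = 'r';
```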