Thread: Statistics on a table
Hi, I just found very interesting situation, statistic ( n_distinct in particular) records for one of my columns, greatly under calculated it saying: 49726, but in reality 33409816. So planer never choose index but rather using table scan, and query never returns, is it any way how I can improve that? I can turn seqscan off but is it safe?
On Tue, 8 Jul 2003, Maksim Likharev wrote: > Hi, > I just found very interesting situation, > statistic ( n_distinct in particular) records for one of my columns, > greatly under calculated it saying: > > 49726, but in reality 33409816. > > So planer never choose index but rather using table scan, and query > never returns, > is it any way how I can improve that? You might see if raising the statistics target with ALTER TABLE ALTER COLUMN followed by an analyze helps getting reasonable values any. Maybe try a few thousand? > I can turn seqscan off but is it safe? It's somewhat of a large hammer, especially if you're doing joins with this table or something.
Stephan Szabo <sszabo@megazone23.bigpanda.com> writes: >> statistic ( n_distinct in particular) records for one of my columns, >> greatly under calculated it saying: >> 49726, but in reality 33409816. >> is it any way how I can improve that? > You might see if raising the statistics target with ALTER TABLE ALTER > COLUMN followed by an analyze helps getting reasonable values any. Maybe > try a few thousand? That seems like a big jump, considering the default is only 10. Try setting it to 100 (then re-ANALYZE, then check the plan). If that doesn't work, try more. I'd be interested to see what it takes to get the estimate closer to reality. regards, tom lane
Tried 100 did not help, it seems like n_distinct grows 10000 per 10 statistic, in my case to be close to reality I have to put 1000. Is there any complications of that, I mean growing statistic depth other then disk space? -----Original Message----- From: Tom Lane [mailto:tgl@sss.pgh.pa.us] Sent: Tuesday, July 08, 2003 11:33 PM To: Stephan Szabo Cc: Maksim Likharev; pgsql-general@postgresql.org Subject: Re: [GENERAL] Statistics on a table Stephan Szabo <sszabo@megazone23.bigpanda.com> writes: >> statistic ( n_distinct in particular) records for one of my columns, >> greatly under calculated it saying: >> 49726, but in reality 33409816. >> is it any way how I can improve that? > You might see if raising the statistics target with ALTER TABLE ALTER > COLUMN followed by an analyze helps getting reasonable values any. Maybe > try a few thousand? That seems like a big jump, considering the default is only 10. Try setting it to 100 (then re-ANALYZE, then check the plan). If that doesn't work, try more. I'd be interested to see what it takes to get the estimate closer to reality. regards, tom lane