Thread: Statistics on a table

Statistics on a table

From
"Maksim Likharev"
Date:
Hi,
I just found very interesting situation,
statistic ( n_distinct in particular) records for one of my columns,
greatly under calculated it saying:

49726, but in reality 33409816.

So planer never choose index but rather using table scan, and query
never returns,
is it any way how I can improve that?

I can turn seqscan off but is it safe?

Re: Statistics on a table

From
Stephan Szabo
Date:
On Tue, 8 Jul 2003, Maksim Likharev wrote:

> Hi,
> I just found very interesting situation,
> statistic ( n_distinct in particular) records for one of my columns,
> greatly under calculated it saying:
>
> 49726, but in reality 33409816.
>
> So planer never choose index but rather using table scan, and query
> never returns,
> is it any way how I can improve that?

You might see if raising the statistics target with ALTER TABLE ALTER
COLUMN followed by an analyze helps getting reasonable values any.  Maybe
try a few thousand?

> I can turn seqscan off but is it safe?

It's somewhat of a large hammer, especially if you're doing joins with
this table or something.


Re: Statistics on a table

From
Tom Lane
Date:
Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
>> statistic ( n_distinct in particular) records for one of my columns,
>> greatly under calculated it saying:
>> 49726, but in reality 33409816.
>> is it any way how I can improve that?

> You might see if raising the statistics target with ALTER TABLE ALTER
> COLUMN followed by an analyze helps getting reasonable values any.  Maybe
> try a few thousand?

That seems like a big jump, considering the default is only 10.  Try
setting it to 100 (then re-ANALYZE, then check the plan).  If that
doesn't work, try more.  I'd be interested to see what it takes to
get the estimate closer to reality.

            regards, tom lane

Re: Statistics on a table

From
"Maksim Likharev"
Date:
Tried 100 did not help, it seems like n_distinct grows 10000 per 10
statistic,
in my case to be close to reality I have to put 1000.
Is there any complications of that, I mean growing statistic depth other
then
disk space?


-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Tuesday, July 08, 2003 11:33 PM
To: Stephan Szabo
Cc: Maksim Likharev; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Statistics on a table


Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
>> statistic ( n_distinct in particular) records for one of my columns,
>> greatly under calculated it saying:
>> 49726, but in reality 33409816.
>> is it any way how I can improve that?

> You might see if raising the statistics target with ALTER TABLE ALTER
> COLUMN followed by an analyze helps getting reasonable values any.
Maybe
> try a few thousand?

That seems like a big jump, considering the default is only 10.  Try
setting it to 100 (then re-ANALYZE, then check the plan).  If that
doesn't work, try more.  I'd be interested to see what it takes to
get the estimate closer to reality.

            regards, tom lane