Re: index v. seqscan for certain values - Mailing list pgsql-performance

From Tom Lane
Subject Re: index v. seqscan for certain values
Date
Msg-id 13762.1081803767@sss.pgh.pa.us
Whole thread Raw
In response to Re: index v. seqscan for certain values  ("Jeremy Dunn" <jdunn@autorevenue.com>)
Responses Re: index v. seqscan for certain values
List pgsql-performance
"Jeremy Dunn" <jdunn@autorevenue.com> writes:
> Agreed.  However, given that count(*) is a question that can be answered
> _solely_ using the index (without reference to the actual data blocks),

As Bruno noted, that is not the case in Postgres; we must visit the
table rows anyway.

> When I just tried it again with a value of 300, analyze, then run the
> query, I get a *worse* result for an estimate.  I don't understand this.

That's annoying.  How repeatable are these results --- if you do ANALYZE
over again several times, how much does the row count estimate change
each time?  (It should change somewhat, since ANALYZE is taking a random
sample, but one would like to think not a whole lot.)  Is the variance
more or less at the higher stats target?  Take a look at a few different
CID values to get a sense of the accuracy, don't look at just one ...

(Actually, you might find it more profitable to look at the pg_stats
entry for the CID column rather than reverse-engineering the stats via
ANALYZE.  Look at how well the most-common-values list and associated
frequency numbers track reality.)

Also, can you think of any reason for the distribution of CID values
to be nonuniform within the table?  For instance, do rows get inserted
in order of increasing CID, or is there any clustering of rows with the
same CID?

            regards, tom lane

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Re: Index Backward Scan fast / Index Scan slow ! (Modifié par Pailloncy Jean-Gérard)
Next
From: Tom Lane
Date:
Subject: Re: Index Backward Scan fast / Index Scan slow !