Re: Performance problem with low correlation data - Mailing list pgsql-general

From Alvaro Herrera
Subject Re: Performance problem with low correlation data
Date
Msg-id 20090709173645.GK6414@alvh.no-ip.org
In response to Re: Performance problem with low correlation data  (m_lists@yahoo.it)
List pgsql-general
m_lists@yahoo.it wrote:

> testinsert contains t values between '2009-08-01' and '2009-08-09', and ne_id from 1 to 20000. But only 800 out of 20000 ne_id have to be read; there's no need for a table scan!
> I guess this is a reflection of the poor "correlation" on ne_id; but, as I said, I don't really think ne_id is so badly correlated.
> In fact, doing a "select ne_id, t from testinsert limit 100000" I can see that data is laid out pretty much by "ne_id,t", grouped by day (that is, same ne_id for one day, then next ne_id and so on until next day).
> How is the "correlation" calculated? Can someone explain to me why, after the procedure above, correlation is so low?

Did you run ANALYZE after the procedure above?
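
As background to the question quoted above: the correlation that ANALYZE stores in pg_stats describes how closely the physical row order follows the ordering of the column values. The sketch below is only an illustration of that idea, not the code ANALYZE runs (ANALYZE works on a sample of rows, so the stored value will differ). It assumes a simplified layout with one row per ne_id per day and a scaled-down id range; the function names are made up for this example.

```python
# Illustrative sketch: Pearson correlation between physical row position and
# ne_id when rows are stored day by day, with ne_id ascending inside each day.
# Assumptions: one row per ne_id per day, and 200 ids instead of 20000.
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def layout_correlation(n_ids, n_days):
    """Correlation of row position vs. ne_id for the day-grouped layout."""
    # ne_id restarts at 1 at the beginning of every day.
    ne_id = [i for _day in range(n_days) for i in range(1, n_ids + 1)]
    position = list(range(len(ne_id)))
    return pearson(position, ne_id)


if __name__ == "__main__":
    print(round(layout_correlation(200, 9), 3))  # nine days of data
    print(round(layout_correlation(200, 1), 3))  # a single day of data
```

Under these assumptions a single day gives a correlation of 1, while nine days give roughly 1/9: ne_id is well ordered inside each day, but it restarts at every day boundary, so over the whole table position and ne_id are only weakly related.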

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
