Re: Increasing statistics results in worse estimates - Mailing list pgsql-general

From Shelby Cain
Subject Re: Increasing statistics results in worse estimates
Date
Msg-id 20050429165411.47188.qmail@web50101.mail.yahoo.com
Whole thread Raw
In response to Increasing statistics results in worse estimates  (Shelby Cain <alyandon@yahoo.com>)
Responses Re: Increasing statistics results in worse estimates  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
--- Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Hm.  I should have realized why correlation wouldn't
> be high for the
> city name: given the ordering by zipcode, city name
> values may be
> pretty well clumped, but they aren't in any kind of
> alphabetical
> order --- and it's the overall ordering, not the
> clumping, that
> correlation measures.
>

Ah.  Localized clumping != Overall ordering.  Thanks
for the clarification.

> However, there is something absolutely wacko about
> the stats collection
> process here ... you've got fairly reasonable
> looking results for
> most-common-values of city name at the lower end of
> the stats settings
> (HOUSTON and DALLAS are the most common, sounds
> about right) ... but at
> the higher settings the ordering of most-common
> entries just goes nuts.
> We've got some kind of bug there.
>

I had noticed that as well but wasn't sure about the
whether MCV really meant what I thought it did.

> What exactly are you changing in the different cases
> ---
> default_statistics_target, or are you doing an ALTER
> TABLE on some
> of the columns (if so which)?

I have a setting of 30 for default_statistics_target
and I am manipulating the statistics target for city
by alter table.

>
> It might be easier to debug this if you could send
> me the test case.
> Any problem with sending just the city name and
> zipcode columns
> of the table (offlist of course)?  COPY TO with a
> column list can
> extract that for you.
>

I had already removed proprietary data to try and
whittle down the number of columns I needed to
demonstrate the weirdness so I can host a dump of the
table.  However, before I take that step I should
mention that this is the native Windows port so if
that changes anything let me know.

Regards,

Shelby Cain

__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

pgsql-general by date:

Previous
From: James Robinson
Date:
Subject: Re: Composite types as columns used in production?
Next
From: Michael Fuhr
Date:
Subject: Re: out of memory for query result