Re: [GENERAL] Increasing statistics results in worse estimates - Mailing list pgsql-hackers-win32

From Tom Lane
Subject Re: [GENERAL] Increasing statistics results in worse estimates
Msg-id 25129.1114967035@sss.pgh.pa.us
List pgsql-hackers-win32
[ redirecting to pgsql-hackers-win32 ]

Shelby Cain <alyandon@yahoo.com> writes:
> --- Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> However, there is something absolutely wacko about the stats collection
>> process here ... you've got fairly reasonable looking results for
>> most-common-values of city name at the lower end of the stats settings
>> (HOUSTON and DALLAS are the most common, sounds about right) ... but at
>> the higher settings the ordering of most-common entries just goes nuts.
>> We've got some kind of bug there.

> I had noticed that as well but wasn't sure whether MCV
> really meant what I thought it did.

>> It might be easier to debug this if you could send me the test case.

> I had already removed proprietary data to try and
> whittle down the number of columns I needed to
> demonstrate the weirdness so I can host a dump of the
> table.  However, before I take that step I should
> mention that this is the native Windows port so if
> that changes anything let me know.

Thanks for sending me the test data.  The bad news is that I can't
reproduce any strange behavior here: the stats get marginally more
accurate as the target goes up, just as you'd expect.  So it would
seem there is something broken about ANALYZE on Windows.  There's
not anything magic about this particular dataset, AFAICS.
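
To make that expectation concrete, here is a toy simulation, not the
ANALYZE code itself. It assumes the sample grows as 300 rows per unit of
statistics target and simply takes the most frequent values seen in the
sample; the real MCV selection applies further rules that are skipped
here. Only the HOUSTON/DALLAS ordering comes from the thread: the other
cities and all the counts are invented, and the names CITY_COUNTS,
build_population, sample_mcv and max_freq_error are made up for this
sketch.

```python
import random
from collections import Counter

# Invented row counts; only "HOUSTON then DALLAS" reflects the thread.
CITY_COUNTS = {
    "HOUSTON": 3000, "DALLAS": 2500, "AUSTIN": 1500, "SAN ANTONIO": 1200,
    "EL PASO": 800, "PLANO": 600, "WACO": 400,
}


def build_population(counts):
    """Expand {value: row_count} into one list entry per table row."""
    rows = []
    for value, n in counts.items():
        rows.extend([value] * n)
    return rows


def sample_mcv(population, stats_target, rng):
    """Return [(value, sample_frequency)] for the top values in a sample.

    Assumption: the sample holds 300 * stats_target rows, capped at the
    table size. Real ANALYZE filters the candidate list further.
    """
    sample_size = min(len(population), 300 * stats_target)
    sample = rng.sample(population, sample_size)
    counts = Counter(sample)
    return [(v, c / sample_size) for v, c in counts.most_common(stats_target)]


def max_freq_error(mcv, population):
    """Largest absolute gap between a sampled and a true frequency."""
    true_counts = Counter(population)
    total = len(population)
    return max(abs(freq - true_counts[v] / total) for v, freq in mcv)


if __name__ == "__main__":
    rng = random.Random(2005)  # fixed seed so repeated runs agree
    population = build_population(CITY_COUNTS)
    for target in (1, 10, 100):
        mcv = sample_mcv(population, target, rng)
        print(target, mcv[:2], max_freq_error(mcv, population))
```

Once 300 times the target reaches the table size, the sample is the whole
table and the error is exactly zero; at smaller targets the error depends
on the random draw.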

Which Windows build are you using, exactly?

Can anyone else reproduce a problem with ANALYZE producing silly
most-common-values stats at higher statistics targets?  The original
thread is here:
http://archives.postgresql.org/pgsql-general/2005-04/msg01368.php
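
For anyone trying this, a rough sketch of the steps, written as a small
Python helper that only builds the SQL text to paste into psql. The table
and column names (addresses, city) and the helper name repro_statements
are placeholders, not the names from the original test case. The
statements themselves are the ordinary ALTER TABLE ... SET STATISTICS,
ANALYZE, and a query against the pg_stats view.

```python
def repro_statements(table, column, targets):
    """Return the SQL to run, in order, for each statistics target.

    Names are interpolated without quoting, so pass only trusted,
    simple identifiers (this is a sketch for manual testing).
    """
    statements = []
    for target in targets:
        # Raise (or lower) the per-column statistics target.
        statements.append(
            f"ALTER TABLE {table} ALTER COLUMN {column} "
            f"SET STATISTICS {int(target)};"
        )
        # Recollect statistics with the new target.
        statements.append(f"ANALYZE {table};")
        # Inspect the resulting most-common-values list.
        statements.append(
            "SELECT most_common_vals, most_common_freqs FROM pg_stats "
            f"WHERE tablename = '{table}' AND attname = '{column}';"
        )
    return statements


if __name__ == "__main__":
    # Hypothetical table/column; substitute the real ones.
    for stmt in repro_statements("addresses", "city", [10, 100, 1000]):
        print(stmt)
```

Comparing the most_common_vals output across targets should show whether
the ordering degrades as the target rises.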

            regards, tom lane
