On 06/03/2009 10:42 PM, Kevin Grittner wrote:
> Robert Haas<robertmhaas@gmail.com> wrote:
>
>> When you say, "don't fit in cache", exactly what
>> cache are you talking about? It seems to me that the statistics
>> should be far smaller than the underlying tables, so if even your
>> statistics don't fit in shared buffers (let alone main memory), it
>> doesn't really matter how long your query takes to plan because it
>> will probably take literally forever to execute. How many tables
>> would you have to be joining to get a GB of statistics, even with
>> dst = 1000? A few hundred?
The whole pgstat.stat is around 500k on the test database - seems to be
relatively reasonable.
> Since he can't share the schema, and hasn't even given much of a hint,
The schema isnt the most clear one - the original developers are long
gone and I only somewhat recently jumped the wagon.
If what I have gathered is correct the biggest reason for implementing
materialized views was plan and not execution time.
The schema is a rather normalized DW snowflake-alike schema - with the
abnormality that most of the time a single dimension is actually
multidimensional, i.e. there are multiple different joins to it needed.
The relatively high degree of normalizations introduces a rather big
amount of joins for each additional dimension...
I find it hard to give a short overview over a relative complex schema
without showing it - but thats not up to my choice.
> I don't know whether one (or more) of the columns is a bytea filled
> with 100 MB values; and I don't remember any description of the
> hardware environment either. Since the behavior seems so
> out-of-the-ordinary, I was casting about for possible extraordinary
> characteristics of his environment which might cause it. I'm probably
> way off base....
I would love to find such a issue, but I fear there is none. The problem
exists on different machines, different pg versions, different settings...
Please keep in mind that when using the system "normally" the
materialized views are used and the query plans stay around 1-2s. Which
is quite okay for reporting queries I think.
Only that the materialized views start to take too much space...
Andres