Re: Plan time Improvement - 64bit bitmapset - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Plan time Improvement - 64bit bitmapset
Date
Msg-id 4A26FCFE.4050004@anarazel.de
Whole thread Raw
In response to Re: Plan time Improvement - 64bit bitmapset  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
List pgsql-hackers
On 06/03/2009 10:42 PM, Kevin Grittner wrote:
> Robert Haas<robertmhaas@gmail.com>  wrote:
>
>> When you say, "don't fit in cache", exactly what
>> cache are you talking about?  It seems to me that the statistics
>> should be far smaller than the underlying tables, so if even your
>> statistics don't fit in shared buffers (let alone main memory), it
>> doesn't really matter how long your query takes to plan because it
>> will probably take literally forever to execute.  How many tables
>> would you have to be joining to get a GB of statistics, even with
>> dst = 1000?  A few hundred?
The whole pgstat.stat is around 500k on the test database - seems to be 
relatively reasonable.

> Since he can't share the schema, and hasn't even given much of a hint,
The schema isnt the most clear one - the original developers are long 
gone and I only somewhat recently jumped the wagon.
If what I have gathered is correct the biggest reason for implementing 
materialized views was plan and not execution time.

The schema is a rather normalized DW snowflake-alike schema - with the 
abnormality that most of the time a single dimension is actually 
multidimensional, i.e. there are multiple different joins to it needed.

The relatively high degree of normalizations introduces a rather big 
amount of joins for each additional dimension...

I find it hard to give a short overview over a relative complex schema 
without showing it - but thats not up to my choice.

> I don't know whether one (or more) of the columns is a bytea filled
> with 100 MB values; and I don't remember any description of the
> hardware environment either.  Since the behavior seems so
> out-of-the-ordinary, I was casting about for possible extraordinary
> characteristics of his environment which might cause it.  I'm probably
> way off base....
I would love to find such a issue, but I fear there is none. The problem 
exists on different machines, different pg versions, different settings...

Please keep in mind that when using the system "normally" the 
materialized views are used and the query plans stay around 1-2s. Which 
is quite okay for reporting queries I think.
Only that the materialized views start to take too much space...

Andres


pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Plan time Improvement - 64bit bitmapset
Next
From: Andrew Dunstan
Date:
Subject: Re: It's June 1; do you know where your release is?