Thread: Queries taking ages in PG 8.1, have been much faster in PG<=8.0

Queries taking ages in PG 8.1, have been much faster in PG<=8.0

From

"Markus Wollny"

Date:

01 December 2005, 11:49:29

Hi!

I've got an urgent problem with an application which is evaluating a
monthly survey; it's running quite a lot of queries like this:

select  SOURCE.NAME as TYPE,
   count(PARTICIPANT.SESSION_ID) as TOTAL
from  (
   select  PARTICIPANT.SESSION_ID
   from   survey.PARTICIPANT,
      survey.ANSWER
   where  PARTICIPANT.STATUS = 1
   and   date_trunc('month', PARTICIPANT.CREATED) = date_trunc('month',
now()-'1 month'::interval)
   and   PARTICIPANT.SESSION_ID = ANSWER.SESSION_ID
   and   ANSWER.QUESTION_ID = 6
   and   ANSWER.VALUE = 1
   )
   as PARTICIPANT,
   survey.ANSWER,
   survey.HANDY_JAVA SOURCE
where  PARTICIPANT.SESSION_ID = ANSWER.SESSION_ID
and   ANSWER.QUESTION_ID = 16
and   ANSWER.VALUE = SOURCE.ID
group by SOURCE.NAME,
   SOURCE.POSITION
order by  SOURCE.POSITION asc;

My current PostgreSQL-version is "PostgreSQL 8.1.0 on i686-pc-linux-gnu,
compiled by GCC gcc (GCC) 3.2". Up to 8.0, a query like this took a
couple of seconds, maybe even up to a minute. In 8.1 a query like this
will run from 30 minutes up to two hours to complete, depending on ist
complexity. I've got autovaccum enabled and run a nightly vacuum analyze
over all of my databases. Here's some information about the relevant
tables: Table answer has got ~ 8.9M rows (estimated 8,872,130, counted
8,876,648), participant has got ~178K rows (estimated 178,165, counted
178,248), HANDY_JAVA has got three rows. This is the
explain-analyze-output for the above:

"Sort  (cost=11383.09..11383.10 rows=3 width=16) (actual
time=1952676.858..1952676.863 rows=3 loops=1)"
"  Sort Key: source."position""
"  ->  HashAggregate  (cost=11383.03..11383.07 rows=3 width=16) (actual
time=1952676.626..1952676.635 rows=3 loops=1)"
"        ->  Nested Loop  (cost=189.32..11383.00 rows=5 width=16)
(actual time=6975.812..1952371.782 rows=9806 loops=1)"
"              ->  Nested Loop  (cost=3.48..3517.47 rows=42 width=20)
(actual time=6819.716..15419.930 rows=9806 loops=1)"
"                    ->  Nested Loop  (cost=3.48..1042.38 rows=738
width=16) (actual time=258.434..6233.039 rows=162723 loops=1)"
"                          ->  Seq Scan on handy_java source
(cost=0.00..1.03 rows=3 width=14) (actual time=0.093..0.118 rows=3
loops=1)"
"                          ->  Bitmap Heap Scan on answer
(cost=3.48..344.04 rows=246 width=8) (actual time=172.381..1820.499
rows=54241 loops=3)"
"                                Recheck Cond: ((answer.question_id =
16) AND (answer.value = "outer".id))"
"                                ->  Bitmap Index Scan on
idx02_performance  (cost=0.00..3.48 rows=246 width=0) (actual
time=98.321..98.321 rows=54245 loops=3)"
"                                      Index Cond: ((answer.question_id
= 16) AND (answer.value = "outer".id))"
"                    ->  Index Scan using idx01_perf_0006 on participant
(cost=0.00..3.34 rows=1 width=4) (actual time=0.049..0.050 rows=0
loops=162723)"
"                          Index Cond: (participant.session_id =
"outer".session_id)"
"                          Filter: ((status = 1) AND
(date_trunc('month'::text, created) = date_trunc('month'::text, (now() -
'1 mon'::interval))))"
"              ->  Bitmap Heap Scan on answer  (cost=185.85..187.26
rows=1 width=4) (actual time=197.490..197.494 rows=1 loops=9806)"
"                    Recheck Cond: (("outer".session_id =
answer.session_id) AND (answer.question_id = 6) AND (answer.value = 1))"
"                    ->  BitmapAnd  (cost=185.85..185.85 rows=1 width=0)
(actual time=197.421..197.421 rows=0 loops=9806)"
"                          ->  Bitmap Index Scan on
idx_answer_session_id  (cost=0.00..2.83 rows=236 width=0) (actual
time=0.109..0.109 rows=49 loops=9806)"
"                                Index Cond: ("outer".session_id =
answer.session_id)"
"                          ->  Bitmap Index Scan on idx02_performance
(cost=0.00..182.77 rows=20629 width=0) (actual time=195.742..195.742
rows=165697 loops=9806)"
"                                Index Cond: ((question_id = 6) AND
(value = 1))"
"Total runtime: 1952678.393 ms"

I am really sorry, but currently I haven't got any 8.0-installation
left, so I cannot provide the explain (analyze) output for 8.0.

I fiddled a little with the statement and managed to speed things up
quite a lot:

select  SOURCE.NAME as TYPE,
   count(ANSWER.SESSION_ID) as TOTAL
from survey.ANSWER,
   survey.HANDY_JAVA SOURCE
where  ANSWER.QUESTION_ID = 16
and   ANSWER.VALUE = SOURCE.ID
and ANSWER.SESSION_ID in (
   select  PARTICIPANT.SESSION_ID
   from   survey.PARTICIPANT,
      survey.ANSWER
   where  PARTICIPANT.STATUS = 1
   and   date_trunc('month', PARTICIPANT.CREATED) = date_trunc('month',
now()-'1 month'::interval)
   and   PARTICIPANT.SESSION_ID = ANSWER.SESSION_ID
   and   ANSWER.QUESTION_ID = 6
   and   ANSWER.VALUE = 1
   )
group by SOURCE.NAME,
   SOURCE.POSITION
order by  SOURCE.POSITION asc;

Here's the explain analyze output:
"Sort  (cost=27835.39..27835.39 rows=3 width=16) (actual
time=9609.207..9609.212 rows=3 loops=1)"
"  Sort Key: source."position""
"  ->  HashAggregate  (cost=27835.33..27835.36 rows=3 width=16) (actual
time=9609.058..9609.067 rows=3 loops=1)"
"        ->  Hash IN Join  (cost=26645.78..27835.29 rows=5 width=16)
(actual time=6374.436..9548.945 rows=9806 loops=1)"
"              Hash Cond: ("outer".session_id = "inner".session_id)"
"              ->  Nested Loop  (cost=3.48..1042.38 rows=738 width=16)
(actual time=190.419..4817.977 rows=162704 loops=1)"
"                    ->  Seq Scan on handy_java source  (cost=0.00..1.03
rows=3 width=14) (actual time=0.036..0.058 rows=3 loops=1)"
"                    ->  Bitmap Heap Scan on answer  (cost=3.48..344.04
rows=246 width=8) (actual time=116.719..1390.931 rows=54235 loops=3)"
"                          Recheck Cond: ((answer.question_id = 16) AND
(answer.value = "outer".id))"
"                          ->  Bitmap Index Scan on idx02_performance
(cost=0.00..3.48 rows=246 width=0) (actual time=63.195..63.195
rows=54235 loops=3)"
"                                Index Cond: ((answer.question_id = 16)
AND (answer.value = "outer".id))"
"              ->  Hash  (cost=26639.37..26639.37 rows=1174 width=8)
(actual time=3906.831..3906.831 rows=9806 loops=1)"
"                    ->  Hash Join  (cost=4829.24..26639.37 rows=1174
width=8) (actual time=464.011..3877.539 rows=9806 loops=1)"
"                          Hash Cond: ("outer".session_id =
"inner".session_id)"
"                          ->  Bitmap Heap Scan on answer
(cost=182.76..21413.93 rows=20626 width=4) (actual
time=273.839..2860.984 rows=165655 loops=1)"
"                                Recheck Cond: ((question_id = 6) AND
(value = 1))"
"                                ->  Bitmap Index Scan on
idx02_performance  (cost=0.00..182.76 rows=20626 width=0) (actual
time=171.933..171.933 rows=165659 loops=1)"
"                                      Index Cond: ((question_id = 6)
AND (value = 1))"
"                          ->  Hash  (cost=4621.13..4621.13 rows=10141
width=4) (actual time=123.351..123.351 rows=11134 loops=1)"
"                                ->  Index Scan using idx01_perf_0005 on
participant  (cost=0.01..4621.13 rows=10141 width=4) (actual
time=0.545..93.200 rows=11134 loops=1)"
"                                      Index Cond:
(date_trunc('month'::text, created) = date_trunc('month'::text, (now() -
'1 mon'::interval)))"
"                                      Filter: (status = 1)"
"Total runtime: 9612.249 ms"

Regarding the total runtime, this is roughly in the same dimension as
for the original query in 8.0, as far as I remember. I haven't written
these queries myself in the first place, but they were done in the dark
ages when IN was a no-no and haven't been touched ever since - there
hadn't been any need to do so.

My current problem is that rewriting hundreds of queries, some of them
quite a bit more complex than this one, but all of them using the same
general scheme, would take quite a lot of time - and I'm expected to
hand over the survey results ASAP. So I will obviously have to do a
rewrite if there's just no other way, but I wondered if there might be
some other option that would allow me to point the planner in the right
direction so it would behave the same as in the previous versions,
namely 8.0?

Any suggestions?

Kind regards

   Markus

Re: Queries taking ages in PG 8.1, have been much faster in PG<=8.0

From

Tom Lane

Date:

01 December 2005, 12:26:11

"Markus Wollny" <Markus.Wollny@computec.de> writes:
> My current problem is that rewriting hundreds of queries, some of them
> quite a bit more complex than this one, but all of them using the same
> general scheme, would take quite a lot of time - and I'm expected to
> hand over the survey results ASAP. So I will obviously have to do a
> rewrite if there's just no other way, but I wondered if there might be
> some other option that would allow me to point the planner in the right
> direction so it would behave the same as in the previous versions,
> namely 8.0?

It looks like "set enable_nestloop = 0" might be a workable hack for
the immediate need.  Once you're not under deadline, I'd like to
investigate more closely to find out why 8.1 does worse than 8.0 here.

            regards, tom lane

Re: Queries taking ages in PG 8.1, have been much faster in PG<=8.0

From

"Markus Wollny"

Date:

01 December 2005, 12:30:51

> -----Ursprüngliche Nachricht-----
> Von: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Gesendet: Donnerstag, 1. Dezember 2005 17:26
> An: Markus Wollny
> Cc: pgsql-performance@postgresql.org
> Betreff: Re: [PERFORM] Queries taking ages in PG 8.1, have
> been much faster in PG<=8.0

> It looks like "set enable_nestloop = 0" might be a workable
> hack for the immediate need.

Whow - that works miracles :)

"Sort  (cost=81813.13..81813.14 rows=3 width=16) (actual time=7526.745..7526.751 rows=3 loops=1)"
"  Sort Key: source."position""
"  ->  HashAggregate  (cost=81813.07..81813.11 rows=3 width=16) (actual time=7526.590..7526.601 rows=3 loops=1)"
"        ->  Merge Join  (cost=81811.40..81813.03 rows=5 width=16) (actual time=7423.289..7479.175 rows=9806 loops=1)"
"              Merge Cond: ("outer".id = "inner".value)"
"              ->  Sort  (cost=1.05..1.06 rows=3 width=14) (actual time=0.085..0.091 rows=3 loops=1)"
"                    Sort Key: source.id"
"                    ->  Seq Scan on handy_java source  (cost=0.00..1.03 rows=3 width=14) (actual time=0.039..0.049
rows=3loops=1)" 
"              ->  Sort  (cost=81810.35..81811.81 rows=583 width=8) (actual time=7423.179..7440.062 rows=9806 loops=1)"
"                    Sort Key: mafo.answer.value"
"                    ->  Hash Join  (cost=27164.31..81783.57 rows=583 width=8) (actual time=6757.521..7360.822
rows=9806loops=1)" 
"                          Hash Cond: ("outer".session_id = "inner".session_id)"
"                          ->  Bitmap Heap Scan on answer  (cost=506.17..54677.92 rows=88334 width=8) (actual
time=379.245..2660.344rows=162809 loops=1)" 
"                                Recheck Cond: (question_id = 16)"
"                                ->  Bitmap Index Scan on idx_answer_question_id  (cost=0.00..506.17 rows=88334
width=0)(actual time=274.632..274.632 rows=162814 loops=1)" 
"                                      Index Cond: (question_id = 16)"
"                          ->  Hash  (cost=26655.21..26655.21 rows=1175 width=8) (actual time=3831.362..3831.362
rows=9806loops=1)" 
"                                ->  Hash Join  (cost=4829.33..26655.21 rows=1175 width=8) (actual
time=542.227..3800.985rows=9806 loops=1)" 
"                                      Hash Cond: ("outer".session_id = "inner".session_id)"
"                                      ->  Bitmap Heap Scan on answer  (cost=182.84..21429.34 rows=20641 width=4)
(actualtime=292.067..2750.376 rows=165762 loops=1)" 
"                                            Recheck Cond: ((question_id = 6) AND (value = 1))"
"                                            ->  Bitmap Index Scan on idx02_performance  (cost=0.00..182.84 rows=20641
width=0)(actual time=167.306..167.306 rows=165769 loops=1)" 
"                                                  Index Cond: ((question_id = 6) AND (value = 1))"
"                                      ->  Hash  (cost=4621.13..4621.13 rows=10141 width=4) (actual
time=182.842..182.842rows=11134 loops=1)" 
"                                            ->  Index Scan using idx01_perf_0005 on participant  (cost=0.01..4621.13
rows=10141width=4) (actual time=0.632..136.126 rows=11134 loops=1)" 
"                                                  Index Cond: (date_trunc('month'::text, created) =
date_trunc('month'::text,(now() - '1 mon'::interval)))" 
"                                                  Filter: (status = 1)"
"Total runtime: 7535.398 ms"

> Once you're not under deadline,
> I'd like to investigate more closely to find out why 8.1 does
> worse than 8.0 here.

Please tell me what I can do to help in clearing up this issue, I'd be very happy to help! Heck, I am happy anyway that
there'ssuch a quick fix, even if it's not a beautiful one :) 

Kind regards

   Markus

Re: Queries taking ages in PG 8.1, have been much faster in PG<=8.0

From

Tom Lane

Date:

01 December 2005, 12:34:52

"Markus Wollny" <Markus.Wollny@computec.de> writes:
>> Once you're not under deadline,
>> I'd like to investigate more closely to find out why 8.1 does
>> worse than 8.0 here.

> Please tell me what I can do to help in clearing up this issue, I'd be
> very happy to help!

The first thing to do is get 8.0's EXPLAIN ANALYZE for the same query.
After we see how that differs from 8.1, we'll know what the next
question should be ...

            regards, tom lane

Re: Queries taking ages in PG 8.1, have been much faster in PG<=8.0

From

"Markus Wollny"

Date:

04 December 2005, 09:25:23

Hi!

> -----Ursprüngliche Nachricht-----
> Von: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Gesendet: Donnerstag, 1. Dezember 2005 17:26
> An: Markus Wollny
> Cc: pgsql-performance@postgresql.org
> Betreff: Re: [PERFORM] Queries taking ages in PG 8.1, have
> been much faster in PG<=8.0

> It looks like "set enable_nestloop = 0" might be a workable
> hack for the immediate need.
>
> Once you're not under deadline,
> I'd like to investigate more closely to find out why 8.1 does
> worse than 8.0 here.

I've just set up a PostgreSQL 8.0.3 installation ...

select version();
                                          version
--------------------------------------------------------------------------------------------
PostgreSQL 8.0.3 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3.5 (Debian 1:3.3.5-13)
(1 row)

...and restored a dump there; here's the explain analyze of the query for 8.0.3:

                                                                            QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=5193.63..5193.63 rows=3 width=16) (actual time=7365.107..7365.110 rows=3 loops=1)
   Sort Key: source."position"
   -> HashAggregate (cost=5193.59..5193.60 rows=3 width=16) (actual time=7365.034..7365.041 rows=3 loops=1)
         -> Nested Loop (cost=0.00..5193.57 rows=3 width=16) (actual time=3190.642..7300.820 rows=11086 loops=1)
               -> Nested Loop (cost=0.00..3602.44 rows=4 width=20) (actual time=3169.968..5875.153 rows=11087 loops=1)
                     -> Nested Loop (cost=0.00..1077.95 rows=750 width=16) (actual time=36.599..2778.129 rows=158288 loops=1)
                           -> Seq Scan on handy_java source (cost=0.00..1.03 rows=3 width=14) (actual time=6.503..6.514 rows=3 loops=1)
                           -> Index Scan using idx02_performance on answer (cost=0.00..355.85 rows=250 width=8) (actual time=10.071..732.746 rows=52763 loops=3)
                                 Index Cond: ((answer.question_id = 16) AND (answer.value = "outer".id))
                     -> Index Scan using pk_participant on participant (cost=0.00..3.35 rows=1 width=4) (actual time=0.016..0.016 rows=0 loops=158288)
                           Index Cond: (participant.session_id = "outer".session_id)
                           Filter: ((status = 1) AND (date_trunc('month'::text, created) = date_trunc('month'::text, (now() - '2 mons'::interval))))
               -> Index Scan using idx_answer_session_id on answer (cost=0.00..397.77 rows=1 width=4) (actual time=0.080..0.122 rows=1 loops=11087)
                     Index Cond: ("outer".session_id = answer.session_id)
                     Filter: ((question_id = 6) AND (value = 1))
Total runtime: 7365.461 ms
(16 rows)

Does this tell you anything useful? It's not on the same machine, mind you, but configuration for PostgreSQL is absolutely identical (apart from the autovacuum-lines which 8.0.3 doesn't like).

Kind regards

   Markus

Re: Queries taking ages in PG 8.1, have been much faster in PG<=8.0

From

Tom Lane

Date:

04 December 2005, 14:31:39

"Markus Wollny" <Markus.Wollny@computec.de> writes:
>> Once you're not under deadline,
>> I'd like to investigate more closely to find out why 8.1 does
>> worse than 8.0 here.

> Does this tell you anything useful? It's not on the same machine, mind
> you, but configuration for PostgreSQL is absolutely identical (apart
> from the autovacuum-lines which 8.0.3 doesn't like).

The data is not quite the same, right?  I notice different numbers of
rows being returned.  But anyway, it seems the problem is with the upper
scan on "answers", which 8.0 does like this:

  ->  Index Scan using idx_answer_session_id on answer  (cost=0.00..397.77 rows=1 width=4) (actual time=0.080..0.122
rows=1loops=11087) 
        Index Cond: ("outer".session_id = answer.session_id)
        Filter: ((question_id = 6) AND (value = 1))

and 8.1 does like this:

  ->  Bitmap Heap Scan on answer  (cost=185.85..187.26 rows=1 width=4) (actual time=197.490..197.494 rows=1 loops=9806)
        Recheck Cond: (("outer".session_id = answer.session_id) AND (answer.question_id = 6) AND (answer.value = 1))
        ->  BitmapAnd  (cost=185.85..185.85 rows=1 width=0) (actual time=197.421..197.421 rows=0 loops=9806)
              ->  Bitmap Index Scan on idx_answer_session_id  (cost=0.00..2.83 rows=236 width=0) (actual
time=0.109..0.109rows=49 loops=9806) 
                    Index Cond: ("outer".session_id = answer.session_id)
              ->  Bitmap Index Scan on idx02_performance (cost=0.00..182.77 rows=20629 width=0) (actual
time=195.742..195.742rows=165697 loops=9806) 
                    Index Cond: ((question_id = 6) AND (value = 1))

It seems that checking question_id/value via the index, rather than
directly on the fetched tuple, is a net loss here.  It looks like 8.1
would have made the right plan choice if it had made a better estimate
of the combined selectivity of the question_id and value conditions,
so ultimately this is another manifestation of the lack of cross-column
statistics.  What I find interesting though is that the plain index scan
in 8.0 is so enormously cheaper than it's estimated to be.  Perhaps the
answer table in your 8.0 installation is almost perfectly ordered by
session_id?

Are you using default values for the planner cost parameters?  It looks
like reducing random_page_cost would help bring the planner estimates
into line with reality on your machines.

            regards, tom lane

Re: Queries taking ages in PG 8.1, have been much faster in PG<=8.0

From

"Markus Wollny"

Date:

05 December 2005, 06:29:01

Hi!

> -----Ursprüngliche Nachricht-----
> Von: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Gesendet: Sonntag, 4. Dezember 2005 19:32
> An: Markus Wollny
> Cc: pgsql-performance@postgresql.org
> Betreff: Re: [PERFORM] Queries taking ages in PG 8.1, have
> been much faster in PG<=8.0

> The data is not quite the same, right?  I notice different
> numbers of rows being returned.

No, you're right, I didn't manage to restore the 8.1 dump into the 8.0.3 cluster, so I took the quick route and
restoredthe last dump from my 8.0 installation. The numbers should be roughly within the same range, though: 

Table answer has got 8,646,320 rows (counted and estimated, as this db is not live, obviously), table participant has
got173,998 rows; for comparison: 
The production db had an estimated 8,872,130, counted 8,876,648 rows for table answer, and estimated 178,165, counted
178,248rows for participant. As the numbers are a mere 2% apart, I should think that this wouldn't make that much
difference.

> It seems that checking question_id/value via the index,
> rather than directly on the fetched tuple, is a net loss
> here.  It looks like 8.1 would have made the right plan
> choice if it had made a better estimate of the combined
> selectivity of the question_id and value conditions, so
> ultimately this is another manifestation of the lack of
> cross-column statistics.  What I find interesting though is
> that the plain index scan in 8.0 is so enormously cheaper
> than it's estimated to be.  Perhaps the answer table in your
> 8.0 installation is almost perfectly ordered by session_id?

Not quite - there may be several concurrent sessions at any one time, but ordinarily the answers for one session-id
wouldbe quite close together, in a lot of cases even in perfect sequence, so "almost perfectly" might be a fair
description,depending on the exact definition of "almost" :) 

> Are you using default values for the planner cost parameters?

I have to admit that I did tune the random_page_cost and effective_cache_size settings ages ago (7.1-ish) to a value
thatseemed to work best then - and didn't touch it ever since, although my data pool has grown quite a bit over time.
cpu_tuple_cost,cpu_index_tuple_cost and cpu_operator_cost are using default values. 

>  It looks like reducing random_page_cost would help bring the
> planner estimates into line with reality on your machines.

I had set random_page_cost to 1.4 already, so I doubt that it would do much good to further reduce the value - reading
thedocs and the suggestions for tuning I would have thought that I should actually consider increasing this value a
bit,as not all of my data will fit in memory any more. Do you nevertheless want me to try what happens if I reduce
random_page_costeven further? 

Kind regards

   Markus

Re: Queries taking ages in PG 8.1, have been much faster in PG<=8.0

From

Tom Lane

Date:

05 December 2005, 10:32:54

"Markus Wollny" <Markus.Wollny@computec.de> writes:
>> ... What I find interesting though is
>> that the plain index scan in 8.0 is so enormously cheaper
>> than it's estimated to be.  Perhaps the answer table in your
>> 8.0 installation is almost perfectly ordered by session_id?

> Not quite - there may be several concurrent sessions at any one time, but ordinarily the answers for one session-id
wouldbe quite close together, in a lot of cases even in perfect sequence, so "almost perfectly" might be a fair
description,depending on the exact definition of "almost" :) 

Could we see the pg_stats row for answer.session_id in both 8.0 and 8.1?

> I had set random_page_cost to 1.4 already, so I doubt that it would do much good to further reduce the value -
readingthe docs and the suggestions for tuning I would have thought that I should actually consider increasing this
valuea bit, as not all of my data will fit in memory any more. Do you nevertheless want me to try what happens if I
reducerandom_page_cost even further? 

No, that's probably quite low enough already ...

            regards, tom lane

Re: Queries taking ages in PG 8.1, have been much faster in PG<=8.0

From

"Markus Wollny"

Date:

05 December 2005, 10:44:56


> -----Ursprüngliche Nachricht-----
> Von: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Gesendet: Montag, 5. Dezember 2005 15:33
> An: Markus Wollny
> Cc: pgsql-performance@postgresql.org
> Betreff: Re: AW: [PERFORM] Queries taking ages in PG 8.1,
> have been much faster in PG<=8.0

> Could we see the pg_stats row for answer.session_id in both
> 8.0 and 8.1?

Here you are:

select null_frac
    , avg_width
    , n_distinct
    , most_common_vals
    , most_common_freqs
    , histogram_bounds
    , Correlation
from pg_stats
where schemaname = 'survey'
and tablename = 'answer'
and attname = 'session_id';

8.1:
null_frac        0
avg_width        4
n_distinct        33513
most_common_vals    {1013854,1017890,1021551,1098817,764249,766938,776353,780954,782232,785985}
most_common_freqs    {0.001,0.001,0.001,0.001,0.000666667,0.000666667,0.000666667,0.000666667,0.000666667,0.000666667}
histogram_bounds    {757532,819803,874935,938170,1014421,1081507,1164659,1237281,1288267,1331016,1368939}
Correlation        -0.0736492

8.0.3:
null_frac        0
avg_width        4
n_distinct        29287
most_common_vals    {765411,931762,983933,1180453,1181959,1229963,1280249,1288736,1314970,764901}
most_common_freqs    {0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.000666667}
histogram_bounds    {757339,822949,875834,939085,1004782,1065251,1140682,1218336,1270024,1312170,1353082}
Correlation        -0.237136

Kind regards

   Markus

Re: Queries taking ages in PG 8.1, have been much faster in PG<=8.0

From

Tom Lane

Date:

05 December 2005, 11:11:46

"Markus Wollny" <Markus.Wollny@computec.de> writes:
>> Could we see the pg_stats row for answer.session_id in both
>> 8.0 and 8.1?

> Here you are:

> 8.1:
> Correlation        -0.0736492

> 8.0.3:
> Correlation        -0.237136

Interesting --- if the 8.1 database is a dump and restore of the 8.0,
you'd expect the physical ordering to be similar.  Why is 8.1 showing
a significantly lower correlation?  That has considerable impact on the
estimated cost of an indexscan (plain not bitmap), and so it might
explain why 8.1 is mistakenly avoiding the indexscan ...

            regards, tom lane

Re: Queries taking ages in PG 8.1, have been much faster in PG<=8.0

From

"Markus Wollny"

Date:

05 December 2005, 11:40:38

> -----Ursprüngliche Nachricht-----
> Von: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Gesendet: Montag, 5. Dezember 2005 16:12
> An: Markus Wollny
> Cc: pgsql-performance@postgresql.org
> Betreff: Re: AW: AW: [PERFORM] Queries taking ages in PG 8.1,
> have been much faster in PG<=8.0
>
> "Markus Wollny" <Markus.Wollny@computec.de> writes:
> >> Could we see the pg_stats row for answer.session_id in
> both 8.0 and
> >> 8.1?
>
> > Here you are:
>
> > 8.1:
> > Correlation        -0.0736492
>
> > 8.0.3:
> > Correlation        -0.237136
>
> Interesting --- if the 8.1 database is a dump and restore of
> the 8.0, you'd expect the physical ordering to be similar.

I dumped the data from my 8.0.1 cluster on 2005-11-18 00:23 using pg_dumpall with no further options; the dump was
passedthrough iconv to clear up some UTF-8 encoding issues, then restored into a fresh 8.1 cluster where it went
productive;I used the very same dump to restore the 8.0.3 cluster. So there is a difference between the two datasets,
anadditional 230.328 rows in the answers-table. 

> Why is 8.1 showing a significantly lower correlation?  That
> has considerable impact on the estimated cost of an indexscan
> (plain not bitmap), and so it might explain why 8.1 is
> mistakenly avoiding the indexscan ...

I just ran a vacuum analyze on the table, just to make sure that the stats are up to date (forgot that on the previous
run,thanks to pg_autovacuum...), and the current correlation on the 8.1 installation is now calculated as -0.158921.
That'sstill more than twice the value as for the 8.0-db. I don't know whether that is significant, though. 

Kind regards

   Markus

Re: Queries taking ages in PG 8.1, have been much faster in PG<=8.0

From

"Markus Wollny"

Date:

05 December 2005, 11:45:26

> -----Ursprüngliche Nachricht-----
> Von: pgsql-performance-owner@postgresql.org
> [mailto:pgsql-performance-owner@postgresql.org] Im Auftrag
> von Markus Wollny
> Gesendet: Montag, 5. Dezember 2005 16:41
> An: Tom Lane
> Cc: pgsql-performance@postgresql.org
> Betreff: Re: [PERFORM] Queries taking ages in PG 8.1, have
> been much faster in PG<=8.0

> an additional 230.328 rows in the answers-table.

That was supposed to read 230,328 rows, sorry.

Re: Queries taking ages in PG 8.1, have been much faster in PG<=8.0

From

Manfred Koizar

Date:

10 December 2005, 13:14:19

On Mon, 05 Dec 2005 10:11:41 -0500, Tom Lane <tgl@sss.pgh.pa.us>
wrote:
>> Correlation        -0.0736492
>> Correlation        -0.237136

>That has considerable impact on the
>estimated cost of an indexscan

The cost estimator uses correlationsquared.  So all correlations
between -0.3 and +0.3 can be considered equal under the assumption
that estimation errors of up to 10% are acceptable.
Servus
 Manfred