Re: Queries taking ages in PG 8.1, have been much faster in PG<=8.0 - Mailing list pgsql-performance

From Tom Lane
Subject Re: Queries taking ages in PG 8.1, have been much faster in PG<=8.0
Date
Msg-id 8926.1133721093@sss.pgh.pa.us
Whole thread Raw
In response to Re: Queries taking ages in PG 8.1, have been much faster in PG<=8.0  ("Markus Wollny" <Markus.Wollny@computec.de>)
List pgsql-performance
"Markus Wollny" <Markus.Wollny@computec.de> writes:
>> Once you're not under deadline,
>> I'd like to investigate more closely to find out why 8.1 does
>> worse than 8.0 here.

> Does this tell you anything useful? It's not on the same machine, mind
> you, but configuration for PostgreSQL is absolutely identical (apart
> from the autovacuum-lines which 8.0.3 doesn't like).

The data is not quite the same, right?  I notice different numbers of
rows being returned.  But anyway, it seems the problem is with the upper
scan on "answers", which 8.0 does like this:

  ->  Index Scan using idx_answer_session_id on answer  (cost=0.00..397.77 rows=1 width=4) (actual time=0.080..0.122
rows=1loops=11087) 
        Index Cond: ("outer".session_id = answer.session_id)
        Filter: ((question_id = 6) AND (value = 1))

and 8.1 does like this:

  ->  Bitmap Heap Scan on answer  (cost=185.85..187.26 rows=1 width=4) (actual time=197.490..197.494 rows=1 loops=9806)
        Recheck Cond: (("outer".session_id = answer.session_id) AND (answer.question_id = 6) AND (answer.value = 1))
        ->  BitmapAnd  (cost=185.85..185.85 rows=1 width=0) (actual time=197.421..197.421 rows=0 loops=9806)
              ->  Bitmap Index Scan on idx_answer_session_id  (cost=0.00..2.83 rows=236 width=0) (actual
time=0.109..0.109rows=49 loops=9806) 
                    Index Cond: ("outer".session_id = answer.session_id)
              ->  Bitmap Index Scan on idx02_performance (cost=0.00..182.77 rows=20629 width=0) (actual
time=195.742..195.742rows=165697 loops=9806) 
                    Index Cond: ((question_id = 6) AND (value = 1))

It seems that checking question_id/value via the index, rather than
directly on the fetched tuple, is a net loss here.  It looks like 8.1
would have made the right plan choice if it had made a better estimate
of the combined selectivity of the question_id and value conditions,
so ultimately this is another manifestation of the lack of cross-column
statistics.  What I find interesting though is that the plain index scan
in 8.0 is so enormously cheaper than it's estimated to be.  Perhaps the
answer table in your 8.0 installation is almost perfectly ordered by
session_id?

Are you using default values for the planner cost parameters?  It looks
like reducing random_page_cost would help bring the planner estimates
into line with reality on your machines.

            regards, tom lane

pgsql-performance by date:

Previous
From: "Markus Wollny"
Date:
Subject: Re: Queries taking ages in PG 8.1, have been much faster in PG<=8.0
Next
From: Klint Gore
Date:
Subject: Re: Faster db architecture for a twisted table.