Re: Bad Plan for Questionnaire-Type Query - Mailing list pgsql-performance

From Tom Lane
Subject Re: Bad Plan for Questionnaire-Type Query
Date
Msg-id 7857.1241884325@sss.pgh.pa.us
Whole thread Raw
In response to Re: Bad Plan for Questionnaire-Type Query  (David Blewett <david@dawninglight.net>)
Responses Re: Bad Plan for Questionnaire-Type Query
Re: Bad Plan for Questionnaire-Type Query
List pgsql-performance
David Blewett <david@dawninglight.net> writes:
> On Fri, May 8, 2009 at 10:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Thanks.  Could I trouble you for one other data point --- about how many
>> rows are in each of these tables?

> Not a problem:

As best I can tell, the selectivity numbers are about what they should
be --- for instance, using these stats I get a selectivity of 0.0000074
for the join clause fkr.submission_id = tr.submission_id.  Over the
entire relations (646484 and 142698 rows) that's predicting a join size
of 683551, which seems to be in the right ballpark (it looks like
actually it's one join row per canvas_foreignkeyresponse row, correct?).
The thing that is strange here is that the one-to-one ratio holds up
despite strong and apparently uncorrelated restrictions on the
relations:

      ->  Hash Join  (cost=1485.69..3109.78 rows=28 width=24) (actual time=5.576..22.737 rows=4035 loops=1)
            Hash Cond: (fkr.submission_id = tr.submission_id)
            ->  Bitmap Heap Scan on canvas_foreignkeyresponse fkr  (cost=14.52..1628.19 rows=580 width=4) (actual
time=0.751..4.497rows=4035 loops=1) 
                  Recheck Cond: ((question_id = ANY ('{79,1037}'::integer[])) AND (object_id < 3))
                  ->  Bitmap Index Scan on canvas_foreignkeyresponse_qv2_idx  (cost=0.00..14.38 rows=580 width=0)
(actualtime=0.671..0.671 rows=4035 loops=1) 
                        Index Cond: ((question_id = ANY ('{79,1037}'::integer[])) AND (object_id < 3))
            ->  Hash  (cost=1388.48..1388.48 rows=6615 width=20) (actual time=4.805..4.805 rows=6694 loops=1)
                  ->  Bitmap Heap Scan on canvas_textresponse tr  (cost=131.79..1388.48 rows=6615 width=20) (actual
time=0.954..2.938rows=6694 loops=1) 
                        Recheck Cond: (question_id = ANY ('{4,1044}'::integer[]))
                        ->  Bitmap Index Scan on canvas_textresponse_question_id  (cost=0.00..130.14 rows=6615 width=0)
(actualtime=0.920..0.920 rows=6694 loops=1) 
                              Index Cond: (question_id = ANY ('{4,1044}'::integer[]))

How is it that each fkr row matching those question_ids has a join match
in tr that has those other two question_ids?  It seems like there must
be a whole lot of hidden correlation here.

            regards, tom lane

pgsql-performance by date:

Previous
From: David Blewett
Date:
Subject: Re: Bad Plan for Questionnaire-Type Query
Next
From: David Blewett
Date:
Subject: Re: Bad Plan for Questionnaire-Type Query