different plan when using partitions: stats for inherited tables in joins - Mailing list pgsql-hackers

From Scara Maccai
Subject different plan when using partitions: stats for inherited tables in joins
Date
Msg-id 162452.61332.qm@web24609.mail.ird.yahoo.com
Whole thread Raw
List pgsql-hackers
following:

http://www.mail-archive.com/pgsql-general@postgresql.org/msg135076.html

I recreated the problem using a join between 2 tables:


explain select nome1, dltbfpgpdch
FROM cell_bsc_60_0610 as cell_bsc
left outer join teststscell73_test_0610_1 as data on
data.ne_id=cell_bsc.nome1 where
data.time between '2006-10-01 00:00:00' and '2006-10-02 01:00:00'
and cell_bsc.nome2=2;


result:
http://explain-analyze.info/query_plans/3824-query-plan-2531



This should give similar results, but it doesn't (same query as above, but instead of using "teststscell73_test_0610_1"
explicitly,I use "teststscell73_test", which is the table teststscell73_test_0610_1 inherits from): 

explain select nome1, dltbfpgpdch
FROM cell_bsc_60_0610 as cell_bsc
left outer join teststscell73_test as data on
data.ne_id=cell_bsc.nome1 where
data.time between '2006-10-01 00:00:00' and '2006-10-02 01:00:00'
and cell_bsc.nome2=2;


result:
http://explain-analyze.info/query_plans/3823-query-plan-2530



The number of rows in the second explain are off.

Debugging the code (HEAD) I think the problem is in these lines (selfuncs.c:4059):
    else if (rte->inh)    {        /*         * XXX This means the Var represents a column of an append         *
relation.Later add code to look at the member relations and         * try to derive some kind of combined statistics?
     */ 



I guess that's the problem because running the 2 queries at line costsize.c:2794 (set_joinrel_size_estimates):

nrows = outer_rel->rows * inner_rel->rows * jselec;

I get:

query1 (the "ok" one):
"inner_rel->rows" = 58507.0
"outer_rel->rows" = 285.0
"jselec" = 4.3593112595119964E-4
"nrows" = 7268.931380017649

query2 (the "not ok" one):
"inner_rel->rows" = 58516.0
"outer_rel->rows" = 285.0
"jselec" = 0.0035087719298245615
"nrows" = 58516.0

The jselec is very different in the 2 cases.

query1 doesn't pass through that "else if (rte->inh)" in selfuncs; which means stats are found; while query2 passes
throughthat code, and can't find real stats to be used by the join estimates... 

This leads to the bad estimates of query2.

Of course, I know almost nothing about Postgres, so my analysis is likely to be totally wrong...

As I said before, I have a 500KB dump that recreates the problem.







pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: 8.5 development schedule
Next
From: Peter Eisentraut
Date:
Subject: Re: 8.5 development schedule