Re: star schema and the optimizer - Mailing list pgsql-hackers

From Marc Cousin
Subject Re: star schema and the optimizer
Date
Msg-id 54F07EBE.5030508@gmail.com
Whole thread Raw
In response to Re: star schema and the optimizer  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: star schema and the optimizer  (Marc Cousin <cousinmarc@gmail.com>)
Re: star schema and the optimizer  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 27/02/2015 15:08, Tom Lane wrote:
> Marc Cousin <cousinmarc@gmail.com> writes:
>> So I gave a look at the optimizer's code to try to understand why I got this problem. If I understand correctly, the
optimizerwon't do cross joins, except if it has no choice.
 
> 
> That's right, and as you say, the planning-speed consequences of doing
> otherwise would be disastrous.  However, all you need to help it find the
> right plan is some dummy join condition between the dimension tables,
> which will allow the join path you want to be considered.  Perhaps you
> could do something like
> 
> SELECT * FROM dim1,dim2,facts WHERE facts.dim1=dim1.a and facts.dim2=dim2.a and dim1.b=12 AND dim2.b=17 and
(dim1.a+dim2.a)is not null;
 

No I can't. I cannot rewrite the query at all, in my context.


What do you mean by disastrous ?

I've given it a few tries here, and with 8 joins (same model, 7
dimensions), planning time is around 100ms. At least in my context, it's
well worth the planning time, to save minutes of execution.

I perfectly understand that it's not something that should be "by
default", that would be crazy. But in a datawarehouse, it seems to me
that accepting one, or even a few seconds of planning time to save
minutes of execution is perfectly legetimate.



pgsql-hackers by date:

Previous
From: Anastasia Lubennikova
Date:
Subject: Re: Index-only scans for GiST.
Next
From: Gilles Darold
Date:
Subject: Re: Bug in pg_dump