Re: Optimization idea - Mailing list pgsql-performance

From Vlad Arkhipov
Subject Re: Optimization idea
Date
Msg-id 4BD8020C.20509@dc.baikal.ru
Whole thread Raw
In response to Re: Optimization idea  (Cédric Villemain)
List pgsql-performance
> 2010/4/28 Robert Haas <>:
>
>> On Mon, Apr 26, 2010 at 5:33 AM, Cédric Villemain
>> <> wrote:
>>
>>> In the first query, the planner doesn't use the information of the 2,3,4.
>>> It just does a : I'll bet I'll have 2 rows in t1 (I think it should
>>> say 3, but it doesn't)
>>> So it divide the estimated number of rows in the t2 table by 5
>>> (different values) and multiply by 2 (rows) : 40040.
>>>
>> I think it's doing something more complicated.  See scalararraysel().
>>
>
> Thank you for driving me to the right function, Robert.
> It is in fact more complicated :)
>
>
>>> In the second query the planner use a different behavior : it did
>>> expand the value of t1.t to t2.t for each join relation and find a
>>> costless plan. (than the one using seqscan on t2)
>>>
>> I think the problem here is one we've discussed before: if the query
>> planner knows that something is true of x (like, say, x =
>> ANY('{2,3,4}')) and it also knows that x = y, it doesn't infer that
>> the same thing holds of y (i.e. y = ANY('{2,3,4}') unless the thing
>> that is known to be true of x is that x is equal to some constant.
>> Tom doesn't think it would be worth the additional CPU time that it
>> would take to make these sorts of deductions.  I'm not sure I believe
>> that, but I haven't tried to write the code, either.
>>
>
> If I understand correctly, I did have some issues with
> exclusion_constraint= ON for complex queries in datamining where the
> planner failled to understand it must use only one partition because
> the where clause where not enough 'explicit'. But it's long time ago
> and I don't have my use case.
>
> We probably need to find some real case where the planner optimisation
> make sense. But I don't want usual queries to see their CPU time
> increase...
> <joke>Do we need real Planner Hints ?</joke>
>
>
Even if it will be done it does not solve the original issue. If I
understood you right there is now no any decent way of speeding up the query

select *
from t2
join t1 on t1.t = t2.t
where t1.id = X;

except of the propagating the t1.id value to the table t2 and createing
and index for this column? Then the query will look like

select *
from t2
where t1_id = X;


pgsql-performance by date:

Previous
From: Cédric Villemain
Date:
Subject: Re: Optimization idea
Next
From: "Kevin Grittner"
Date:
Subject: Re: autovacuum strategy / parameters