Thread: Re: Why does the number of rows are different in actual and estimated.

Re: Why does the number of rows are different in actual and estimated.

From
"Kevin Grittner"
Date:
AI Rumman wrote:
> Claudio Freire <klaussfreire@gmail.com>wrote:
>> I think it's more likely a missing FK constraint.

> Does FK Constraint help to improve performance? Or it is only
> for maintaining data integrity?

I'm not aware of any situation where adding a foreign key
constraint would improve performance.

-Kevin


Re: Why does the number of rows are different in actual and estimated.

From
Tom Lane
Date:
"Kevin Grittner" <kgrittn@mail.com> writes:
> AI Rumman wrote:
>> Does FK Constraint help to improve performance? Or it is only
>> for maintaining data integrity?

> I'm not aware of any situation where adding a foreign key
> constraint would improve performance.

There's been talk of teaching the planner to use the existence of FK
constraints to improve plans, but I don't believe any such thing is
in the code today.

            regards, tom lane


Re: Why does the number of rows are different in actual and estimated.

From
AI Rumman
Date:
So I am going to change 
join_collapse_limit 
and 
from_collapse_limit 
to 20.

Do I need to set geqo_threshold to greater than 20. Now it is 12 ( default).

And could you let me know why geqo_optimizer is not working good in this case?



On Fri, Dec 14, 2012 at 2:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Kevin Grittner" <kgrittn@mail.com> writes:
> AI Rumman wrote:
>> Does FK Constraint help to improve performance? Or it is only
>> for maintaining data integrity?

> I'm not aware of any situation where adding a foreign key
> constraint would improve performance.

There's been talk of teaching the planner to use the existence of FK
constraints to improve plans, but I don't believe any such thing is
in the code today.

                        regards, tom lane

Re: Why does the number of rows are different in actual and estimated.

From
Claudio Freire
Date:
On Fri, Dec 14, 2012 at 4:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <kgrittn@mail.com> writes:
>> AI Rumman wrote:
>>> Does FK Constraint help to improve performance? Or it is only
>>> for maintaining data integrity?
>
>> I'm not aware of any situation where adding a foreign key
>> constraint would improve performance.
>
> There's been talk of teaching the planner to use the existence of FK
> constraints to improve plans, but I don't believe any such thing is
> in the code today.

That made me look the code.

So, eqjoinsel_inner in selfuncs.c would need those smarts. Cool.

Anyway, reading the code, I think I can now spot the possible issue
behind all of this.

Selectivity is decided based on the number of distinct values on both
sides, and the table's name "entity" makes me think it's a table that
is reused for several things. That could be a problem, since that
inflates distinct values, feeding misinformation to the planner.

Rather than a generic "entity" table, perhaps it would be best to
separate them different entities into different tables. Failing that,
maybe if you have an "entity type" kind of column, you could try
refining the join condition to filter by that kind, hopefully there's
an index over entity kind and the planner can use more accurate MCV
data.


Re: Why does the number of rows are different in actual and estimated.

From
AI Rumman
Date:
Yes, I do have a column in entity table like
setype where the values are 'Contacts', 'Candidate' etc.
I have an index on it also.
Are you suggesting to make different table for Contacts, Candidate etc.

On Fri, Dec 14, 2012 at 3:10 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
On Fri, Dec 14, 2012 at 4:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <kgrittn@mail.com> writes:
>> AI Rumman wrote:
>>> Does FK Constraint help to improve performance? Or it is only
>>> for maintaining data integrity?
>
>> I'm not aware of any situation where adding a foreign key
>> constraint would improve performance.
>
> There's been talk of teaching the planner to use the existence of FK
> constraints to improve plans, but I don't believe any such thing is
> in the code today.

That made me look the code.

So, eqjoinsel_inner in selfuncs.c would need those smarts. Cool.

Anyway, reading the code, I think I can now spot the possible issue
behind all of this.

Selectivity is decided based on the number of distinct values on both
sides, and the table's name "entity" makes me think it's a table that
is reused for several things. That could be a problem, since that
inflates distinct values, feeding misinformation to the planner.

Rather than a generic "entity" table, perhaps it would be best to
separate them different entities into different tables. Failing that,
maybe if you have an "entity type" kind of column, you could try
refining the join condition to filter by that kind, hopefully there's
an index over entity kind and the planner can use more accurate MCV
data.

Re: Why does the number of rows are different in actual and estimated.

From
Claudio Freire
Date:
On Fri, Dec 14, 2012 at 5:25 PM, AI Rumman <rummandba@gmail.com> wrote:
> Are you suggesting to make different table for Contacts, Candidate etc.

Yes