Re: Why does the number of rows are different in actual and estimated. - Mailing list pgsql-performance

From AI Rumman
Subject Re: Why does the number of rows are different in actual and estimated.
Date
Msg-id CAGoODpeUtvM61pnsOCZHc+YoEkX1-k7seJw-73yLKPXvJJLXqg@mail.gmail.com
In response to Re: Why does the number of rows are different in actual and estimated.  (Claudio Freire <klaussfreire@gmail.com>)
Responses Re: Why does the number of rows are different in actual and estimated.  (Claudio Freire <klaussfreire@gmail.com>)
List pgsql-performance
Yes, I do have a column in the entity table,
setype, with values such as 'Contacts', 'Candidate', etc.
I also have an index on it.
Are you suggesting that I make separate tables for Contacts, Candidate, etc.?

On Fri, Dec 14, 2012 at 3:10 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
On Fri, Dec 14, 2012 at 4:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <kgrittn@mail.com> writes:
>> AI Rumman wrote:
>>> Does FK Constraint help to improve performance? Or it is only
>>> for maintaining data integrity?
>
>> I'm not aware of any situation where adding a foreign key
>> constraint would improve performance.
>
> There's been talk of teaching the planner to use the existence of FK
> constraints to improve plans, but I don't believe any such thing is
> in the code today.

That made me look at the code.

So, eqjoinsel_inner in selfuncs.c would need those smarts. Cool.

Anyway, reading the code, I think I can now spot the possible issue
behind all of this.

Selectivity is decided based on the number of distinct values on both
sides, and the table's name "entity" makes me think it's a table that
is reused for several things. That could be a problem, since that
inflates distinct values, feeding misinformation to the planner.
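
As a rough illustration of that point: when no usable MCV lists are
available, the equi-join estimate falls back to roughly
1/max(nd1, nd2), scaled by the non-null fractions. The Python below is
only a simplified model of that fallback, not the actual selfuncs.c
code. The function names and all the numbers are made up for
illustration.

```python
def eqjoin_selectivity(nd1, nd2, nullfrac1=0.0, nullfrac2=0.0):
    """Simplified no-MCV estimate: non-null fractions over the larger distinct count."""
    return (1.0 - nullfrac1) * (1.0 - nullfrac2) / max(nd1, nd2)


def estimated_join_rows(rows1, rows2, nd1, nd2):
    """Estimated join size = cross-product size times join selectivity."""
    return rows1 * rows2 * eqjoin_selectivity(nd1, nd2)


# Hypothetical numbers, not taken from this thread.
# rows_entity: entity rows assumed to survive the other WHERE clauses.
rows_entity, rows_other = 100_000, 50_000

# Distinct count inflated because every kind of entity shares one table.
shared = estimated_join_rows(rows_entity, rows_other, nd1=1_000_000, nd2=50_000)

# Distinct count if only the relevant kind of entity were counted.
per_kind = estimated_join_rows(rows_entity, rows_other, nd1=50_000, nd2=50_000)

print(f"shared table estimate:   {shared:,.0f} rows")
print(f"per-kind table estimate: {per_kind:,.0f} rows")
```

With the same input row counts, the larger distinct count on the
shared table pulls the estimate down by the ratio of the two distinct
counts. The real planner code has more cases than this (MCV matching,
clamping), so treat it as a way to see the sensitivity, nothing more.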

Rather than a generic "entity" table, perhaps it would be best to
separate the different entities into different tables. Failing that,
if you have an "entity type" kind of column, you could try refining
the join condition to filter by that kind. Hopefully there's an index
over entity kind, and the planner can then use more accurate MCV
data.
