Thread: New design for FK-based join selectivity estimation

New design for FK-based join selectivity estimation

From
Tom Lane
Date:
This is a branch of the discussion in
https://www.postgresql.org/message-id/flat/20160429102531.GA13701%40huehner.biz
but I'm starting a new thread as the original title is getting
increasingly off-topic.

I complained in that thread that the FK join selectivity patch had a
very brute-force approach to matching join qual clauses to FK
constraints, requiring a total of seven nested levels of looping to
get anything done, and expensively rediscovering the same facts over
and over.  Here is a sketch of what I think is a better way:

* During the planner's catalog data collection phase, construct a
single list (per PlannerInfo, not per RTE) of all relevant FKs.
In this data structure, each FK's referenced and referencing relations
are identified by relid (that is, RTE index) not just OID.  In the case
of a query containing a self-join, that would mean that the same FK
constraint could give rise to multiple list entries, one for each RTE
occurrence of its referenced or referencing target relation.  FKs not
relating two tables of the query are necessarily not in this list,
and we could also choose not to include single-column FKs.

* After finalizing equivalence classes, make a single pass through
the FK list and check each column-pair to see if it can be matched
to any EC (that is, both source and target columns are in the EC and
the comparison operator is in the EC's opfamily).  Mark each matched
column pair in the FK list data structure with a pointer to the EC.

* When performing join selectivity estimation, run through the FK list
a single time, ignoring entries that do not link a member of the join's
LHS to a member of the join's RHS.  This is a fairly cheap test given
the relid labels; it'd be approximately
if ((bms_is_member(fkinfo->src_relid, outer_rel->relids) &&     bms_is_member(fkinfo->dst_relid, inner_rel->relids)) ||
  (bms_is_member(fkinfo->dst_relid, outer_rel->relids) &&     bms_is_member(fkinfo->src_relid, inner_rel->relids)))
 

For each potentially interesting FK entry, run through the join
qual list.  A RestrictInfo that was generated from an EC matches
the FK if and only if that EC appears in the per-column markings;
other RestrictInfos are matched to one of the FK columns normally
(I think this path can ignore FK columns that have been matched to ECs).
At the end of that, we can determine whether all the FK columns have
been matched to some qual item, and we have a count and/or bitmapset
of the qual list entries that matched the FK.  Remember the FK entry
with the largest such count.

* After scanning the list, we have our best FK match and can proceed
with making the actual selectivity estimate as in the current code.

With this approach, we have an iteration structure like
 * once per join relation created   * for each foreign key constraint relevant to the query       (but skipping the
loopsbelow if it's not relevant to this join)     * for each join qual for the joinrel pair       * for each key column
inthat FK
 

which gets us down from seven nested loops to four, and also makes the
work done in the innermost loops significantly cheaper for the EC case,
which will be the more common one.  It's also much easier to make this
structure do zero extra work when there are no relevant FKs, which is
a pleasant property for extra planner work to have.

Now, we'll also add some per-FK-per-EC setup work, but my guess is
that that's negligible compared to the per-join-relation work.

It's possible that we could reduce the cost of matching non-EC join
quals as well, with some trick along the line of pre-matching non-EC
RestrictInfos to FK items.  I'm unsure that that is worth the trouble
though.

Thoughts?
        regards, tom lane



Re: New design for FK-based join selectivity estimation

From
Tom Lane
Date:
I wrote:
> ... Here is a sketch of what I think is a better way:
> ...
> It's possible that we could reduce the cost of matching non-EC join
> quals as well, with some trick along the line of pre-matching non-EC
> RestrictInfos to FK items.  I'm unsure that that is worth the trouble
> though.

After further thought, I believe that may well be worth doing.  That
is, someplace after deconstruct_jointree(), examine all the FKs and
match their columns not only to ECs but to non-EC joinclauses, which
we could find by trawling the joininfo list for either subject relation.
We'd end up with a EC pointer and/or a list of non-EC RestrictInfos
for each FK column.

The thing that makes this attractive is that at the end of this matching,
we would know exactly whether each FK is matched to the query as a whole:
either all its columns have matches, or they don't.  It's not necessary to
re-determine that for each joinrel pair that includes the FK's two subject
relations.  So the per-join-relation work would reduce to scanning the FK
list once to find the matched FK(s) that connect relations on opposite
sides of the join.  Once we've found such an FK, identifying which join
qual list items should be dropped in favor of applying the FK's
selectivity is also really easy: we just check the column markings.
        regards, tom lane



Re: New design for FK-based join selectivity estimation

From
Tomas Vondra
Date:
Hi,

On 06/04/2016 09:44 PM, Tom Lane wrote:
> This is a branch of the discussion in
> https://www.postgresql.org/message-id/flat/20160429102531.GA13701%40huehner.biz
> but I'm starting a new thread as the original title is getting
> increasingly off-topic.
>
> I complained in that thread that the FK join selectivity patch had a
> very brute-force approach to matching join qual clauses to FK
> constraints, requiring a total of seven nested levels of looping to
> get anything done, and expensively rediscovering the same facts over
> and over.  Here is a sketch of what I think is a better way:
>
> * During the planner's catalog data collection phase, construct a
> single list (per PlannerInfo, not per RTE) of all relevant FKs.
> In this data structure, each FK's referenced and referencing relations
> are identified by relid (that is, RTE index) not just OID.  In the case
> of a query containing a self-join, that would mean that the same FK
> constraint could give rise to multiple list entries, one for each RTE
> occurrence of its referenced or referencing target relation.  FKs not
> relating two tables of the query are necessarily not in this list,
> and we could also choose not to include single-column FKs.
>
> * After finalizing equivalence classes, make a single pass through
> the FK list and check each column-pair to see if it can be matched
> to any EC (that is, both source and target columns are in the EC and
> the comparison operator is in the EC's opfamily).  Mark each matched
> column pair in the FK list data structure with a pointer to the EC.
>
> * When performing join selectivity estimation, run through the FK list
> a single time, ignoring entries that do not link a member of the join's
> LHS to a member of the join's RHS.  This is a fairly cheap test given
> the relid labels; it'd be approximately
>
>     if ((bms_is_member(fkinfo->src_relid, outer_rel->relids) &&
>          bms_is_member(fkinfo->dst_relid, inner_rel->relids)) ||
>         (bms_is_member(fkinfo->dst_relid, outer_rel->relids) &&
>          bms_is_member(fkinfo->src_relid, inner_rel->relids)))
>
> For each potentially interesting FK entry, run through the join
> qual list.  A RestrictInfo that was generated from an EC matches
> the FK if and only if that EC appears in the per-column markings;
> other RestrictInfos are matched to one of the FK columns normally
> (I think this path can ignore FK columns that have been matched to ECs).
> At the end of that, we can determine whether all the FK columns have
> been matched to some qual item, and we have a count and/or bitmapset
> of the qual list entries that matched the FK.  Remember the FK entry
> with the largest such count.
>
> * After scanning the list, we have our best FK match and can proceed
> with making the actual selectivity estimate as in the current code.
>
> With this approach, we have an iteration structure like
>
>   * once per join relation created
>     * for each foreign key constraint relevant to the query
>         (but skipping the loops below if it's not relevant to this join)
>       * for each join qual for the joinrel pair
>         * for each key column in that FK
>
> which gets us down from seven nested loops to four, and also makes the
> work done in the innermost loops significantly cheaper for the EC case,
> which will be the more common one.  It's also much easier to make this
> structure do zero extra work when there are no relevant FKs, which is
> a pleasant property for extra planner work to have.
>
> Now, we'll also add some per-FK-per-EC setup work, but my guess is
> that that's negligible compared to the per-join-relation work.
>
> It's possible that we could reduce the cost of matching non-EC join
> quals as well, with some trick along the line of pre-matching non-EC
> RestrictInfos to FK items.  I'm unsure that that is worth the trouble
> though.
>
> Thoughts?

Firstly thanks for looking into this, and also for coming up with a very 
detailed design proposal.

I do agree this new design seems superior to the current one and it 
seems worth a try. I'd like to see how far we can get over the next few 
days (say, until the end of the week).

One of the recent issues with the current design was handling of 
inheritance / appendrels. ISTM the proposed design has the same issue, 
no? What happens if the relations are partitioned?

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: New design for FK-based join selectivity estimation

From
Tom Lane
Date:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> One of the recent issues with the current design was handling of 
> inheritance / appendrels. ISTM the proposed design has the same issue, 
> no? What happens if the relations are partitioned?

I haven't thought about inheritance in this proposal.  My initial feeling
is that considering the parent table's outgoing FKs (if any) as valid is
not unreasonable.  If it has any, probably all the children do too.
Not sure about incoming FKs, but there probably are none anyhow, since
our implementation doesn't really permit reasonable FK definitions that
reference a partitioned table.  In any case, whatever we might choose
to do differently for inheritance would be no harder in this scheme than
what's there now; plus, whatever it is, we'd do it once not once per join
relation.
        regards, tom lane



Re: New design for FK-based join selectivity estimation

From
Simon Riggs
Date:
On 4 June 2016 at 20:44, Tom Lane <tgl@sss.pgh.pa.us> wrote:
This is a branch of the discussion in
https://www.postgresql.org/message-id/flat/20160429102531.GA13701%40huehner.biz
but I'm starting a new thread as the original title is getting
increasingly off-topic.

I complained in that thread that the FK join selectivity patch had a
very brute-force approach to matching join qual clauses to FK
constraints, requiring a total of seven nested levels of looping to
get anything done, and expensively rediscovering the same facts over
and over.  Here is a sketch of what I think is a better way:

Thanks for your review and design notes here, which look like good improvements. 

Tomas has been discussing that with myself and others, but I just realised that might not be apparent on list, so just to mention there is activity on this and new code will be published very soon.


On the above mentioned thread, Tomas' analysis was this...
> There are probably a few reasonably simple things we could do - e.g. ignore foreign keys
> with just a single column, as the primary goal of the patch is improving estimates with
> multi-column foreign keys. I believe that covers a vast majority of foreign keys in the wild.

I agree with that comment. The relcache code retrieves all FKs, even ones that have a single column. Yet the planner code never uses them unless nKeys>1. That was masked somewhat by my two commits, treating the info as generic and then using only a very specific subset of it.

So a simple change is to make RelationGetFKeyList() only retrieve FKs with nKeys>1. Rename to RelationGetMultiColumnFKeyList(). That greatly reduces the scope for increased planning time.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: New design for FK-based join selectivity estimation

From
Tom Lane
Date:
Simon Riggs <simon@2ndquadrant.com> writes:
> So a simple change is to make RelationGetFKeyList() only retrieve FKs with
> nKeys>1. Rename to RelationGetMultiColumnFKeyList(). That greatly reduces
> the scope for increased planning time.

FWIW, I don't particularly agree with that.  It makes the relcache's fkey
storage extremely specific to this one use-case, a decision I expect we'd
regret later.  And the planner needs to filter the fkey list anyway,
because it only wants fkeys that link to tables that are also in the
current query.  Thus, my recommendation was that we should allow
RelationGetFKeyList to return a pointer directly to the cached info list
and require the planner to immediately copy (only) the entries that it
needs for the current query.

Another point here is that I'm now unconvinced that restricting the logic
to consider only multi-column fkeys is really what we want.  It looks to
me like the code can also improve estimates in the case where there are
multiple single-column FKs linking to the same target relation.  That
might not be too common for two-table queries, but I bet it happens a
lot in three-or-more-table queries.
        regards, tom lane



Re: New design for FK-based join selectivity estimation

From
Simon Riggs
Date:
On 13 June 2016 at 19:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Simon Riggs <simon@2ndquadrant.com> writes:
> So a simple change is to make RelationGetFKeyList() only retrieve FKs with
> nKeys>1. Rename to RelationGetMultiColumnFKeyList(). That greatly reduces
> the scope for increased planning time.

FWIW, I don't particularly agree with that.  It makes the relcache's fkey
storage extremely specific to this one use-case, a decision I expect we'd
regret later.

Hmm, clearly I thought that earlier also; that earlier thinking may be influencing you. My commits had the concept of generic FK info and then a specific optimization. So the main part of the planning problem was caused by stored info that would never be used, in 9.6.

What changes my mind here is 1) point in dev cycle, 2) the point that the list of FKs doesn't take into account whether the constraints are deferrable, deferred or immediate and whether they are valid/invalid. ISTM likely that we would care about those things in the future if we believe that info is generic.

But then each new usage of the info will have the same planning time problem to consider if they choose to extend the amount of info they hold.

Rejecting an optimization in 9.6 because it might be undone by later changes is surely a problem for those later changes to resolve.
 
And the planner needs to filter the fkey list anyway,
because it only wants fkeys that link to tables that are also in the
current query.  Thus, my recommendation was that we should allow
RelationGetFKeyList to return a pointer directly to the cached info list
and require the planner to immediately copy (only) the entries that it
needs for the current query.

Another point here is that I'm now unconvinced that restricting the logic
to consider only multi-column fkeys is really what we want.  It looks to
me like the code can also improve estimates in the case where there are
multiple single-column FKs linking to the same target relation.  That
might not be too common for two-table queries, but I bet it happens a
lot in three-or-more-table queries.

Is it realistic that we consider that at this point? Certainly not for myself, at least.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: New design for FK-based join selectivity estimation

From
Tom Lane
Date:
Simon Riggs <simon@2ndquadrant.com> writes:
> On 13 June 2016 at 19:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Another point here is that I'm now unconvinced that restricting the logic
>> to consider only multi-column fkeys is really what we want.  It looks to
>> me like the code can also improve estimates in the case where there are
>> multiple single-column FKs linking to the same target relation.  That
>> might not be too common for two-table queries, but I bet it happens a
>> lot in three-or-more-table queries.

> Is it realistic that we consider that at this point? Certainly not for
> myself, at least.

It's pretty much built into the redesign I proposed, or so I thought.
        regards, tom lane



Re: New design for FK-based join selectivity estimation

From
Tomas Vondra
Date:
Hi,

Attached is a reworked patch, mostly following the new design proposal
from this thread.

I'm not entirely happy with the code, but it's the best I've been able
to come up with by now and I won't be able to significantly improve that
due to travel over. Inevitably there will be issues in the code, and if
there's a chance of fixing them I'll do my best to do that over the
evenings at a hotel.

The filtering and matching to eclasses / join quals is triggered from
planmain.c - I believe this is the right place and that those pieces are
reasonably solid.

The estimation still happens in costsize.c, of course, but was modified
to use the pre-matched info. I have my doubts about this part, and I'm
sure Tom had something less complex / more efficient in mind (using the
pre-matched info).


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: New design for FK-based join selectivity estimation

From
Tom Lane
Date:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> Attached is a reworked patch, mostly following the new design proposal
> from this thread.

I've whacked this around quite a bit and am now reasonably happy with it.
The issue of planning speed hit should be pretty much gone, although I've
not done testing to prove that.  I've rearranged the actual selectivity
calculation some too, because tests I did here did not look very good for
anything but the plain-inner-join case.  It may be that more work is
needed there, but at least there's reasonable commentary about what we're
doing and why.

I have not adopted the plan of ignoring single-column FKs.  While it would
only take a couple more lines to do so, I think the argument that there
will be a material planning speed hit no longer has much force.  And I see
a couple of arguments in favor of allowing this code to trigger on
single-column FKs.  First, it should work well even when pg_statistic data
is missing or out of date, which would be an improvement; and second, we
are likely not going to get much beta testing of the behavior if we
restrict it to multi-col FKs.  So I think we should ship it like this for
beta even if we end up adding a filter against single-column FKs for
release.

Comments and testing appreciated.

            regards, tom lane

diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 08ed990..8e5b33c 100644
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
***************
*** 27,32 ****
--- 27,33 ----
  #include "nodes/plannodes.h"
  #include "nodes/relation.h"
  #include "utils/datum.h"
+ #include "utils/rel.h"


  /*
*************** _copyValue(const Value *from)
*** 4252,4257 ****
--- 4253,4276 ----
      return newnode;
  }

+
+ static ForeignKeyCacheInfo *
+ _copyForeignKeyCacheInfo(const ForeignKeyCacheInfo *from)
+ {
+     ForeignKeyCacheInfo *newnode = makeNode(ForeignKeyCacheInfo);
+
+     COPY_SCALAR_FIELD(conrelid);
+     COPY_SCALAR_FIELD(confrelid);
+     COPY_SCALAR_FIELD(nkeys);
+     /* COPY_SCALAR_FIELD might work for these, but let's not assume that */
+     memcpy(newnode->conkey, from->conkey, sizeof(newnode->conkey));
+     memcpy(newnode->confkey, from->confkey, sizeof(newnode->confkey));
+     memcpy(newnode->conpfeqop, from->conpfeqop, sizeof(newnode->conpfeqop));
+
+     return newnode;
+ }
+
+
  /*
   * copyObject
   *
*************** copyObject(const void *from)
*** 5050,5055 ****
--- 5069,5081 ----
              retval = _copyRoleSpec(from);
              break;

+             /*
+              * MISCELLANEOUS NODES
+              */
+         case T_ForeignKeyCacheInfo:
+             retval = _copyForeignKeyCacheInfo(from);
+             break;
+
          default:
              elog(ERROR, "unrecognized node type: %d", (int) nodeTag(from));
              retval = 0;            /* keep compiler quiet */
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index c7b4153..eaaa370 100644
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
***************
*** 30,35 ****
--- 30,36 ----
  #include "nodes/plannodes.h"
  #include "nodes/relation.h"
  #include "utils/datum.h"
+ #include "utils/rel.h"


  /*
*************** _outPlannerInfo(StringInfo str, const Pl
*** 2048,2053 ****
--- 2049,2055 ----
      WRITE_NODE_FIELD(append_rel_list);
      WRITE_NODE_FIELD(rowMarks);
      WRITE_NODE_FIELD(placeholder_list);
+     WRITE_NODE_FIELD(fkey_list);
      WRITE_NODE_FIELD(query_pathkeys);
      WRITE_NODE_FIELD(group_pathkeys);
      WRITE_NODE_FIELD(window_pathkeys);
*************** _outIndexOptInfo(StringInfo str, const I
*** 2138,2143 ****
--- 2140,2174 ----
  }

  static void
+ _outForeignKeyOptInfo(StringInfo str, const ForeignKeyOptInfo *node)
+ {
+     int            i;
+
+     WRITE_NODE_TYPE("FOREIGNKEYOPTINFO");
+
+     WRITE_UINT_FIELD(con_relid);
+     WRITE_UINT_FIELD(ref_relid);
+     WRITE_INT_FIELD(nkeys);
+     appendStringInfoString(str, " :conkey");
+     for (i = 0; i < node->nkeys; i++)
+         appendStringInfo(str, " %d", node->conkey[i]);
+     appendStringInfoString(str, " :confkey");
+     for (i = 0; i < node->nkeys; i++)
+         appendStringInfo(str, " %d", node->confkey[i]);
+     appendStringInfoString(str, " :conpfeqop");
+     for (i = 0; i < node->nkeys; i++)
+         appendStringInfo(str, " %u", node->conpfeqop[i]);
+     WRITE_INT_FIELD(nmatched);
+     /* for compactness, just print the number of matches per column: */
+     appendStringInfoString(str, " :eclass");
+     for (i = 0; i < node->nkeys; i++)
+         appendStringInfo(str, " %d", (node->eclass[i] != NULL));
+     appendStringInfoString(str, " :rinfos");
+     for (i = 0; i < node->nkeys; i++)
+         appendStringInfo(str, " %d", list_length(node->rinfos[i]));
+ }
+
+ static void
  _outEquivalenceClass(StringInfo str, const EquivalenceClass *node)
  {
      /*
*************** _outConstraint(StringInfo str, const Con
*** 3207,3212 ****
--- 3238,3264 ----
      }
  }

+ static void
+ _outForeignKeyCacheInfo(StringInfo str, const ForeignKeyCacheInfo *node)
+ {
+     int            i;
+
+     WRITE_NODE_TYPE("FOREIGNKEYCACHEINFO");
+
+     WRITE_OID_FIELD(conrelid);
+     WRITE_OID_FIELD(confrelid);
+     WRITE_INT_FIELD(nkeys);
+     appendStringInfoString(str, " :conkey");
+     for (i = 0; i < node->nkeys; i++)
+         appendStringInfo(str, " %d", node->conkey[i]);
+     appendStringInfoString(str, " :confkey");
+     for (i = 0; i < node->nkeys; i++)
+         appendStringInfo(str, " %d", node->confkey[i]);
+     appendStringInfoString(str, " :conpfeqop");
+     for (i = 0; i < node->nkeys; i++)
+         appendStringInfo(str, " %u", node->conpfeqop[i]);
+ }
+

  /*
   * outNode -
*************** outNode(StringInfo str, const void *obj)
*** 3607,3612 ****
--- 3659,3667 ----
              case T_IndexOptInfo:
                  _outIndexOptInfo(str, obj);
                  break;
+             case T_ForeignKeyOptInfo:
+                 _outForeignKeyOptInfo(str, obj);
+                 break;
              case T_EquivalenceClass:
                  _outEquivalenceClass(str, obj);
                  break;
*************** outNode(StringInfo str, const void *obj)
*** 3783,3788 ****
--- 3838,3846 ----
              case T_XmlSerialize:
                  _outXmlSerialize(str, obj);
                  break;
+             case T_ForeignKeyCacheInfo:
+                 _outForeignKeyCacheInfo(str, obj);
+                 break;

              default:

diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index e7f63f4..7ea6d57 100644
*** a/src/backend/optimizer/path/costsize.c
--- b/src/backend/optimizer/path/costsize.c
*************** static bool has_indexed_join_quals(NestP
*** 147,156 ****
--- 147,163 ----
  static double approx_tuple_count(PlannerInfo *root, JoinPath *path,
                     List *quals);
  static double calc_joinrel_size_estimate(PlannerInfo *root,
+                            RelOptInfo *outer_rel,
+                            RelOptInfo *inner_rel,
                             double outer_rows,
                             double inner_rows,
                             SpecialJoinInfo *sjinfo,
                             List *restrictlist);
+ static Selectivity get_foreign_key_join_selectivity(PlannerInfo *root,
+                                  Relids outer_relids,
+                                  Relids inner_relids,
+                                  SpecialJoinInfo *sjinfo,
+                                  List **restrictlist);
  static void set_rel_width(PlannerInfo *root, RelOptInfo *rel);
  static double relation_byte_size(double tuples, int width);
  static double page_size(double tuples, int width);
*************** set_joinrel_size_estimates(PlannerInfo *
*** 3837,3842 ****
--- 3844,3851 ----
                             List *restrictlist)
  {
      rel->rows = calc_joinrel_size_estimate(root,
+                                            outer_rel,
+                                            inner_rel,
                                             outer_rel->rows,
                                             inner_rel->rows,
                                             sjinfo,
*************** set_joinrel_size_estimates(PlannerInfo *
*** 3848,3855 ****
   *        Make a size estimate for a parameterized scan of a join relation.
   *
   * 'rel' is the joinrel under consideration.
!  * 'outer_rows', 'inner_rows' are the sizes of the (probably also
!  *        parameterized) join inputs under consideration.
   * 'sjinfo' is any SpecialJoinInfo relevant to this join.
   * 'restrict_clauses' lists the join clauses that need to be applied at the
   * join node (including any movable clauses that were moved down to this join,
--- 3857,3864 ----
   *        Make a size estimate for a parameterized scan of a join relation.
   *
   * 'rel' is the joinrel under consideration.
!  * 'outer_path', 'inner_path' are (probably also parameterized) Paths that
!  *        produce the relations being joined.
   * 'sjinfo' is any SpecialJoinInfo relevant to this join.
   * 'restrict_clauses' lists the join clauses that need to be applied at the
   * join node (including any movable clauses that were moved down to this join,
*************** set_joinrel_size_estimates(PlannerInfo *
*** 3860,3867 ****
   */
  double
  get_parameterized_joinrel_size(PlannerInfo *root, RelOptInfo *rel,
!                                double outer_rows,
!                                double inner_rows,
                                 SpecialJoinInfo *sjinfo,
                                 List *restrict_clauses)
  {
--- 3869,3876 ----
   */
  double
  get_parameterized_joinrel_size(PlannerInfo *root, RelOptInfo *rel,
!                                Path *outer_path,
!                                Path *inner_path,
                                 SpecialJoinInfo *sjinfo,
                                 List *restrict_clauses)
  {
*************** get_parameterized_joinrel_size(PlannerIn
*** 3877,3884 ****
       * estimate for any pair with the same parameterization.
       */
      nrows = calc_joinrel_size_estimate(root,
!                                        outer_rows,
!                                        inner_rows,
                                         sjinfo,
                                         restrict_clauses);
      /* For safety, make sure result is not more than the base estimate */
--- 3886,3895 ----
       * estimate for any pair with the same parameterization.
       */
      nrows = calc_joinrel_size_estimate(root,
!                                        outer_path->parent,
!                                        inner_path->parent,
!                                        outer_path->rows,
!                                        inner_path->rows,
                                         sjinfo,
                                         restrict_clauses);
      /* For safety, make sure result is not more than the base estimate */
*************** get_parameterized_joinrel_size(PlannerIn
*** 3891,3905 ****
--- 3902,3923 ----
   * calc_joinrel_size_estimate
   *        Workhorse for set_joinrel_size_estimates and
   *        get_parameterized_joinrel_size.
+  *
+  * outer_rel/inner_rel are the relations being joined, but they should be
+  * assumed to have sizes outer_rows/inner_rows; those numbers might be less
+  * than what rel->rows says, when we are considering parameterized paths.
   */
  static double
  calc_joinrel_size_estimate(PlannerInfo *root,
+                            RelOptInfo *outer_rel,
+                            RelOptInfo *inner_rel,
                             double outer_rows,
                             double inner_rows,
                             SpecialJoinInfo *sjinfo,
                             List *restrictlist)
  {
      JoinType    jointype = sjinfo->jointype;
+     Selectivity fkselec;
      Selectivity jselec;
      Selectivity pselec;
      double        nrows;
*************** calc_joinrel_size_estimate(PlannerInfo *
*** 3910,3915 ****
--- 3928,3949 ----
       * double-counting them because they were not considered in estimating the
       * sizes of the component rels.
       *
+      * First, see whether any of the joinclauses can be matched to known FK
+      * constraints.  If so, drop those clauses from the restrictlist, and
+      * instead estimate their selectivity using FK semantics.  (We do this
+      * without regard to whether said clauses are local or "pushed down".
+      * Probably, an FK-matching clause could never be seen as pushed down at
+      * an outer join, since it would be strict and hence would be grounds for
+      * join strength reduction.)  fkselec gets the net selectivity for
+      * FK-matching clauses, or 1.0 if there are none.
+      */
+     fkselec = get_foreign_key_join_selectivity(root,
+                                                outer_rel->relids,
+                                                inner_rel->relids,
+                                                sjinfo,
+                                                &restrictlist);
+
+     /*
       * For an outer join, we have to distinguish the selectivity of the join's
       * own clauses (JOIN/ON conditions) from any clauses that were "pushed
       * down".  For inner joins we just count them all as joinclauses.
*************** calc_joinrel_size_estimate(PlannerInfo *
*** 3973,3988 ****
      switch (jointype)
      {
          case JOIN_INNER:
!             nrows = outer_rows * inner_rows * jselec;
              break;
          case JOIN_LEFT:
!             nrows = outer_rows * inner_rows * jselec;
              if (nrows < outer_rows)
                  nrows = outer_rows;
              nrows *= pselec;
              break;
          case JOIN_FULL:
!             nrows = outer_rows * inner_rows * jselec;
              if (nrows < outer_rows)
                  nrows = outer_rows;
              if (nrows < inner_rows)
--- 4007,4023 ----
      switch (jointype)
      {
          case JOIN_INNER:
!             nrows = outer_rows * inner_rows * fkselec * jselec;
!             /* pselec not used */
              break;
          case JOIN_LEFT:
!             nrows = outer_rows * inner_rows * fkselec * jselec;
              if (nrows < outer_rows)
                  nrows = outer_rows;
              nrows *= pselec;
              break;
          case JOIN_FULL:
!             nrows = outer_rows * inner_rows * fkselec * jselec;
              if (nrows < outer_rows)
                  nrows = outer_rows;
              if (nrows < inner_rows)
*************** calc_joinrel_size_estimate(PlannerInfo *
*** 3990,4000 ****
              nrows *= pselec;
              break;
          case JOIN_SEMI:
!             nrows = outer_rows * jselec;
              /* pselec not used */
              break;
          case JOIN_ANTI:
!             nrows = outer_rows * (1.0 - jselec);
              nrows *= pselec;
              break;
          default:
--- 4025,4035 ----
              nrows *= pselec;
              break;
          case JOIN_SEMI:
!             nrows = outer_rows * fkselec * jselec;
              /* pselec not used */
              break;
          case JOIN_ANTI:
!             nrows = outer_rows * (1.0 - fkselec * jselec);
              nrows *= pselec;
              break;
          default:
*************** calc_joinrel_size_estimate(PlannerInfo *
*** 4008,4013 ****
--- 4043,4261 ----
  }

  /*
+  * get_foreign_key_join_selectivity
+  *        Estimate join selectivity for foreign-key-related clauses.
+  *
+  * Remove any clauses that can be matched to FK constraints from *restrictlist,
+  * and return a substitute estimate of their selectivity.  1.0 is returned
+  * when there are no such clauses.
+  *
+  * The reason for treating such clauses specially is that we can get better
+  * estimates this way than by relying on clauselist_selectivity(), especially
+  * for multi-column FKs where that function's assumption that the clauses are
+  * independent falls down badly.  But even with single-column FKs, we may be
+  * able to get a better answer when the pg_statistic stats are missing or out
+  * of date.
+  */
+ static Selectivity
+ get_foreign_key_join_selectivity(PlannerInfo *root,
+                                  Relids outer_relids,
+                                  Relids inner_relids,
+                                  SpecialJoinInfo *sjinfo,
+                                  List **restrictlist)
+ {
+     Selectivity fkselec = 1.0;
+     JoinType    jointype = sjinfo->jointype;
+     List       *worklist = *restrictlist;
+     ListCell   *lc;
+
+     /* Consider each FK constraint that is known to match the query */
+     foreach(lc, root->fkey_list)
+     {
+         ForeignKeyOptInfo *fkinfo = (ForeignKeyOptInfo *) lfirst(lc);
+         bool        ref_is_outer;
+         List       *removedlist;
+         ListCell   *cell;
+         ListCell   *prev;
+         ListCell   *next;
+
+         /*
+          * This FK is not relevant unless it connects a baserel on one side of
+          * this join to a baserel on the other side.
+          */
+         if (bms_is_member(fkinfo->con_relid, outer_relids) &&
+             bms_is_member(fkinfo->ref_relid, inner_relids))
+             ref_is_outer = false;
+         else if (bms_is_member(fkinfo->ref_relid, outer_relids) &&
+                  bms_is_member(fkinfo->con_relid, inner_relids))
+             ref_is_outer = true;
+         else
+             continue;
+
+         /*
+          * Modify the restrictlist by removing clauses that match the FK (and
+          * putting them into removedlist instead).  It seems unsafe to modify
+          * the originally-passed List structure, so we make a shallow copy the
+          * first time through.
+          */
+         if (worklist == *restrictlist)
+             worklist = list_copy(worklist);
+
+         removedlist = NIL;
+         prev = NULL;
+         for (cell = list_head(worklist); cell; cell = next)
+         {
+             RestrictInfo *rinfo = (RestrictInfo *) lfirst(cell);
+             bool        remove_it = false;
+             int            i;
+
+             next = lnext(cell);
+             /* Drop this clause if it matches any column of the FK */
+             for (i = 0; i < fkinfo->nkeys; i++)
+             {
+                 if (rinfo->parent_ec)
+                 {
+                     /*
+                      * EC-derived clauses can only match by EC.  Note that
+                      * checking parent_ec is a bit of a cheat because there
+                      * are EC-derived clauses that don't have parent_ec set;
+                      * but such clauses must compare expressions that aren't
+                      * just Vars, so they cannot match the FK anyway.
+                      */
+                     if (fkinfo->eclass[i] == rinfo->parent_ec)
+                     {
+                         remove_it = true;
+                         break;
+                     }
+                 }
+                 else
+                 {
+                     /* Otherwise, see if rinfo was previously matched to EC */
+                     if (list_member_ptr(fkinfo->rinfos[i], rinfo))
+                     {
+                         remove_it = true;
+                         break;
+                     }
+                 }
+             }
+             if (remove_it)
+             {
+                 worklist = list_delete_cell(worklist, cell, prev);
+                 removedlist = lappend(removedlist, rinfo);
+             }
+             else
+                 prev = cell;
+         }
+
+         /*
+          * If we failed to remove any matching clauses, chicken out and ignore
+          * this FK; applying its selectivity might result in double-counting.
+          * It's not obvious that this is correct, but remember that we already
+          * know the FK matches query quals someplace.  There are just two
+          * reasons we might fail to find matching clauses in this join:
+          *
+          * Some FK we matched earlier might have removed the relevant clause.
+          * This is particularly likely with EC matches, since equivclass.c
+          * provides only a single EC-derived clause per join.  But that means
+          * we have something like A.R1 and B.R2 on one side of the join that
+          * each reference C.PK on the other side, in which case applying the
+          * selectivities of both FK constraints would be double-counting.
+          *
+          * Otherwise, if we didn't find the matching clause in this join, it
+          * must be outerjoin-delayed and will be applied at some higher join
+          * level.  Applying the FK's selectivity anyway would result in
+          * underestimating both this join size and the higher join level's.
+          *
+          * For a multi-column FK, it's possible that we found matches to some
+          * columns but not all, implying that one of the above effects applied
+          * to just some of the columns.  For the moment, we go ahead and
+          * remove those clauses and apply the FK's selectivity anyway.  It
+          * might be better to put back the removed clauses and ignore the FK;
+          * but that amounts to betting on independence of the clauses, which
+          * doesn't seem like a good bet in such messy cases.
+          */
+         if (removedlist == NIL)
+             continue;
+
+         /*
+          * Finally we get to the payoff: estimate selectivity using the
+          * knowledge that each referencing row will match exactly one row in
+          * the referenced table.
+          *
+          * XXX that's not true in the presence of nulls in the referencing
+          * column(s), so in principle we should derate the estimate for those.
+          * However (1) if there are any strict restriction clauses for the
+          * referencing column(s) elsewhere in the query, derating here would
+          * be double-counting the null fraction, and (2) it's not very clear
+          * how to combine null fractions for multiple referencing columns.
+          *
+          * In the first branch of the logic below, null derating is done
+          * implicitly by relying on clause_selectivity(); in the other two
+          * paths, we do nothing for now about correcting for nulls.
+          *
+          * XXX another point here is that if either side of an FK constraint
+          * is an inheritance parent, we estimate as though the constraint
+          * covers all its children as well.  This is not an unreasonable
+          * assumption for a referencing table, ie the user probably applied
+          * identical constraints to all child tables (though perhaps we ought
+          * to check that).  But it's not possible to have done that for a
+          * referenced table.  Fortunately, precisely because that doesn't
+          * work, it is uncommon in practice to have an FK referencing a parent
+          * table.  So, at least for now, disregard inheritance here.
+          */
+         if (ref_is_outer && jointype != JOIN_INNER)
+         {
+             /*
+              * When the referenced table is on the outer side of a non-inner
+              * join, knowing that each inner row has exactly one match is not
+              * as useful as one could wish, since we really need to know the
+              * fraction of outer rows with a match.  Still, we can avoid the
+              * folly of multiplying the per-column estimates together.  Take
+              * the smallest per-column selectivity, instead.  (This should
+              * correspond to the FK column with the most nulls.)
+              */
+             Selectivity thisfksel = 1.0;
+
+             foreach(cell, removedlist)
+             {
+                 RestrictInfo *rinfo = (RestrictInfo *) lfirst(cell);
+                 Selectivity csel;
+
+                 csel = clause_selectivity(root, (Node *) rinfo,
+                                           0, jointype, sjinfo);
+                 thisfksel = Min(thisfksel, csel);
+             }
+             fkselec *= thisfksel;
+         }
+         else if (jointype == JOIN_SEMI || jointype == JOIN_ANTI)
+         {
+             /*
+              * For JOIN_SEMI and JOIN_ANTI, the selectivity is defined as the
+              * fraction of LHS rows that have matches.  If the referenced
+              * table is on the inner side, that means the selectivity is 1.0
+              * (modulo nulls, which we're ignoring for now).  We already
+              * covered the other case, so no work here.
+              */
+         }
+         else
+         {
+             /*
+              * Otherwise, selectivity is exactly 1/referenced-table-size; but
+              * guard against tuples == 0.  Note we should use the raw table
+              * tuple count, not any estimate of its filtered or joined size.
+              */
+             RelOptInfo *ref_rel = find_base_rel(root, fkinfo->ref_relid);
+             double        ref_tuples = Max(ref_rel->tuples, 1.0);
+
+             fkselec *= 1.0 / ref_tuples;
+         }
+     }
+
+     *restrictlist = worklist;
+     return fkselec;
+ }
+
+ /*
   * set_subquery_size_estimates
   *        Set the size estimates for a base relation that is a subquery.
   *
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index bfa0c65..0e50ad5 100644
*** a/src/backend/optimizer/path/equivclass.c
--- b/src/backend/optimizer/path/equivclass.c
*************** exprs_known_equal(PlannerInfo *root, Nod
*** 1926,1931 ****
--- 1926,2010 ----


  /*
+  * match_eclasses_to_foreign_key_col
+  *      See whether a foreign key column match is proven by any eclass.
+  *
+  * If the referenced and referencing Vars of the fkey's colno'th column are
+  * known equal due to any eclass, return that eclass; otherwise return NULL.
+  * (In principle there might be more than one matching eclass if multiple
+  * collations are involved, but since collation doesn't matter for equality,
+  * we ignore that fine point here.)  This is much like exprs_known_equal,
+  * except that we insist on the comparison operator matching the eclass, so
+  * that the result is definite not approximate.
+  */
+ EquivalenceClass *
+ match_eclasses_to_foreign_key_col(PlannerInfo *root,
+                                   ForeignKeyOptInfo *fkinfo,
+                                   int colno)
+ {
+     Index        var1varno = fkinfo->con_relid;
+     AttrNumber    var1attno = fkinfo->conkey[colno];
+     Index        var2varno = fkinfo->ref_relid;
+     AttrNumber    var2attno = fkinfo->confkey[colno];
+     Oid            eqop = fkinfo->conpfeqop[colno];
+     List       *opfamilies = NIL;        /* compute only if needed */
+     ListCell   *lc1;
+
+     foreach(lc1, root->eq_classes)
+     {
+         EquivalenceClass *ec = (EquivalenceClass *) lfirst(lc1);
+         bool        item1member = false;
+         bool        item2member = false;
+         ListCell   *lc2;
+
+         /* Never match to a volatile EC */
+         if (ec->ec_has_volatile)
+             continue;
+         /* Note: it seems okay to match to "broken" eclasses here */
+
+         foreach(lc2, ec->ec_members)
+         {
+             EquivalenceMember *em = (EquivalenceMember *) lfirst(lc2);
+             Var           *var;
+
+             if (em->em_is_child)
+                 continue;        /* ignore children here */
+
+             /* EM must be a Var, possibly with RelabelType */
+             var = (Var *) em->em_expr;
+             while (var && IsA(var, RelabelType))
+                 var = (Var *) ((RelabelType *) var)->arg;
+             if (!(var && IsA(var, Var)))
+                 continue;
+
+             /* Match? */
+             if (var->varno == var1varno && var->varattno == var1attno)
+                 item1member = true;
+             else if (var->varno == var2varno && var->varattno == var2attno)
+                 item2member = true;
+
+             /* Have we found both PK and FK column in this EC? */
+             if (item1member && item2member)
+             {
+                 /*
+                  * Succeed if eqop matches EC's opfamilies.  We could test
+                  * this before scanning the members, but it's probably cheaper
+                  * to test for member matches first.
+                  */
+                 if (opfamilies == NIL)    /* compute if we didn't already */
+                     opfamilies = get_mergejoin_opfamilies(eqop);
+                 if (equal(opfamilies, ec->ec_opfamilies))
+                     return ec;
+                 /* Otherwise, done with this EC, move on to the next */
+                 break;
+             }
+         }
+     }
+     return NULL;
+ }
+
+
+ /*
   * add_child_rel_equivalences
   *      Search for EC members that reference the parent_rel, and
   *      add transformed members referencing the child_rel.
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index 3d305eb..e28a8dc 100644
*** a/src/backend/optimizer/plan/analyzejoins.c
--- b/src/backend/optimizer/plan/analyzejoins.c
*************** remove_rel_from_query(PlannerInfo *root,
*** 433,438 ****
--- 433,443 ----
              distribute_restrictinfo_to_rels(root, rinfo);
          }
      }
+
+     /*
+      * There may be references to the rel in root->fkey_list, but if so,
+      * match_foreign_keys_to_quals() will get rid of them.
+      */
  }

  /*
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index 1a1c26a..4d8cf9f 100644
*** a/src/backend/optimizer/plan/initsplan.c
--- b/src/backend/optimizer/plan/initsplan.c
*************** build_implied_join_equality(Oid opno,
*** 2306,2311 ****
--- 2306,2454 ----
  }


+ /*
+  * match_foreign_keys_to_quals
+  *        Match foreign-key constraints to equivalence classes and join quals
+  *
+  * The idea here is to see which query join conditions match equality
+  * constraints of a foreign-key relationship.  For such join conditions,
+  * we can use the FK semantics to make selectivity estimates that are more
+  * reliable than estimating from statistics, especially for multiple-column
+  * FKs, where the normal assumption of independent conditions tends to fail.
+  *
+  * In this function we annotate the ForeignKeyOptInfos in root->fkey_list
+  * with info about which eclasses and join qual clauses they match, and
+  * discard any ForeignKeyOptInfos that are irrelevant for the query.
+  */
+ void
+ match_foreign_keys_to_quals(PlannerInfo *root)
+ {
+     List       *newlist = NIL;
+     ListCell   *lc;
+
+     foreach(lc, root->fkey_list)
+     {
+         ForeignKeyOptInfo *fkinfo = (ForeignKeyOptInfo *) lfirst(lc);
+         RelOptInfo *con_rel = find_base_rel(root, fkinfo->con_relid);
+         RelOptInfo *ref_rel = find_base_rel(root, fkinfo->ref_relid);
+         int            colno;
+
+         /*
+          * Ignore FK unless both rels are baserels.  This gets rid of FKs that
+          * link to inheritance child rels (otherrels) and those that link to
+          * rels removed by join removal (dead rels).
+          */
+         if (con_rel->reloptkind != RELOPT_BASEREL ||
+             ref_rel->reloptkind != RELOPT_BASEREL)
+             continue;
+
+         /*
+          * Scan the columns and try to match them to eclasses and quals.
+          *
+          * Note: for simple inner joins, any match should be in an eclass.
+          * "Loose" quals that match an FK equality must have been rejected for
+          * EC status because they are outer-join quals or similar.  We still
+          * consider them to match the FK, though, since that produces better
+          * selectivity estimates than not matching.
+          */
+         for (colno = 0; colno < fkinfo->nkeys; colno++)
+         {
+             AttrNumber    con_attno,
+                         ref_attno;
+             Oid            fpeqop;
+             ListCell   *lc2;
+
+             fkinfo->eclass[colno] = match_eclasses_to_foreign_key_col(root,
+                                                                       fkinfo,
+                                                                       colno);
+             /* Don't bother looking for loose quals if we got an EC match */
+             if (fkinfo->eclass[colno] != NULL)
+             {
+                 fkinfo->nmatched++;
+                 continue;
+             }
+
+             /*
+              * Scan joininfo list for relevant clauses.  Either rel's joininfo
+              * list would do equally well; we use con_rel's.
+              */
+             con_attno = fkinfo->conkey[colno];
+             ref_attno = fkinfo->confkey[colno];
+             fpeqop = InvalidOid;    /* we'll look this up only if needed */
+
+             foreach(lc2, con_rel->joininfo)
+             {
+                 RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc2);
+                 OpExpr       *clause = (OpExpr *) rinfo->clause;
+                 Var           *leftvar;
+                 Var           *rightvar;
+
+                 /* only binary OpExprs are useful for consideration */
+                 if (!IsA(clause, OpExpr) ||
+                     list_length(clause->args) != 2)
+                     continue;
+                 leftvar = (Var *) get_leftop((Expr *) clause);
+                 rightvar = (Var *) get_rightop((Expr *) clause);
+
+                 /* Operands must be Vars, possibly with RelabelType */
+                 while (leftvar && IsA(leftvar, RelabelType))
+                     leftvar = (Var *) ((RelabelType *) leftvar)->arg;
+                 if (!(leftvar && IsA(leftvar, Var)))
+                     continue;
+                 while (rightvar && IsA(rightvar, RelabelType))
+                     rightvar = (Var *) ((RelabelType *) rightvar)->arg;
+                 if (!(rightvar && IsA(rightvar, Var)))
+                     continue;
+
+                 /* Now try to match the vars to the current foreign key cols */
+                 if (fkinfo->ref_relid == leftvar->varno &&
+                     ref_attno == leftvar->varattno &&
+                     fkinfo->con_relid == rightvar->varno &&
+                     con_attno == rightvar->varattno)
+                 {
+                     /* Vars match, but is it the right operator? */
+                     if (clause->opno == fkinfo->conpfeqop[colno])
+                         fkinfo->rinfos[colno] = lappend(fkinfo->rinfos[colno],
+                                                         rinfo);
+                 }
+                 else if (fkinfo->ref_relid == rightvar->varno &&
+                          ref_attno == rightvar->varattno &&
+                          fkinfo->con_relid == leftvar->varno &&
+                          con_attno == leftvar->varattno)
+                 {
+                     /*
+                      * Reverse match, must check commutator operator.  Look it
+                      * up if we didn't already.  (In the worst case we might
+                      * do multiple lookups here, but that would require an FK
+                      * equality operator without commutator, which is
+                      * unlikely.)
+                      */
+                     if (!OidIsValid(fpeqop))
+                         fpeqop = get_commutator(fkinfo->conpfeqop[colno]);
+                     if (clause->opno == fpeqop)
+                         fkinfo->rinfos[colno] = lappend(fkinfo->rinfos[colno],
+                                                         rinfo);
+                 }
+             }
+             /* If we found any matching loose quals, count col as matched */
+             if (fkinfo->rinfos[colno])
+                 fkinfo->nmatched++;
+         }
+
+         /*
+          * Currently, we drop multicolumn FKs that aren't fully matched to the
+          * query.  Later we might figure out how to derive some sort of
+          * weakened estimate from them, in which case this test should be
+          * weakened to "if (fkinfo->nmatched > 0)".
+          */
+         if (fkinfo->nmatched == fkinfo->nkeys)
+             newlist = lappend(newlist, fkinfo);
+     }
+     /* Replace fkey_list, thereby discarding any useless entries */
+     root->fkey_list = newlist;
+ }
+
+
  /*****************************************************************************
   *
   *     CHECKS FOR MERGEJOINABLE AND HASHJOINABLE CLAUSES
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index edd95d8..27234ff 100644
*** a/src/backend/optimizer/plan/planmain.c
--- b/src/backend/optimizer/plan/planmain.c
*************** query_planner(PlannerInfo *root, List *t
*** 115,120 ****
--- 115,121 ----
      root->full_join_clauses = NIL;
      root->join_info_list = NIL;
      root->placeholder_list = NIL;
+     root->fkey_list = NIL;
      root->initial_rels = NIL;

      /*
*************** query_planner(PlannerInfo *root, List *t
*** 206,211 ****
--- 207,220 ----
      create_lateral_join_info(root);

      /*
+      * Match foreign keys to equivalence classes and join quals.  This must be
+      * done after finalizing equivalence classes, and it's useful to wait till
+      * after join removal so that we can skip processing foreign keys
+      * involving removed relations.
+      */
+     match_foreign_keys_to_quals(root);
+
+     /*
       * Look for join OR clauses that we can extract single-relation
       * restriction OR clauses from.
       */
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 6aa8192..b0658f2 100644
*** a/src/backend/optimizer/util/plancat.c
--- b/src/backend/optimizer/util/plancat.c
*************** int            constraint_exclusion = CONSTRAINT_
*** 52,57 ****
--- 52,59 ----
  get_relation_info_hook_type get_relation_info_hook = NULL;


+ static void get_relation_foreign_keys(PlannerInfo *root, RelOptInfo *rel,
+                           Relation relation);
  static bool infer_collation_opclass_match(InferenceElem *elem, Relation idxRel,
                                List *idxExprs);
  static int32 get_rel_data_width(Relation rel, int32 *attr_widths);
*************** static List *build_index_tlist(PlannerIn
*** 77,82 ****
--- 79,86 ----
   *    pages        number of pages
   *    tuples        number of tuples
   *
+  * Also, add information about the relation's foreign keys to root->fkey_list.
+  *
   * Also, initialize the attr_needed[] and attr_widths[] arrays.  In most
   * cases these are left as zeroes, but sometimes we need to compute attr
   * widths here, and we may as well cache the results for costsize.c.
*************** get_relation_info(PlannerInfo *root, Oid
*** 403,408 ****
--- 407,415 ----
          rel->fdwroutine = NULL;
      }

+     /* Collect info about relation's foreign keys, if relevant */
+     get_relation_foreign_keys(root, rel, relation);
+
      heap_close(relation, NoLock);

      /*
*************** get_relation_info(PlannerInfo *root, Oid
*** 415,420 ****
--- 422,516 ----
  }

  /*
+  * get_relation_foreign_keys -
+  *      Retrieves foreign key information for a given relation.
+  *
+  * ForeignKeyOptInfos for relevant foreign keys are created and added to
+  * root->fkey_list.  We do this now while we have the relcache entry open.
+  * We could sometimes avoid making useless ForeignKeyOptInfos if we waited
+  * until all RelOptInfos have been built, but the cost of re-opening the
+  * relcache entries would probably exceed any savings.
+  */
+ static void
+ get_relation_foreign_keys(PlannerInfo *root, RelOptInfo *rel,
+                           Relation relation)
+ {
+     List       *rtable = root->parse->rtable;
+     List       *cachedfkeys;
+     ListCell   *lc;
+
+     /*
+      * If it's not a baserel, we don't care about its FKs.  Also, if the query
+      * references only a single relation, we can skip the lookup since no FKs
+      * could satisfy the requirements below.
+      */
+     if (rel->reloptkind != RELOPT_BASEREL ||
+         list_length(rtable) < 2)
+         return;
+
+     /*
+      * Extract data about relation's FKs from the relcache.  Note that this
+      * list belongs to the relcache and might disappear in a cache flush, so
+      * we must not do any further catalog access within this function.
+      */
+     cachedfkeys = RelationGetFKeyList(relation);
+
+     /*
+      * Figure out which FKs are of interest for this query, and create
+      * ForeignKeyOptInfos for them.  We want only FKs that reference some
+      * other RTE of the current query.  In queries containing self-joins,
+      * there might be more than one other RTE for a referenced table, and we
+      * should make a ForeignKeyOptInfo for each occurrence.
+      *
+      * Ideally, we would ignore RTEs that correspond to non-baserels, but it's
+      * too hard to identify those here, so we might end up making some useless
+      * ForeignKeyOptInfos.  If so, match_foreign_keys_to_quals() will remove
+      * them again.
+      */
+     foreach(lc, cachedfkeys)
+     {
+         ForeignKeyCacheInfo *cachedfk = (ForeignKeyCacheInfo *) lfirst(lc);
+         Index        rti;
+         ListCell   *lc2;
+
+         /* conrelid should always be that of the table we're considering */
+         Assert(cachedfk->conrelid == RelationGetRelid(relation));
+
+         /* Scan to find other RTEs matching confrelid */
+         rti = 0;
+         foreach(lc2, rtable)
+         {
+             RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+             ForeignKeyOptInfo *info;
+
+             rti++;
+             /* Ignore if not the correct table */
+             if (rte->rtekind != RTE_RELATION ||
+                 rte->relid != cachedfk->confrelid)
+                 continue;
+             /* Ignore self-referential FKs; we only care about joins */
+             if (rti == rel->relid)
+                 continue;
+
+             /* OK, let's make an entry */
+             info = makeNode(ForeignKeyOptInfo);
+             info->con_relid = rel->relid;
+             info->ref_relid = rti;
+             info->nkeys = cachedfk->nkeys;
+             memcpy(info->conkey, cachedfk->conkey, sizeof(info->conkey));
+             memcpy(info->confkey, cachedfk->confkey, sizeof(info->confkey));
+             memcpy(info->conpfeqop, cachedfk->conpfeqop, sizeof(info->conpfeqop));
+             /* zero out fields to be filled by match_foreign_keys_to_quals */
+             info->nmatched = 0;
+             memset(info->eclass, 0, sizeof(info->eclass));
+             memset(info->rinfos, 0, sizeof(info->rinfos));
+
+             root->fkey_list = lappend(root->fkey_list, info);
+         }
+     }
+ }
+
+ /*
   * infer_arbiter_indexes -
   *      Determine the unique indexes used to arbitrate speculative insertion.
   *
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index ba185ae..a0a284b 100644
*** a/src/backend/optimizer/util/relnode.c
--- b/src/backend/optimizer/util/relnode.c
*************** get_joinrel_parampathinfo(PlannerInfo *r
*** 1264,1271 ****

      /* Estimate the number of rows returned by the parameterized join */
      rows = get_parameterized_joinrel_size(root, joinrel,
!                                           outer_path->rows,
!                                           inner_path->rows,
                                            sjinfo,
                                            *restrict_clauses);

--- 1264,1271 ----

      /* Estimate the number of rows returned by the parameterized join */
      rows = get_parameterized_joinrel_size(root, joinrel,
!                                           outer_path,
!                                           inner_path,
                                            sjinfo,
                                            *restrict_clauses);

diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index c958758..8d2ad01 100644
*** a/src/backend/utils/cache/relcache.c
--- b/src/backend/utils/cache/relcache.c
*************** RelationBuildDesc(Oid targetRelId, bool
*** 1049,1054 ****
--- 1049,1058 ----
      else
          relation->rd_rsdesc = NULL;

+     /* foreign key data is not loaded till asked for */
+     relation->rd_fkeylist = NIL;
+     relation->rd_fkeyvalid = false;
+
      /*
       * if it's an index, initialize index-related information
       */
*************** RelationDestroyRelation(Relation relatio
*** 2030,2040 ****
          else
              FreeTupleDesc(relation->rd_att);
      }
      list_free(relation->rd_indexlist);
      bms_free(relation->rd_indexattr);
      bms_free(relation->rd_keyattr);
      bms_free(relation->rd_idattr);
-     FreeTriggerDesc(relation->trigdesc);
      if (relation->rd_options)
          pfree(relation->rd_options);
      if (relation->rd_indextuple)
--- 2034,2045 ----
          else
              FreeTupleDesc(relation->rd_att);
      }
+     FreeTriggerDesc(relation->trigdesc);
+     list_free_deep(relation->rd_fkeylist);
      list_free(relation->rd_indexlist);
      bms_free(relation->rd_indexattr);
      bms_free(relation->rd_keyattr);
      bms_free(relation->rd_idattr);
      if (relation->rd_options)
          pfree(relation->rd_options);
      if (relation->rd_indextuple)
*************** CheckConstraintCmp(const void *a, const
*** 3804,3809 ****
--- 3809,3955 ----
  }

  /*
+  * RelationGetFKeyList -- get a list of foreign key info for the relation
+  *
+  * Returns a list of ForeignKeyCacheInfo structs, one per FK constraining
+  * the given relation.  This data is a direct copy of relevant fields from
+  * pg_constraint.  The list items are in no particular order.
+  *
+  * CAUTION: the returned list is part of the relcache's data, and could
+  * vanish in a relcache entry reset.  Callers must inspect or copy it
+  * before doing anything that might trigger a cache flush, such as
+  * system catalog accesses.  copyObject() can be used if desired.
+  * (We define it this way because current callers want to filter and
+  * modify the list entries anyway, so copying would be a waste of time.)
+  */
+ List *
+ RelationGetFKeyList(Relation relation)
+ {
+     List       *result;
+     Relation    conrel;
+     SysScanDesc conscan;
+     ScanKeyData skey;
+     HeapTuple    htup;
+     List       *oldlist;
+     MemoryContext oldcxt;
+
+     /* Quick exit if we already computed the list. */
+     if (relation->rd_fkeyvalid)
+         return relation->rd_fkeylist;
+
+     /* Fast path: if it doesn't have any triggers, it can't have FKs */
+     if (!relation->rd_rel->relhastriggers)
+         return NIL;
+
+     /*
+      * We build the list we intend to return (in the caller's context) while
+      * doing the scan.  After successfully completing the scan, we copy that
+      * list into the relcache entry.  This avoids cache-context memory leakage
+      * if we get some sort of error partway through.
+      */
+     result = NIL;
+
+     /* Prepare to scan pg_constraint for entries having conrelid = this rel. */
+     ScanKeyInit(&skey,
+                 Anum_pg_constraint_conrelid,
+                 BTEqualStrategyNumber, F_OIDEQ,
+                 ObjectIdGetDatum(RelationGetRelid(relation)));
+
+     conrel = heap_open(ConstraintRelationId, AccessShareLock);
+     conscan = systable_beginscan(conrel, ConstraintRelidIndexId, true,
+                                  NULL, 1, &skey);
+
+     while (HeapTupleIsValid(htup = systable_getnext(conscan)))
+     {
+         Form_pg_constraint constraint = (Form_pg_constraint) GETSTRUCT(htup);
+         ForeignKeyCacheInfo *info;
+         Datum        adatum;
+         bool        isnull;
+         ArrayType  *arr;
+         int            nelem;
+
+         /* consider only foreign keys */
+         if (constraint->contype != CONSTRAINT_FOREIGN)
+             continue;
+
+         info = makeNode(ForeignKeyCacheInfo);
+         info->conrelid = constraint->conrelid;
+         info->confrelid = constraint->confrelid;
+
+         /* Extract data from conkey field */
+         adatum = fastgetattr(htup, Anum_pg_constraint_conkey,
+                              conrel->rd_att, &isnull);
+         if (isnull)
+             elog(ERROR, "null conkey for rel %s",
+                  RelationGetRelationName(relation));
+
+         arr = DatumGetArrayTypeP(adatum);        /* ensure not toasted */
+         nelem = ARR_DIMS(arr)[0];
+         if (ARR_NDIM(arr) != 1 ||
+             nelem < 1 ||
+             nelem > INDEX_MAX_KEYS ||
+             ARR_HASNULL(arr) ||
+             ARR_ELEMTYPE(arr) != INT2OID)
+             elog(ERROR, "conkey is not a 1-D smallint array");
+
+         info->nkeys = nelem;
+         memcpy(info->conkey, ARR_DATA_PTR(arr), nelem * sizeof(AttrNumber));
+
+         /* Likewise for confkey */
+         adatum = fastgetattr(htup, Anum_pg_constraint_confkey,
+                              conrel->rd_att, &isnull);
+         if (isnull)
+             elog(ERROR, "null confkey for rel %s",
+                  RelationGetRelationName(relation));
+
+         arr = DatumGetArrayTypeP(adatum);        /* ensure not toasted */
+         nelem = ARR_DIMS(arr)[0];
+         if (ARR_NDIM(arr) != 1 ||
+             nelem != info->nkeys ||
+             ARR_HASNULL(arr) ||
+             ARR_ELEMTYPE(arr) != INT2OID)
+             elog(ERROR, "confkey is not a 1-D smallint array");
+
+         memcpy(info->confkey, ARR_DATA_PTR(arr), nelem * sizeof(AttrNumber));
+
+         /* Likewise for conpfeqop */
+         adatum = fastgetattr(htup, Anum_pg_constraint_conpfeqop,
+                              conrel->rd_att, &isnull);
+         if (isnull)
+             elog(ERROR, "null conpfeqop for rel %s",
+                  RelationGetRelationName(relation));
+
+         arr = DatumGetArrayTypeP(adatum);        /* ensure not toasted */
+         nelem = ARR_DIMS(arr)[0];
+         if (ARR_NDIM(arr) != 1 ||
+             nelem != info->nkeys ||
+             ARR_HASNULL(arr) ||
+             ARR_ELEMTYPE(arr) != OIDOID)
+             elog(ERROR, "conpfeqop is not a 1-D OID array");
+
+         memcpy(info->conpfeqop, ARR_DATA_PTR(arr), nelem * sizeof(Oid));
+
+         /* Add FK's node to the result list */
+         result = lappend(result, info);
+     }
+
+     systable_endscan(conscan);
+     heap_close(conrel, AccessShareLock);
+
+     /* Now save a copy of the completed list in the relcache entry. */
+     oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
+     oldlist = relation->rd_fkeylist;
+     relation->rd_fkeylist = copyObject(result);
+     relation->rd_fkeyvalid = true;
+     MemoryContextSwitchTo(oldcxt);
+
+     /* Don't leak the old list, if there is one */
+     list_free_deep(oldlist);
+
+     return result;
+ }
+
+ /*
   * RelationGetIndexList -- get a list of OIDs of indexes on this relation
   *
   * The index list is created only if someone requests it.  We scan pg_index
*************** load_relcache_init_file(bool shared)
*** 4892,4898 ****
           * format is complex and subject to change).  They must be rebuilt if
           * needed by RelationCacheInitializePhase3.  This is not expected to
           * be a big performance hit since few system catalogs have such. Ditto
!          * for index expressions, predicates, exclusion info, and FDW info.
           */
          rel->rd_rules = NULL;
          rel->rd_rulescxt = NULL;
--- 5038,5045 ----
           * format is complex and subject to change).  They must be rebuilt if
           * needed by RelationCacheInitializePhase3.  This is not expected to
           * be a big performance hit since few system catalogs have such. Ditto
!          * for RLS policy data, index expressions, predicates, exclusion info,
!          * and FDW info.
           */
          rel->rd_rules = NULL;
          rel->rd_rulescxt = NULL;
*************** load_relcache_init_file(bool shared)
*** 4914,4919 ****
--- 5061,5068 ----
          else
              rel->rd_refcnt = 0;
          rel->rd_indexvalid = 0;
+         rel->rd_fkeylist = NIL;
+         rel->rd_fkeyvalid = false;
          rel->rd_indexlist = NIL;
          rel->rd_oidindex = InvalidOid;
          rel->rd_replidindex = InvalidOid;
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index c4b9c14..8f46091 100644
*** a/src/include/nodes/nodes.h
--- b/src/include/nodes/nodes.h
*************** typedef enum NodeTag
*** 223,228 ****
--- 223,229 ----
      T_PlannerGlobal,
      T_RelOptInfo,
      T_IndexOptInfo,
+     T_ForeignKeyOptInfo,
      T_ParamPathInfo,
      T_Path,
      T_IndexPath,
*************** typedef enum NodeTag
*** 478,484 ****
      T_InlineCodeBlock,            /* in nodes/parsenodes.h */
      T_FdwRoutine,                /* in foreign/fdwapi.h */
      T_IndexAmRoutine,            /* in access/amapi.h */
!     T_TsmRoutine                /* in access/tsmapi.h */
  } NodeTag;

  /*
--- 479,486 ----
      T_InlineCodeBlock,            /* in nodes/parsenodes.h */
      T_FdwRoutine,                /* in foreign/fdwapi.h */
      T_IndexAmRoutine,            /* in access/amapi.h */
!     T_TsmRoutine,                /* in access/tsmapi.h */
!     T_ForeignKeyCacheInfo        /* in utils/rel.h */
  } NodeTag;

  /*
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index a4892cb..4d0cb2e 100644
*** a/src/include/nodes/relation.h
--- b/src/include/nodes/relation.h
*************** typedef struct PlannerInfo
*** 251,256 ****
--- 251,258 ----

      List       *placeholder_list;        /* list of PlaceHolderInfos */

+     List       *fkey_list;        /* list of ForeignKeyOptInfos */
+
      List       *query_pathkeys; /* desired pathkeys for query_planner() */

      List       *group_pathkeys; /* groupClause pathkeys, if any */
*************** typedef struct IndexOptInfo
*** 622,627 ****
--- 624,657 ----
      void        (*amcostestimate) ();    /* AM's cost estimator */
  } IndexOptInfo;

+ /*
+  * ForeignKeyOptInfo
+  *        Per-foreign-key information for planning/optimization
+  *
+  * The per-FK-column arrays can be fixed-size because we allow at most
+  * INDEX_MAX_KEYS columns in a foreign key constraint.  Each array has
+  * nkeys valid entries.
+  */
+ typedef struct ForeignKeyOptInfo
+ {
+     NodeTag        type;
+
+     /* Basic data about the foreign key (fetched from catalogs): */
+     Index        con_relid;        /* RT index of the referencing table */
+     Index        ref_relid;        /* RT index of the referenced table */
+     int            nkeys;            /* number of columns in the foreign key */
+     AttrNumber    conkey[INDEX_MAX_KEYS]; /* cols in referencing table */
+     AttrNumber    confkey[INDEX_MAX_KEYS];        /* cols in referenced table */
+     Oid            conpfeqop[INDEX_MAX_KEYS];        /* PK = FK operator OIDs */
+
+     /* Derived info about whether FK's equality conditions match the query: */
+     int            nmatched;        /* number of column conditions matched */
+     /* Pointer to eclass matching each column's condition, if there is one */
+     struct EquivalenceClass *eclass[INDEX_MAX_KEYS];
+     /* List of non-EC RestrictInfos matching each column's condition */
+     List       *rinfos[INDEX_MAX_KEYS];
+ } ForeignKeyOptInfo;
+

  /*
   * EquivalenceClasses
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index f41f9e9..2a4df2f 100644
*** a/src/include/optimizer/cost.h
--- b/src/include/optimizer/cost.h
*************** extern double get_parameterized_baserel_
*** 167,174 ****
                                 List *param_clauses);
  extern double get_parameterized_joinrel_size(PlannerInfo *root,
                                 RelOptInfo *rel,
!                                double outer_rows,
!                                double inner_rows,
                                 SpecialJoinInfo *sjinfo,
                                 List *restrict_clauses);
  extern void set_joinrel_size_estimates(PlannerInfo *root, RelOptInfo *rel,
--- 167,174 ----
                                 List *param_clauses);
  extern double get_parameterized_joinrel_size(PlannerInfo *root,
                                 RelOptInfo *rel,
!                                Path *outer_path,
!                                Path *inner_path,
                                 SpecialJoinInfo *sjinfo,
                                 List *restrict_clauses);
  extern void set_joinrel_size_estimates(PlannerInfo *root, RelOptInfo *rel,
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index f3b25e2..a0008ad 100644
*** a/src/include/optimizer/paths.h
--- b/src/include/optimizer/paths.h
*************** extern List *generate_join_implied_equal
*** 139,144 ****
--- 139,147 ----
                                           Relids outer_relids,
                                           RelOptInfo *inner_rel);
  extern bool exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2);
+ extern EquivalenceClass *match_eclasses_to_foreign_key_col(PlannerInfo *root,
+                                   ForeignKeyOptInfo *fkinfo,
+                                   int colno);
  extern void add_child_rel_equivalences(PlannerInfo *root,
                             AppendRelInfo *appinfo,
                             RelOptInfo *parent_rel,
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index a48400b..c529085 100644
*** a/src/include/optimizer/planmain.h
--- b/src/include/optimizer/planmain.h
*************** extern RestrictInfo *build_implied_join_
*** 95,100 ****
--- 95,101 ----
                              Expr *item2,
                              Relids qualscope,
                              Relids nullable_relids);
+ extern void match_foreign_keys_to_quals(PlannerInfo *root);

  /*
   * prototypes for plan/analyzejoins.c
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index fd858fd..ed14442 100644
*** a/src/include/utils/rel.h
--- b/src/include/utils/rel.h
*************** typedef struct RelationData
*** 90,95 ****
--- 90,99 ----
      /* use "struct" here to avoid needing to include rowsecurity.h: */
      struct RowSecurityDesc *rd_rsdesc;    /* row security policies, or NULL */

+     /* data managed by RelationGetFKeyList: */
+     List       *rd_fkeylist;    /* list of ForeignKeyCacheInfo (see below) */
+     bool        rd_fkeyvalid;    /* true if list has been computed */
+
      /* data managed by RelationGetIndexList: */
      List       *rd_indexlist;    /* list of OIDs of indexes on relation */
      Oid            rd_oidindex;    /* OID of unique index on OID, if any */
*************** typedef struct RelationData
*** 170,175 ****
--- 174,207 ----
      struct PgStat_TableStatus *pgstat_info;        /* statistics collection area */
  } RelationData;

+
+ /*
+  * ForeignKeyCacheInfo
+  *        Information the relcache can cache about foreign key constraints
+  *
+  * This is basically just an image of relevant columns from pg_constraint.
+  * We make it a subclass of Node so that copyObject() can be used on a list
+  * of these, but we also ensure it is a "flat" object without substructure,
+  * so that list_free_deep() is sufficient to free such a list.
+  * The per-FK-column arrays can be fixed-size because we allow at most
+  * INDEX_MAX_KEYS columns in a foreign key constraint.
+  *
+  * Currently, we only cache fields of interest to the planner, but the
+  * set of fields could be expanded in future.
+  */
+ typedef struct ForeignKeyCacheInfo
+ {
+     NodeTag        type;
+     Oid            conrelid;        /* relation constrained by the foreign key */
+     Oid            confrelid;        /* relation referenced by the foreign key */
+     int            nkeys;            /* number of columns in the foreign key */
+     /* these arrays each have nkeys valid entries: */
+     AttrNumber    conkey[INDEX_MAX_KEYS]; /* cols in referencing table */
+     AttrNumber    confkey[INDEX_MAX_KEYS];        /* cols in referenced table */
+     Oid            conpfeqop[INDEX_MAX_KEYS];        /* PK = FK operator OIDs */
+ } ForeignKeyCacheInfo;
+
+
  /*
   * StdRdOptions
   *        Standard contents of rd_options for heaps and generic indexes.
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 1b48304..6ea7dd2 100644
*** a/src/include/utils/relcache.h
--- b/src/include/utils/relcache.h
*************** extern void RelationClose(Relation relat
*** 37,42 ****
--- 37,43 ----
  /*
   * Routines to compute/retrieve additional cached information
   */
+ extern List *RelationGetFKeyList(Relation relation);
  extern List *RelationGetIndexList(Relation relation);
  extern Oid    RelationGetOidIndex(Relation relation);
  extern Oid    RelationGetReplicaIndex(Relation relation);

Re: New design for FK-based join selectivity estimation

From
Tomas Vondra
Date:
Hi,

On 06/16/2016 01:00 AM, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> Attached is a reworked patch, mostly following the new design proposal
>> from this thread.
>
> I've whacked this around quite a bit and am now reasonably happy with it.
> The issue of planning speed hit should be pretty much gone, although I've
> not done testing to prove that.  I've rearranged the actual selectivity
> calculation some too, because tests I did here did not look very good for
> anything but the plain-inner-join case.  It may be that more work is
> needed there, but at least there's reasonable commentary about what we're
> doing and why.

Thanks for getting the patch into a much better shape. I've quickly 
reviewed the patch this morning before leaving to the airport - I do 
plan to do additional review/testing once in the air or perhaps over the 
weekend. So at the moment I only have a few minor comments:

1) Shouldn't we define a new macro in copyfuncs.c, to do the memcpy for 
us? Seems a bit strange we have macros for everything else.

2) I'm wondering whether removing the restrict infos when
    if (fkinfo->eclass[i] == rinfo->parent_ec)

is actually correct. Can't the EC include conditions that do not match 
the FK? I mean something like this:
  CREATE TABLE a (id1 INT PRIMARY KEY, id2 INT);  CREATE TABLE b (id1 INT REFERENCES a (id1), id2 INT);

and then something like
  SELECT * FROM a JOIN b ON (a.id1 = b.id1 AND a.id1 = b.id2)

I've been unable to trigger this issue with your patch, but I do 
remember I've ran into that with my version, which is why I explicitly 
checked the rinfo again before removing it. Maybe that was incorrect, or 
perhaps your patch does something smart. Or maybe it's just masked by 
the fact that we 'push' the EC conditions to the base relations (which 
however means we're stuck with default equality estimate).

3) I think this comment in get_foreign_key_join_selectivity is wrong and 
should instead say 'to FK':
  /* Otherwise, see if rinfo was previously matched to EC */

>
> I have not adopted the plan of ignoring single-column FKs.  While it would
> only take a couple more lines to do so, I think the argument that there
> will be a material planning speed hit no longer has much force.  And I see
> a couple of arguments in favor of allowing this code to trigger on
> single-column FKs.  First, it should work well even when pg_statistic data
> is missing or out of date, which would be an improvement; and second, we
> are likely not going to get much beta testing of the behavior if we
> restrict it to multi-col FKs.  So I think we should ship it like this for
> beta even if we end up adding a filter against single-column FKs for
> release.

OK

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: New design for FK-based join selectivity estimation

From
Tom Lane
Date:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> Thanks for getting the patch into a much better shape. I've quickly 
> reviewed the patch this morning before leaving to the airport - I do 
> plan to do additional review/testing once in the air or perhaps over the 
> weekend. So at the moment I only have a few minor comments:

> 1) Shouldn't we define a new macro in copyfuncs.c, to do the memcpy for 
> us? Seems a bit strange we have macros for everything else.

Perhaps, but it seemed not that compelling since we need bespoke code for
those fields in outfuncs.c etc.  Maybe it would be worth thinking about
macro infrastructure for array-type fields in all of those modules.

> 2) I'm wondering whether removing the restrict infos when
>      if (fkinfo->eclass[i] == rinfo->parent_ec)
> is actually correct. Can't the EC include conditions that do not match 
> the FK?

Doesn't matter.  The point is that it *does* include a condition that
does match the FK, whether it chose to generate exactly that condition
for this join or some related one.

> I mean something like this:
>    CREATE TABLE a (id1 INT PRIMARY KEY, id2 INT);
>    CREATE TABLE b (id1 INT REFERENCES a (id1), id2 INT);
> and then something like
>    SELECT * FROM a JOIN b ON (a.id1 = b.id1 AND a.id1 = b.id2)

Right.  In this case we'll have an EC containing {a.id1, b.id1, b.id2}
which means that equivclass.c will generate a restriction condition
b.id1 = b.id2 to be applied at the scan of b.  At the join level, it
has a choice whether to generate a.id1 = b.id1 or a.id1 = b.id2.
It could generate both, but that would be pointlessly inefficient (and
would likely confuse the selectivity estimators, too).  But even if it
chooses to generate a.id1 = b.id2, we should recognize that the FK is
matched.  What we're effectively doing by dropping that clause in
favor of treating the FK as matched is overridding equivclass.c's
arbitrary choice of which join clause to generate with an equally valid
choice that is easier to estimate for.

> 3) I think this comment in get_foreign_key_join_selectivity is wrong and 
> should instead say 'to FK':
>    /* Otherwise, see if rinfo was previously matched to EC */

Duh, yeah.


I rewrote the comments in this section to clarify a bit:
           /* Drop this clause if it matches any column of the FK */           for (i = 0; i < fkinfo->nkeys; i++)
    {               if (rinfo->parent_ec)               {                   /*                    * EC-derived clauses
canonly match by EC.  It is okay to                    * consider any clause derived from the same EC as
   * matching the FK: even if equivclass.c chose to generate                    * a clause equating some other pair of
Vars,it could                    * have generated one equating the FK's Vars.  So for                    * purposes of
estimation,we can act as though it did so.                    *                    * Note: checking parent_ec is a bit
ofa cheat because                    * there are EC-derived clauses that don't have parent_ec                    * set;
butsuch clauses must compare expressions that                    * aren't just Vars, so they cannot match the FK
anyway.                   */                   if (fkinfo->eclass[i] == rinfo->parent_ec)                   {
           remove_it = true;                       break;                   }               }               else
      {                   /*                    * Otherwise, see if rinfo was previously matched to FK as
    * a "loose" clause.                    */                   if (list_member_ptr(fkinfo->rinfos[i], rinfo))
        {                       remove_it = true;                       break;                   }               }
    }
 

        regards, tom lane



Re: New design for FK-based join selectivity estimation

From
Tomas Vondra
Date:
Hi,

A few more comments, about re-reading the patch more thoroughly. I 
wouldn't say any of those qualify as bugs, but rather as discussion 
about some of the design choices:

1) NULL handling

I'd argue that we should do something about this, although I agree it's 
non-trivial to estimate - at least until we get some sort of correlation 
stats (e.g. my patch already provides most of the pieces, I believe). 
But I'd argue that in the case of multi-column foreign keys we can do 
better even without it - my experience is that in such cases either all 
values are NULL or none of them, and a single NULL value breaks the FK 
of course. So I think max(null_frac) would work.

2) get_foreign_key_join_selectivity vs. incomplete matches

The comment right before checking (removedlist == NIL) says:
 * For a multi-column FK, it's possible that we found matches to some * columns but not all, implying that one of the
aboveeffects applied * to just some of the columns.  For the moment, we go ahead and * remove those clauses and apply
theFK's selectivity anyway.  It * might be better to put back the removed clauses and ignore the FK; * but that amounts
tobetting on independence of the clauses, which * doesn't seem like a good bet in such messy cases.
 

Is this a good idea? I'd probably vote to do what's proposed (and 
rejected) in the second half of the comment, i.e. put back the clauses 
and skip the FK as that pretty much says "improve estimate or keep the 
current one, but don't risk making it worse."

3) ForeignKeyOptInfo->rinfos as a List

Can we actually get a list of matching RestrictInfos for a single 
foreign key? I've been unable to construct such query.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: New design for FK-based join selectivity estimation

From
Tom Lane
Date:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> A few more comments, about re-reading the patch more thoroughly. I 
> wouldn't say any of those qualify as bugs, but rather as discussion 
> about some of the design choices:

> 1) NULL handling

> I'd argue that we should do something about this, although I agree it's 
> non-trivial to estimate - at least until we get some sort of correlation 
> stats (e.g. my patch already provides most of the pieces, I believe). 

I concur, actually, but I feel that it's out of scope for this particular
patch, which is only trying to replace the functionality that was
committed previously.  If you want to come up with a patch on top of this
that adds some accounting for NULLs, I'd be willing to consider it as
a post-beta2 improvement.

> But I'd argue that in the case of multi-column foreign keys we can do 
> better even without it - my experience is that in such cases either all 
> values are NULL or none of them, and a single NULL value breaks the FK 
> of course. So I think max(null_frac) would work.

Yeah, I was thinking along the same lines: max of the per-column null
fractions is probably an OK estimate.

> The comment right before checking (removedlist == NIL) says:
>   * For a multi-column FK, it's possible that we found matches to some
>   * columns but not all, implying that one of the above effects applied
>   * to just some of the columns.  For the moment, we go ahead and
>   * remove those clauses and apply the FK's selectivity anyway.  It
>   * might be better to put back the removed clauses and ignore the FK;
>   * but that amounts to betting on independence of the clauses, which
>   * doesn't seem like a good bet in such messy cases.

> Is this a good idea? I'd probably vote to do what's proposed (and 
> rejected) in the second half of the comment, i.e. put back the clauses 
> and skip the FK as that pretty much says "improve estimate or keep the 
> current one, but don't risk making it worse."

I would buy that approach when it comes to "loose" quals, but I think
it's not right for EC-derived matches, because of the behavior we
discussed previously that an EC should be considered to have implicitly
generated all the matching quals even though it actually made only one
qual that might not be any of them exactly.

Now on the other side of the coin, if several FKs match the same EC,
we might be overshooting the mark to assume that we can just multiply
their selectivity estimates together.  But I think that's not really
a matter for the matching logic, but rather a question of how we want
to combine the estimates for multiple FKs.

Possibly what we could do here is assume that EC matches succeed,
whether there's a matching qual physically in the list or not, but
require a qual match for each column with non-EC matches.  That
would complicate the logic slightly but not terribly (he says without
having actually tried to code it).

If you have a proposal about adjusting the net selectivity estimate when
multiple FKs match the join, let's hear it.  But again that seems like
something we could address as a post-beta2 refinement.

> 3) ForeignKeyOptInfo->rinfos as a List
> Can we actually get a list of matching RestrictInfos for a single 
> foreign key? I've been unable to construct such query.

I think you'd actually have to write redundant outer join quals,
along the lines of     select ... a left join b on (a.x = b.y and a.x = b.y)
I don't believe we take the trouble to eliminate such duplicates
unless they get absorbed by an EC, which outer-join quals would
not be.  (Haven't tried this, though, as I don't have the patch
installed right now.)


The beta2 deadline is just about upon us; I feel that if we're going
to get this into this release at all, we need to push it today so
that we get a full buildfarm cycle on it before the wrap.

I plan to spend an hour or two adjusting the qual match logic as
discussed above, and re-reading the whole patch another time for
sanity.  If I've not heard objections by the time I'm done,
I will push it.
        regards, tom lane



Re: New design for FK-based join selectivity estimation

From
Tom Lane
Date:
I wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> Is this a good idea? I'd probably vote to do what's proposed (and 
>> rejected) in the second half of the comment, i.e. put back the clauses 
>> and skip the FK as that pretty much says "improve estimate or keep the 
>> current one, but don't risk making it worse."

> I would buy that approach when it comes to "loose" quals, but I think
> it's not right for EC-derived matches, because of the behavior we
> discussed previously that an EC should be considered to have implicitly
> generated all the matching quals even though it actually made only one
> qual that might not be any of them exactly.

After further thought I decided you're right, because one of the main
original goals of ECs was to prevent double-counting the selectivity
of redundant equality quals.  Acting as though the EC had generated
multiple redundant quals is likely to make things worse not better.

I still feel that we're leaving something on the table here, but it
would need to take the form of intelligently combining estimates for
overlapping FKs, not just blindly multiplying them together.  Seems
like a task for later, especially considering that cases of this sort
are likely to be rare.

Pushed with adjustments for that.
        regards, tom lane



Re: New design for FK-based join selectivity estimation

From
Tomas Vondra
Date:
On 06/18/2016 09:27 PM, Tom Lane wrote:
> I wrote:
>> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>>> Is this a good idea? I'd probably vote to do what's proposed (and
>>> rejected) in the second half of the comment, i.e. put back the clauses
>>> and skip the FK as that pretty much says "improve estimate or keep the
>>> current one, but don't risk making it worse."
>
>> I would buy that approach when it comes to "loose" quals, but I think
>> it's not right for EC-derived matches, because of the behavior we
>> discussed previously that an EC should be considered to have implicitly
>> generated all the matching quals even though it actually made only one
>> qual that might not be any of them exactly.
>
> After further thought I decided you're right, because one of the main
> original goals of ECs was to prevent double-counting the selectivity
> of redundant equality quals.  Acting as though the EC had generated
> multiple redundant quals is likely to make things worse not better.
>
> I still feel that we're leaving something on the table here, but it
> would need to take the form of intelligently combining estimates for
> overlapping FKs, not just blindly multiplying them together.  Seems
> like a task for later, especially considering that cases of this sort
> are likely to be rare.
>
> Pushed with adjustments for that.

OK, thanks. The one thing we haven't done is testing the performance, to
see how this fares. So I've repeated the tests I've done on the original
version of the patch here

https://www.postgresql.org/message-id/8344835e-18af-9d40-aed7-bd2261be9162@2ndquadrant.com

Sadly I don't have access to the machine used for the previous tests (on
a vacation and the machine sits under my desk at home), so I had to use
my laptop. That means a fair amount of variability due to power saving
built into the CPU, and also VM variability (using Xen VM). So the
numbers are not directly comparable to the old results, and I believe
the differences between the patched and unpatched version seem to be
quite clear despite the variability.

It's true the results are not as bad as with the originally committed
patch, but there are multiple cases where the planning time gets up to
~2x compared to master.

See the old-results.ods, and also old-scripts.tgz (the old scripts used
the GUC we removed, so I had to tweak it a bit, and you'll have to whack
it a bit to get it working).

Sure, those cases use many foreign keys (generally >=100), but the
conclusion with the old patch was that it matters and we spent a lot of
time to get it within 10% of master. There are also two or three cases
where the planning got quite a bit faster, for some reason.

I've also constructed another script, joining just 2 tables and doing
some funky things (e.g. adding a lot of overlapping foreign keys), and
in these cases the slowdown is even more significant - up to ~13x, and
the stddev also increased. See new-results.ods and new-scripts.tgz.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: New design for FK-based join selectivity estimation

From
Tomas Vondra
Date:
Hi

On 06/18/2016 06:52 PM, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> A few more comments, about re-reading the patch more thoroughly. I
>> wouldn't say any of those qualify as bugs, but rather as discussion
>> about some of the design choices:
>
>> 1) NULL handling
>
>> I'd argue that we should do something about this, although I agree it's
>> non-trivial to estimate - at least until we get some sort of correlation
>> stats (e.g. my patch already provides most of the pieces, I believe).
>
> I concur, actually, but I feel that it's out of scope for this
> particular patch, which is only trying to replace the functionality
> that was committed previously. If you want to come up with a patch on
> top of this that adds some accounting for NULLs, I'd be willing to
> consider it as a post-beta2 improvement.

Sure, fair enough. By post-beta2 you mean for 9.7, or still for 9.6?

>
>> But I'd argue that in the case of multi-column foreign keys we can do
>> better even without it - my experience is that in such cases either all
>> values are NULL or none of them, and a single NULL value breaks the FK
>> of course. So I think max(null_frac) would work.
>
> Yeah, I was thinking along the same lines: max of the per-column null
> fractions is probably an OK estimate.

OK

>
>> 3) ForeignKeyOptInfo->rinfos as a List
>> Can we actually get a list of matching RestrictInfos for a single
>> foreign key? I've been unable to construct such query.
>
> I think you'd actually have to write redundant outer join quals,
> along the lines of
>       select ... a left join b on (a.x = b.y and a.x = b.y)
> I don't believe we take the trouble to eliminate such duplicates
> unless they get absorbed by an EC, which outer-join quals would
> not be.  (Haven't tried this, though, as I don't have the patch
> installed right now.)

OK. Let's look into this post-beta2 then.

>
> The beta2 deadline is just about upon us; I feel that if we're going
> to get this into this release at all, we need to push it today so
> that we get a full buildfarm cycle on it before the wrap.
>
> I plan to spend an hour or two adjusting the qual match logic as
> discussed above, and re-reading the whole patch another time for
> sanity.  If I've not heard objections by the time I'm done,
> I will push it.
>

Thanks!

If I could wish one more thing - could you briefly explain why you 
rewrote the patch the way you did? I'd like to learn from this and while 
I think I kinda understand most of the changes, I'm not really sure I 
got it right.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: New design for FK-based join selectivity estimation

From
Tom Lane
Date:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> OK, thanks. The one thing we haven't done is testing the performance, to
> see how this fares. So I've repeated the tests I've done on the original
> version of the patch here

Hmm.  I'm not that excited about these results, for a couple of reasons:

* AFAICS, all the numbers are collected from the first execution of a
query within a session, meaning caches aren't populated and everything
has to be loaded from disk (or at least shared buffers).

* I do not credit hundreds of completely redundant FKs between the same
two tables as being representative of plausible real-world cases.

I modified your new script as attached to get rid of the first problem.
Comparing HEAD with HEAD minus commit 100340e2d, in non-assert builds,
I get results like this for the 100-foreign-key case (with repeat
count 1000 for the data collection script):

select code, test, avg(time),stddev(time) from data group by 1,2 order by 1,2;
  code  | test  |        avg         |       stddev
--------+-------+--------------------+---------------------
 head   | t1/t2 |  0.065045045045045 | 0.00312962651081508
 head   | t3/t4 |  0.168561561561562 | 0.00379087132124092
 head   | t5/t6 |  0.127671671671672 | 0.00326275949269809
 head   | t7/t8 |  0.391057057057056 | 0.00590249325300915
 revert | t1/t2 | 0.0613933933933937 |  0.0032082678131875
 revert | t3/t4 | 0.0737507507507501 | 0.00221692725859567
 revert | t5/t6 |  0.123759759759759 | 0.00431225386651805
 revert | t7/t8 |  0.154082082082081 | 0.00405118420422266
(8 rows)

So for the somewhat-credible cases, ie 100 unrelated foreign keys,
I get about 3% - 6% slowdown.  The 100-duplicate-foreign-keys case
does indeed look like about a 2X slowdown, but as I said, I do not
think that has anything to do with interesting usage.

In any case, the situation I was worried about making better was
queries joining many tables, which none of this exercises at all.

            regards, tom lane

DB=$1
COUNT=$2

for i in `seq 1 $COUNT`; do
    echo "EXPLAIN ANALYZE SELECT * FROM t1 JOIN t2 USING (a);"
done | psql $DB | grep 'Planning' | tail -n +2 | awk '{print "t1/t2\t"$3}'

for i in `seq 1 $COUNT`; do
    echo "EXPLAIN ANALYZE SELECT * FROM t3 JOIN t4 USING (a);"
done | psql $DB | grep 'Planning' | tail -n +2 | awk '{print "t3/t4\t"$3}'

for i in `seq 1 $COUNT`; do
    echo "EXPLAIN ANALYZE SELECT * FROM t5 JOIN t6 USING (a,b,c,d);"
done | psql $DB | grep 'Planning' | tail -n +2 | awk '{print "t5/t6\t"$3}'

for i in `seq 1 $COUNT`; do
    echo "EXPLAIN ANALYZE SELECT * FROM t7 JOIN t8 USING (a,b,c,d);"
done | psql $DB | grep 'Planning' | tail -n +2 | awk '{print "t7/t8\t"$3}'

Re: New design for FK-based join selectivity estimation

From
Tom Lane
Date:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> On 06/18/2016 06:52 PM, Tom Lane wrote:
>> I concur, actually, but I feel that it's out of scope for this
>> particular patch, which is only trying to replace the functionality
>> that was committed previously. If you want to come up with a patch on
>> top of this that adds some accounting for NULLs, I'd be willing to
>> consider it as a post-beta2 improvement.

> Sure, fair enough. By post-beta2 you mean for 9.7, or still for 9.6?

I think it'd be legitimate to address the NULLs question for 9.6, as long
as the patch is not very large.  If it does turn out to be invasive or
otherwise hard to review, waiting for 9.7 might be more prudent.  But
I argued upthread that failing to consider nulls was a bug in the original
patch, so dealing with them could be considered a bug fix.

> If I could wish one more thing - could you briefly explain why you 
> rewrote the patch the way you did? I'd like to learn from this and while 
> I think I kinda understand most of the changes, I'm not really sure I 
> got it right.

I don't at the moment recall everything I changed, but I'm happy to
answer questions that are more specific than that one ...
        regards, tom lane



Re: New design for FK-based join selectivity estimation

From
Andrew Gierth
Date:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:>> Attached is a reworked patch, mostly following the new design
proposal>> from this thread.
 
Tom> Comments and testing appreciated.

This blows up (see bug 14219 for testcase) in
match_foreign_keys_to_quals on the find_base_rel call(s) in the
following case:

If the query was produced by rule expansion then the code that populates
fkinfo includes FK references to the OLD and NEW RTEs, but those might not
appear in the jointree (the testcase for the bug is a DELETE rule where
NEW clearly doesn't apply) and hence build_simple_rel was not called
(causing find_base_rel to fail). Not sure what the right fix is.

-- 
Andrew (irc:RhodiumToad)



Re: New design for FK-based join selectivity estimation

From
Tom Lane
Date:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> If the query was produced by rule expansion then the code that populates
> fkinfo includes FK references to the OLD and NEW RTEs, but those might not
> appear in the jointree (the testcase for the bug is a DELETE rule where
> NEW clearly doesn't apply) and hence build_simple_rel was not called
> (causing find_base_rel to fail). Not sure what the right fix is.

Meh.  I had a vaguely uneasy feeling that just scanning the rtable was
too simplistic, but hadn't thought hard about it.

For a really correct fix we could search the jointree to see which rels
are in it, but that would add code and cycles.  A slightly cheating way to
do it is to not use find_base_rel() but look into the simple_rel_array for
ourselves, and do nothing if there's no rel corresponding to the RTE
index.
        regards, tom lane



Re: [HACKERS] New design for FK-based join selectivity estimation

From
Adrien Nayrat
Date:
Hi hackers,

The commit 100340e2dcd05d6505082a8fe343fb2ef2fa5b2a introduce an
estimation error :

create table t3 as select j from generate_series(1,10000)
i,generate_series(1,100) j ;
create table t4 as select j from generate_series(1,100) j ;
create unique index ON t4(j);
alter table t3 add constraint fk foreign key (j) references t4(j);
analyze;

9.5.5
explain analyze select * from t3 where j in (select * from t4 where j<10);
     QUERY PLAN 


--------------------------------------------------------------------------------------------------------------------Hash
SemiJoin  (cost=2.36..18053.61 rows=90000 width=4) (actual 
time=0.217..282.325 rows=90000 loops=1)  Hash Cond: (t3.j = t4.j)  ->  Seq Scan on t3  (cost=0.00..14425.00
rows=1000000width=4) 
(actual time=0.112..116.063 rows=1000000 loops=1)  ->  Hash  (cost=2.25..2.25 rows=9 width=4) (actual time=0.083..0.083
rows=9 loops=1)        Buckets: 1024  Batches: 1  Memory Usage: 9kB        ->  Seq Scan on t4  (cost=0.00..2.25 rows=9
width=4)(actual 
time=0.019..0.074 rows=9 loops=1)              Filter: (j < 10)              Rows Removed by Filter: 91Planning time:
0.674msExecution time: 286.043 ms 

On 9.6 HEAD

explain analyze select * from t3 where j in (select * from t4 where j<10);
    QUERY PLAN 

-------------------------------------------------------------------------------------------------------------------Hash
SemiJoin  (cost=2.36..18053.61 rows=1000000 width=4) (actual 
time=0.089..232.327 rows=90000 loops=1)  Hash Cond: (t3.j = t4.j)  ->  Seq Scan on t3  (cost=0.00..14425.00
rows=1000000width=4) 
(actual time=0.047..97.926 rows=1000000 loops=1)  ->  Hash  (cost=2.25..2.25 rows=9 width=4) (actual time=0.032..0.032
rows=9 loops=1)        Buckets: 1024  Batches: 1  Memory Usage: 9kB        ->  Seq Scan on t4  (cost=0.00..2.25 rows=9
width=4)(actual 
time=0.008..0.030 rows=9 loops=1)              Filter: (j < 10)              Rows Removed by Filter: 91Planning time:
0.247msExecution time: 235.295 ms 
(10 rows)


Estimated row is 10x larger since 100340e2d

Regards,

--
Adrien NAYRAT

http://dalibo.com - http://dalibo.org


Re: [HACKERS] New design for FK-based join selectivity estimation

From
ronan.dunklau@dalibo.com
Date:
On mardi 13 décembre 2016 09:10:47 CET Adrien Nayrat wrote:
> Hi hackers,
>
> The commit 100340e2dcd05d6505082a8fe343fb2ef2fa5b2a introduce an
> estimation error :
[....]
>
> Estimated row is 10x larger since 100340e2d
>
> Regards,

Hello,

I think I understand what the problem is. In get_foreign_key_join_selectiviy,
we remove the restrict info clauses which match a foreign key. This is done so
that the selectivy is not applied twice (once in the function itself, once
when processing the restrictinfos).

The problem is, for semi and anti joins, we assume that we have nohing to do
(costsize.c:4253):
    else if (jointype == JOIN_SEMI || jointype == JOIN_ANTI)    {        /*         * For JOIN_SEMI and JOIN_ANTI, the
selectivityis defined as the         * fraction of LHS rows that have matches.  If the referenced         * table is on
theinner side, that means the selectivity is 1.0         * (modulo nulls, which we're ignoring for now).  We already
    * covered the other case, so no work here.         */    } 

This results in assuming that the whole outerrel will match, no matter the
selectivity of the innerrel.

If I understand it correctly and the above is right, I think we should ignore
SEMI or ANTI joins altogether when considering FKs, and keep the corresponding
restrictinfos for later processing since they are already special-cased later
on.

Regards,

--
Ronan Dunklau






Re: [HACKERS] New design for FK-based join selectivity estimation

From
Tom Lane
Date:
ronan.dunklau@dalibo.com writes:
> On mardi 13 décembre 2016 09:10:47 CET Adrien Nayrat wrote:
>> The commit 100340e2dcd05d6505082a8fe343fb2ef2fa5b2a introduce an
>> estimation error :

> The problem is, for semi and anti joins, we assume that we have nohing to do
> (costsize.c:4253):

>         else if (jointype == JOIN_SEMI || jointype == JOIN_ANTI)
>         {
>             /*
>              * For JOIN_SEMI and JOIN_ANTI, the selectivity is defined as the
>              * fraction of LHS rows that have matches.  If the referenced
>              * table is on the inner side, that means the selectivity is 1.0
>              * (modulo nulls, which we're ignoring for now).  We already
>              * covered the other case, so no work here.
>              */
>         }

> This results in assuming that the whole outerrel will match, no matter the
> selectivity of the innerrel.

Yeah.  In the terms of this example, the FK means that every outer row
would have a match, if the query were
select * from t3 where j in (select * from t4);

But actually it's
select * from t3 where j in (select * from t4 where j<10);

so of course we should not expect a match for every row.

> If I understand it correctly and the above is right, I think we should ignore
> SEMI or ANTI joins altogether when considering FKs, and keep the corresponding
> restrictinfos for later processing since they are already special-cased later
> on.

That seems like an overreaction.  While the old code happens to get this
example exactly right, eqjoinsel_semi is still full of assumptions and
approximations, and it doesn't do very well at all if it lacks MCV lists
for both sides.

I'm inclined to think that what we want to have happen in this case is
to estimate the fraction of outer rows having a match as equal to the
selectivity of the inner query's WHERE clauses, ie the semijoin
selectivity should be sizeof(inner result) divided by sizeof(inner
relation).
        regards, tom lane



Re: [HACKERS] New design for FK-based join selectivity estimation

From
Tom Lane
Date:
I wrote:
> ronan.dunklau@dalibo.com writes:
>> If I understand it correctly and the above is right, I think we should ignore
>> SEMI or ANTI joins altogether when considering FKs, and keep the corresponding
>> restrictinfos for later processing since they are already special-cased later
>> on.

> That seems like an overreaction.  While the old code happens to get this
> example exactly right, eqjoinsel_semi is still full of assumptions and
> approximations, and it doesn't do very well at all if it lacks MCV lists
> for both sides.

> I'm inclined to think that what we want to have happen in this case is
> to estimate the fraction of outer rows having a match as equal to the
> selectivity of the inner query's WHERE clauses, ie the semijoin
> selectivity should be sizeof(inner result) divided by sizeof(inner
> relation).

After further study, I concluded that we can only easily estimate that
when the inner side of the SEMI or ANTI join is just the single referenced
table.  If the inner side is itself a join, it's not easy to determine
what fraction of the referenced table will survive the join clauses.

However, we can still be brighter than to just throw all the FK qual
clauses back into the pool: that would result in multiplying their
selectivity estimates together, which for a multi-column FK results in
exactly the drastic underestimation that 100340e2d intended to avoid.
What seems to make sense here is to take the minimum of the per-clause
selectivities, as we are doing for other outer-join cases.

Hence, I propose the attached patch.  This rearranges the existing code
slightly to avoid duplicating it.

            regards, tom lane

diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index e42895d..415edad 100644
*** a/src/backend/optimizer/path/costsize.c
--- b/src/backend/optimizer/path/costsize.c
*************** get_foreign_key_join_selectivity(Planner
*** 4085,4090 ****
--- 4085,4091 ----
      {
          ForeignKeyOptInfo *fkinfo = (ForeignKeyOptInfo *) lfirst(lc);
          bool        ref_is_outer;
+         bool        use_smallest_selectivity = false;
          List       *removedlist;
          ListCell   *cell;
          ListCell   *prev;
*************** get_foreign_key_join_selectivity(Planner
*** 4205,4213 ****
           * be double-counting the null fraction, and (2) it's not very clear
           * how to combine null fractions for multiple referencing columns.
           *
!          * In the first branch of the logic below, null derating is done
!          * implicitly by relying on clause_selectivity(); in the other two
!          * paths, we do nothing for now about correcting for nulls.
           *
           * XXX another point here is that if either side of an FK constraint
           * is an inheritance parent, we estimate as though the constraint
--- 4206,4214 ----
           * be double-counting the null fraction, and (2) it's not very clear
           * how to combine null fractions for multiple referencing columns.
           *
!          * In the use_smallest_selectivity code below, null derating is done
!          * implicitly by relying on clause_selectivity(); in the other cases,
!          * we do nothing for now about correcting for nulls.
           *
           * XXX another point here is that if either side of an FK constraint
           * is an inheritance parent, we estimate as though the constraint
*************** get_foreign_key_join_selectivity(Planner
*** 4230,4257 ****
               * the smallest per-column selectivity, instead.  (This should
               * correspond to the FK column with the most nulls.)
               */
!             Selectivity thisfksel = 1.0;
!
!             foreach(cell, removedlist)
!             {
!                 RestrictInfo *rinfo = (RestrictInfo *) lfirst(cell);
!                 Selectivity csel;
!
!                 csel = clause_selectivity(root, (Node *) rinfo,
!                                           0, jointype, sjinfo);
!                 thisfksel = Min(thisfksel, csel);
!             }
!             fkselec *= thisfksel;
          }
          else if (jointype == JOIN_SEMI || jointype == JOIN_ANTI)
          {
              /*
               * For JOIN_SEMI and JOIN_ANTI, the selectivity is defined as the
!              * fraction of LHS rows that have matches.  If the referenced
!              * table is on the inner side, that means the selectivity is 1.0
!              * (modulo nulls, which we're ignoring for now).  We already
!              * covered the other case, so no work here.
               */
          }
          else
          {
--- 4231,4271 ----
               * the smallest per-column selectivity, instead.  (This should
               * correspond to the FK column with the most nulls.)
               */
!             use_smallest_selectivity = true;
          }
          else if (jointype == JOIN_SEMI || jointype == JOIN_ANTI)
          {
              /*
               * For JOIN_SEMI and JOIN_ANTI, the selectivity is defined as the
!              * fraction of LHS rows that have matches.  The referenced table
!              * is on the inner side (we already handled the other case above),
!              * so the FK implies that every LHS row has a match *in the
!              * referenced table*.  But any restriction or join clauses below
!              * here will reduce the number of matches.
               */
+             if (bms_membership(inner_relids) == BMS_SINGLETON)
+             {
+                 /*
+                  * When the inner side of the semi/anti join is just the
+                  * referenced table, we may take the FK selectivity as equal
+                  * to the selectivity of the table's restriction clauses.
+                  */
+                 RelOptInfo *ref_rel = find_base_rel(root, fkinfo->ref_relid);
+                 double        ref_tuples = Max(ref_rel->tuples, 1.0);
+
+                 fkselec *= ref_rel->rows / ref_tuples;
+             }
+             else
+             {
+                 /*
+                  * When the inner side of the semi/anti join is itself a join,
+                  * it's hard to guess what fraction of the referenced table
+                  * will get through the join.  But we still don't want to
+                  * multiply per-column estimates together.  Take the smallest
+                  * per-column selectivity, instead.
+                  */
+                 use_smallest_selectivity = true;
+             }
          }
          else
          {
*************** get_foreign_key_join_selectivity(Planner
*** 4265,4270 ****
--- 4279,4304 ----

              fkselec *= 1.0 / ref_tuples;
          }
+
+         /*
+          * Common code for cases where we should use the smallest selectivity
+          * that would be computed for any one of the FK's clauses.
+          */
+         if (use_smallest_selectivity)
+         {
+             Selectivity thisfksel = 1.0;
+
+             foreach(cell, removedlist)
+             {
+                 RestrictInfo *rinfo = (RestrictInfo *) lfirst(cell);
+                 Selectivity csel;
+
+                 csel = clause_selectivity(root, (Node *) rinfo,
+                                           0, jointype, sjinfo);
+                 thisfksel = Min(thisfksel, csel);
+             }
+             fkselec *= thisfksel;
+         }
      }

      *restrictlist = worklist;

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers