Re: New design for FK-based join selectivity estimation - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: New design for FK-based join selectivity estimation
Msg-id CANP8+jLsiASHZ=5at_uw7wn7Ujkd6QVJeSf=w1qaoSOSKC3V4g@mail.gmail.com
In response to New design for FK-based join selectivity estimation  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: New design for FK-based join selectivity estimation  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 4 June 2016 at 20:44, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> This is a branch of the discussion in
> https://www.postgresql.org/message-id/flat/20160429102531.GA13701%40huehner.biz
> but I'm starting a new thread as the original title is getting
> increasingly off-topic.
>
> I complained in that thread that the FK join selectivity patch had a
> very brute-force approach to matching join qual clauses to FK
> constraints, requiring a total of seven nested levels of looping to
> get anything done, and expensively rediscovering the same facts over
> and over.  Here is a sketch of what I think is a better way:

Thanks for your review and design notes here, which look like good improvements. 

Tomas has been discussing this with me and others, but I just realised that might not be apparent on list. So, just to mention: there is activity on this, and new code will be published very soon.


On the above-mentioned thread, Tomas' analysis was this:
> There are probably a few reasonably simple things we could do - e.g. ignore foreign keys
> with just a single column, as the primary goal of the patch is improving estimates with
> multi-column foreign keys. I believe that covers a vast majority of foreign keys in the wild.

I agree with that comment. The relcache code retrieves all FKs, even ones that have a single column, yet the planner code never uses them unless nKeys>1. That was masked somewhat by my two commits, which treated the info as generic and then used only a very specific subset of it.

So a simple change is to make RelationGetFKeyList() retrieve only FKs with nKeys>1, and to rename it RelationGetMultiColumnFKeyList(). That greatly reduces the scope for increased planning time.
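
To illustrate the idea only, here is a minimal standalone sketch of the filter. It is not the relcache code: the FKeyInfo struct and filter_multicolumn_fkeys() are hypothetical stand-ins invented for this example, and the real change would presumably apply the same nkeys > 1 test while the list is being built rather than afterwards.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a cached foreign-key entry.
 * This is NOT the actual relcache structure.
 */
typedef struct FKeyInfo
{
    int         nkeys;  /* number of columns in the constraint */
    const char *name;   /* constraint name, for illustration only */
} FKeyInfo;

/*
 * Copy only the multi-column foreign keys (nkeys > 1) from "all" into
 * "out", preserving their order.  "out" must have room for "n" entries.
 * Returns the number of entries kept.
 */
size_t
filter_multicolumn_fkeys(const FKeyInfo *all, size_t n, FKeyInfo *out)
{
    size_t      kept = 0;

    assert(n == 0 || (all != NULL && out != NULL));

    for (size_t i = 0; i < n; i++)
    {
        /* single-column FKs are never used by the planner code, so skip them */
        if (all[i].nkeys > 1)
            out[kept++] = all[i];
    }
    return kept;
}
```

With that filter in place, the planner would only ever see the FKs it can actually use, so tables with many single-column FKs add nothing to the matching work.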

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
