Re: pg9.6 segfault using simple query (related to use fk for join estimates) - Mailing list pgsql-hackers

From Tom Lane
Subject Re: pg9.6 segfault using simple query (related to use fk for join estimates)
Date
Msg-id 31780.1462395724@sss.pgh.pa.us
In response to Re: pg9.6 segfault using simple query (related to use fk for join estimates)  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: pg9.6 segfault using simple query (related to use fk for join estimates)  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Re: pg9.6 segfault using simple query (related to use fk for join estimates)  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, May 4, 2016 at 2:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> My other design-level complaint is that basing this on foreign keys is
>> fundamentally the wrong thing.  What actually matters is the unique index
>> underlying the FK; that is, if we have "a.x = b.y" and there's a
>> compatible unique index on b.y, we can conclude that no A row will match
>> more than one B row, whether or not an explicit FK relationship has been
>> declared.  So we should drive this off unique indexes instead of FKs,
>> first because we will find more cases, and second because the planner
>> already examines indexes and doesn't need any additional catalog lookups
>> to get the required data.  (IOW, the relcache additions that were made in
>> this patch series should go away too.)

> Without prejudice to anything else in this useful and detailed review,
> I have a question about this.  A unique index proves that no A row
> will match more than one B row, and I agree that deriving that from
> unique indexes is sensible.  However, ISTM that an FK provides
> additional information: we know that, modulo filter conditions on B,
> every A row will match *exactly* one B row, which can prevent us
> from *underestimating* the size of the join product.  A unique index
> can't do that.

Very good point, but unless I'm missing something, that is not what the
current patch does.  I'm not sure offhand whether that's an important
estimation failure mode currently.  If it is, I'm not sure whether it
would be sensible to implement that rule entirely separately from the
"at most one" aspect.  And if that isn't sensible, I'm not sure whether
that's a sufficiently strong reason to confine the "at most one" logic to
working only with FKs and not with bare unique indexes.
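
To make the distinction between the two guarantees concrete, here is a
minimal sketch in Python.  It is not planner code and not what the patch
does: the toy column values and the helper join_row_count are
hypothetical, and the FK case assumes a.x is NOT NULL.  It only counts
the rows produced by "a.x = b.y" when b.y is unique, with and without
every a.x value being present in b.y.

```python
def join_row_count(a_x, b_y):
    """Count rows produced by the inner join a.x = b.y (NULLs never match)."""
    matches_per_value = {}
    for y in b_y:
        if y is not None:
            matches_per_value[y] = matches_per_value.get(y, 0) + 1
    return sum(matches_per_value.get(x, 0) for x in a_x if x is not None)


# b.y is unique in both scenarios (hypothetical toy data).
b_y = [1, 2, 3, 4, 5]

# Unique index only: a.x may contain values that do not exist in b.y,
# so the join can produce anywhere from 0 to len(a_x) rows.
a_x_unique_only = [1, 1, 2, 7, 8, 9]
rows_unique_only = join_row_count(a_x_unique_only, b_y)

# FK a.x -> b.y with a.x assumed NOT NULL: every a.x value exists in b.y,
# so with no filter on B the join produces exactly len(a_x) rows.
a_x_with_fk = [1, 1, 2, 3, 5, 5]
rows_with_fk = join_row_count(a_x_with_fk, b_y)

print(rows_unique_only, "<=", len(a_x_unique_only))
print(rows_with_fk, "==", len(a_x_with_fk))
```

The unique index alone gives only the upper bound (the "at most one"
rule), while the FK also supplies the lower bound that would guard
against underestimating the join size.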

On the whole, that seems like another argument why this needs more time.
        regards, tom lane


