On Wed, Sep 27, 2017 at 5:45 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Marko Tiikkaja <marko@joh.to> writes:
>> I wonder if the nested loop shouldn't have some kind of a cap on its own
>> estimate if it's wildly off of what you'd get by multiplying the child
>> nodes' estimates with each other?
> Nonstarter I'm afraid.  The join relation's size estimate is determined
> long before we get to a point where we could multiply the sizes of these
> particular child paths to arrive at the conclusion that it should be
> something different than what we decided originally.
Ah hah. Thanks for the explanation, that makes sense.
> Adjusting the size of the nestloop result at that point would merely give
> it an unfair advantage over other paths for the same join relation.
> (I think it would also break some assumptions about paths for the same
> relation all giving the same number of rows, unless parameterized.)
With the previous paragraph in mind, I would agree; it's not a very good idea.
> Looking at it another way, the main thing that the combination of hashagg
> outer path + indexscan inner path knows that eqjoinsel_semi didn't account
> for is that there's a unique index on foo.id.  But that info is available
> to eqjoinsel_semi, in the sense that it's been given a nondefault estimate
> that nd1 is equal to the outer relation size.  So the mistake that it's
> making is to throw up its hands and use an 0.5 selectivity estimate just
> because it has no info about the inner relation.  I think if we'd pushed
> through the nd2/nd1 calculation after setting nd2 = size of inner rel,
> we'd end up with an estimate matching the product of these path sizes.
> (Caution: inadequate caffeine absorbed yet, this might be all wrong.)
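
To make sure I'm following the arithmetic, here's a toy version of it.  This
is not the real eqjoinsel_semi() code, just the two estimates it can produce
as I understand them, and the row counts below are invented for illustration:

    /*
     * Toy model of the semi-join selectivity arithmetic described above.
     * Not the real selfuncs.c code; numbers are made up.
     */
    #include <stdio.h>

    /* fallback used when the inner side's ndistinct is unknown */
    #define DEFAULT_SEMI_SELECTIVITY 0.5

    /*
     * Fraction of outer rows expected to find at least one match on the
     * inner side of the semi-join.
     */
    static double
    semi_join_selectivity(double nd1, double nd2, int nd2_known)
    {
        if (!nd2_known)
            return DEFAULT_SEMI_SELECTIVITY;  /* what happens today */
        if (nd1 <= nd2)
            return 1.0;                       /* every outer value can match */
        return nd2 / nd1;                     /* only nd2 of nd1 values match */
    }

    int
    main(void)
    {
        double outer_rows = 1000000.0;  /* hypothetical size of foo; nd1 equals
                                         * this because of the unique index on
                                         * foo.id */
        double inner_rows = 100.0;      /* hypothetical inner rel size */

        /* current behaviour: no inner stats, fall back to 0.5 */
        printf("default estimate: %.0f rows\n",
               outer_rows * semi_join_selectivity(outer_rows, 0, 0));

        /* behaviour sketched above: clamp nd2 to the inner rel size first */
        printf("clamped estimate: %.0f rows\n",
               outer_rows * semi_join_selectivity(outer_rows, inner_rows, 1));

        return 0;
    }

With those made-up numbers the flat 0.5 fallback gives 500000 rows, while
pushing through the nd2/nd1 calculation with nd2 clamped to the inner rel
size gives 100, i.e. the same thing you'd get by multiplying the hashagg and
indexscan path sizes.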