On Wed, Mar 14, 2018 at 12:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@enterprisedb.com> writes:
>> There is a fundamental and complicated estimation problem lurking here
>> of course and I'm not sure what to think about that yet. Maybe there
>> is a very simple fix for this particular problem:
>
> Ah, I see you thought of the same hack I did.
>
> I think this may actually be a good fix, and here's the reason: this plan
> is in fact being driven entirely off planner default estimates, because
> we don't have any estimation code that knows what to do with
> "wholerowvar *= wholerowvar". I'm suspicious that we could drop the
> preceding ANALYZE as being a waste of cycles, except maybe it's finding
> out the number of rows for us. In any case, LIMIT 1 is only a good idea
> to the extent that the planner knows what it's doing, and this is an
> example where it demonstrably doesn't and won't any time soon.
Hmm. I wonder if the ANALYZE might have been needed to avoid the
nested loop plan at some point in history.
Here's a patch to remove LIMIT 1, which fixes the plan for Jeff's test
scenario and some smaller and larger examples I tried. The query is
already executed with SPI_execute(..., 1) so it'll give up after one
row anyway. The regression test includes a case that causes a row to
be produced here and that's passing ('ERROR: new data for
materialized view "mvtest_mv" contains duplicate rows without any null
columns').
--
Thomas Munro
http://www.enterprisedb.com