Re: Yet another abort-early plan disaster on 9.3 - Mailing list pgsql-performance

From Ryan Johnson
Subject Re: Yet another abort-early plan disaster on 9.3
Date
Msg-id 542D5A24.1080308@cs.utoronto.ca
In response to Re: Yet another abort-early plan disaster on 9.3  (Merlin Moncure <mmoncure@gmail.com>)
List pgsql-performance
On 29/09/2014 9:00 AM, Merlin Moncure wrote:
> On Fri, Sep 26, 2014 at 3:06 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> The problem, as I see it, is different. We assume that if there are
>> 100 distinct values and you use LIMIT 1 that you would only need to
>> scan 1% of rows. We assume that the data is arranged in the table in a
>> very homogenous layout. When data is not, and it seldom is, we get
>> problems.
> Hm, good point -- 'data proximity'.  At least in theory, can't this be
> measured and quantified?  For example, given a number of distinct
> values, you could estimate the % of pages read (or maybe non
> sequential seeks relative to the number of pages) you'd need to read
> all instances of a particular value in the average (or perhaps the
> worst) case.   One way of trying to calculate that would be to look at
> proximity of values in sampled pages (and maybe a penalty assigned for
> high update activity relative to table size).  Data proximity would
> then become a cost coefficient to the benefits of LIMIT.
Latecomer to the conversation here, but it seems like this issue (unlike
some) is really easy to recognize at runtime. The optimizer assumed the
scan would access O(1) pages; if the scan has not returned enough
results after k pages, that would be a really good indication that it's
time to rethink the plan, and probably before too much work has been
done higher in the plan (esp. if there's any kind of buffering between
operators, perhaps intentionally so in special cases like this).
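
To make the idea concrete, here is a rough sketch in Python (not PostgreSQL code, and not a claim about how the executor works). Everything in it is hypothetical and invented for illustration: the names `abort_early_scan`, `PageBudgetExceeded` and `run_with_fallback`, the model of a table as a list of pages, and the `page_budget` parameter standing in for k. Matches are buffered rather than streamed, so giving up on the plan does not expose partial results to operators higher up.

```python
from typing import Callable, Iterable, List, Tuple

Row = int
Page = List[Row]  # assumption: a "page" is just a list of row values


class PageBudgetExceeded(Exception):
    """The scan read more pages than the plan's estimate allowed."""

    def __init__(self, pages_read: int, rows_found: int) -> None:
        super().__init__(
            f"read {pages_read} pages, found only {rows_found} rows"
        )
        self.pages_read = pages_read
        self.rows_found = rows_found


def abort_early_scan(
    pages: Iterable[Page],
    predicate: Callable[[Row], bool],
    limit: int,
    page_budget: int,
) -> List[Row]:
    """Scan pages in order, stopping once `limit` matches are found.

    Raises PageBudgetExceeded if another page would be needed after
    `page_budget` pages have already been read.
    """
    matches: List[Row] = []  # buffered, not yet handed to the consumer
    for pages_read, page in enumerate(pages):
        if pages_read >= page_budget:
            raise PageBudgetExceeded(pages_read, len(matches))
        for row in page:
            if predicate(row):
                matches.append(row)
                if len(matches) >= limit:
                    return matches
    return matches  # table exhausted within budget


def run_with_fallback(
    pages: Iterable[Page],
    predicate: Callable[[Row], bool],
    limit: int,
    page_budget: int,
    fallback: Callable[[], List[Row]],
) -> Tuple[List[Row], str]:
    """Try the abort-early plan; switch to `fallback` if it overruns."""
    try:
        return abort_early_scan(pages, predicate, limit, page_budget), "abort-early"
    except PageBudgetExceeded:
        # Buffered matches are discarded; the fallback plan starts over.
        return fallback(), "fallback"


if __name__ == "__main__":
    # Skewed layout: the only matching rows sit on the last page.
    table = [[0] * 10 for _ in range(50)] + [[1] * 10]
    rows, plan = run_with_fallback(
        table, lambda r: r == 1, limit=1, page_budget=4,
        fallback=lambda: [1],  # stand-in for e.g. an index lookup
    )
    print(plan, rows)
```

The budget check happens before each page is read, so a scan that finishes exactly on its last allowed page is not penalized. How k should be chosen relative to the planner's estimate is left open here.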

Not sure pgsql has any dynamic reoptimization infrastructure in place,
tho. If not, these sorts of dangerous plans are best left alone IMO.

Ryan


