Re: improper estimates even with high statistic values - Mailing list pgsql-bugs
From | Bruce Momjian |
---|---|
Subject | Re: improper estimates even with high statistic values |
Date | |
Msg-id | 200601262242.k0QMgRB24383@candle.pha.pa.us Whole thread Raw |
In response to | Re: improper estimates even with high statistic values (Robert Treat <xzilla@users.sourceforge.net>) |
Responses |
Re: improper estimates even with high statistic values
|
List | pgsql-bugs |
Has this been researched. Josh posted he thought it was an optimizer bug, but I haven't seen anyone investigate it: http://archives.postgresql.org/pgsql-performance/2006-01/msg00248.php http://archives.postgresql.org/pgsql-performance/2006-01/msg00265.php --------------------------------------------------------------------------- Robert Treat wrote: > Magnus's case seemed like a beast of a different animal to me, given it > was a direct index scan using a wildcard based search on a primary key > column; I'd agree I don't know exactly how it would determine a value > different that 1. > > But in my example, this misestimation comes between columns that are not > primary keys, contain duplicate values (so they are aren't unique), and > involves left joining subqueries. It doesn't seem to follow that it > would always reduce to 1 row quite so easily. > > > Robert Treat > > On Wed, 2006-01-18 at 16:11, Bruce Momjian wrote: > > > > Magnus reported a similar problem with path names. I looked at his > > statistics and found that even at 100 buckets, his LIKE 'f:/.../%" query > > would never span more than one bucket, and because all the path names > > were unique, there were no most common values. > > > > In the case where the LIKE hits only one bucket, and there are no most > > common values, how is the optimzier supposed to estimate the number of > > rows, especially for cases where the values in the buckets are unevenly > > distributed. > > > > --------------------------------------------------------------------------- > > > > Robert Treat wrote: > > > After some extensive discussion on irc, berkus, myself and a few others > > > think we have uncovered a possible bug, or at the least some odd > > > behavior in > 8.1.1. It centers around my recent post to performance > > > http://archives.postgresql.org/pgsql-performance/2006-01/msg00248.php > > > and how I could not seem to get some of the estimates to become > > > reasonable even after bumping up my stats target to 400 which caused > > > every row to be analyzed. If you look at the left join and hash join > > > estimates of the third query you'll note they seem to always get > > > estimated to 1 for no reason that we could come up with. > > > > > > Someone else on irc seemed to have a similar problem to this, so we are > > > wondering if there is some problem here. So the question really is if > > > someone can deduce the behavior from looking at what was provided in the > > > email? If not and you have questions let me know, otherwise I can send a > > > chopped up test database which can reproduce the query issues off list > > > should someone want to walk through the pg code to investigate. TIA > > > > > > > > > Robert Treat > > > -- > > > Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL > > > > > > > > > ---------------------------(end of broadcast)--------------------------- > > > TIP 2: Don't 'kill -9' the postmaster > > > > > > > -- > > Bruce Momjian | http://candle.pha.pa.us > > pgman@candle.pha.pa.us | (610) 359-1001 > > + If your life is a hard drive, | 13 Roberts Road > > + Christ can be your backup. | Newtown Square, Pennsylvania 19073 > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 3: Have you checked our extensive FAQ? > > > > http://www.postgresql.org/docs/faq > -- > Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
pgsql-bugs by date: