I wrote:
>> This seems to indicate some estimation problems in cost_hashjoin; the
>> estimated cost for the hashjoin is evidently a lot higher than it should
>> be.
The answer is that estimate_hash_bucketsize() is producing a rather
silly result in this situation, viz. a bucketsize "fraction" that's well
above 1.0. I've applied the following band-aid patch to CVS tip, which
perhaps you might like to use locally. But probably the long-range
answer is to rethink what that routine is doing --- its adjustment for
skewed data distributions is perhaps not such a great idea.
regards, tom lane
*** src/backend/optimizer/path/costsize.c.orig	Fri Dec 13 19:17:55 2002
--- src/backend/optimizer/path/costsize.c	Thu Dec 26 18:34:02 2002
***************
*** 1164,1169 ****
--- 1164,1179 ----
      if (avgfreq > 0.0 && mcvfreq > avgfreq)
          estfract *= mcvfreq / avgfreq;
  
+     /*
+      * Clamp bucketsize to sane range (the above adjustment could easily
+      * produce an out-of-range result).  We set the lower bound a little
+      * above zero, since zero isn't a very sane result.
+      */
+     if (estfract < 1.0e-6)
+         estfract = 1.0e-6;
+     else if (estfract > 1.0)
+         estfract = 1.0;
+ 
      ReleaseSysCache(tuple);
  
      return (Selectivity) estfract;
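
To see concretely how the skew adjustment overshoots, here is a minimal
standalone sketch (the numbers are invented for illustration; only the two
if-statements mirror the code above). On a badly skewed column the ratio
mcvfreq / avgfreq can run into the thousands, so even a small starting
fraction is pushed far past 1.0:

#include <stdio.h>

int
main(void)
{
    /* invented figures for a badly skewed join column */
    double estfract = 0.01;   /* starting fraction of inner tuples per bucket */
    double avgfreq  = 0.0001; /* average frequency of a distinct value */
    double mcvfreq  = 0.5;    /* most common value covers half the table */

    /* the skew adjustment: here 0.01 * (0.5 / 0.0001) = 50, not a fraction at all */
    if (avgfreq > 0.0 && mcvfreq > avgfreq)
        estfract *= mcvfreq / avgfreq;
    printf("before clamp: %g\n", estfract);

    /* the band-aid from the patch: force the result back into a sane range */
    if (estfract < 1.0e-6)
        estfract = 1.0e-6;
    else if (estfract > 1.0)
        estfract = 1.0;
    printf("after clamp:  %g\n", estfract);

    return 0;
}

Since cost_hashjoin scales the per-probe comparison cost by this bucketsize
fraction, an unclamped value like 50 makes every probe look vastly more
expensive than it can actually be, which would explain the inflated hash
join estimate described above.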