Re: Fwd: 8.2 Query 10 times slower than 8.1 (view-heavy) - Mailing list pgsql-performance

From Tom Lane
Subject Re: Fwd: 8.2 Query 10 times slower than 8.1 (view-heavy)
Date
Msg-id 4484.1188603565@sss.pgh.pa.us
Whole thread Raw
In response to Re: Fwd: 8.2 Query 10 times slower than 8.1 (view-heavy)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Fwd: 8.2 Query 10 times slower than 8.1 (view-heavy)
List pgsql-performance
I wrote:
> Mark Lewis <mark.lewis@mir3.com> writes:
>> We've been holding back from upgrading to 8.2 because this one is a
>> show-stopper for us.

> Well, you could always make your own version with this patch reverted:
> http://archives.postgresql.org/pgsql-committers/2006-11/msg00066.php
> I might end up doing that in the 8.2 branch if a better solution
> seems too large to back-patch.

I thought of a suitably small hack that should cover at least the main
problem without going so far as to revert that patch entirely.  What we
can do is have the IS NULL estimator recognize when the clause is being
applied at an outer join, and not believe the table statistics in that
case.  I've applied the attached patch for this --- are you interested
in trying it out on your queries before 8.2.5 comes out?

            regards, tom lane

Index: src/backend/optimizer/path/clausesel.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/optimizer/path/clausesel.c,v
retrieving revision 1.82
diff -c -r1.82 clausesel.c
*** src/backend/optimizer/path/clausesel.c    4 Oct 2006 00:29:53 -0000    1.82
--- src/backend/optimizer/path/clausesel.c    31 Aug 2007 23:29:01 -0000
***************
*** 218,224 ****
                  s2 = rqlist->hibound + rqlist->lobound - 1.0;

                  /* Adjust for double-exclusion of NULLs */
!                 s2 += nulltestsel(root, IS_NULL, rqlist->var, varRelid);

                  /*
                   * A zero or slightly negative s2 should be converted into a
--- 218,226 ----
                  s2 = rqlist->hibound + rqlist->lobound - 1.0;

                  /* Adjust for double-exclusion of NULLs */
!                 /* HACK: disable nulltestsel's special outer-join logic */
!                 s2 += nulltestsel(root, IS_NULL, rqlist->var,
!                                   varRelid, JOIN_INNER);

                  /*
                   * A zero or slightly negative s2 should be converted into a
***************
*** 701,707 ****
          s1 = nulltestsel(root,
                           ((NullTest *) clause)->nulltesttype,
                           (Node *) ((NullTest *) clause)->arg,
!                          varRelid);
      }
      else if (IsA(clause, BooleanTest))
      {
--- 703,710 ----
          s1 = nulltestsel(root,
                           ((NullTest *) clause)->nulltesttype,
                           (Node *) ((NullTest *) clause)->arg,
!                          varRelid,
!                          jointype);
      }
      else if (IsA(clause, BooleanTest))
      {
Index: src/backend/utils/adt/selfuncs.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/adt/selfuncs.c,v
retrieving revision 1.214.2.5
diff -c -r1.214.2.5 selfuncs.c
*** src/backend/utils/adt/selfuncs.c    5 May 2007 17:05:55 -0000    1.214.2.5
--- src/backend/utils/adt/selfuncs.c    31 Aug 2007 23:29:02 -0000
***************
*** 1386,1396 ****
   */
  Selectivity
  nulltestsel(PlannerInfo *root, NullTestType nulltesttype,
!             Node *arg, int varRelid)
  {
      VariableStatData vardata;
      double        selec;

      examine_variable(root, arg, varRelid, &vardata);

      if (HeapTupleIsValid(vardata.statsTuple))
--- 1386,1409 ----
   */
  Selectivity
  nulltestsel(PlannerInfo *root, NullTestType nulltesttype,
!             Node *arg, int varRelid, JoinType jointype)
  {
      VariableStatData vardata;
      double        selec;

+     /*
+      * Special hack: an IS NULL test being applied at an outer join should not
+      * be taken at face value, since it's very likely being used to select the
+      * outer-side rows that don't have a match, and thus its selectivity has
+      * nothing whatever to do with the statistics of the original table
+      * column.  We do not have nearly enough context here to determine its
+      * true selectivity, so for the moment punt and guess at 0.5.  Eventually
+      * the planner should be made to provide enough info about the clause's
+      * context to let us do better.
+      */
+     if (IS_OUTER_JOIN(jointype) && nulltesttype == IS_NULL)
+         return (Selectivity) 0.5;
+
      examine_variable(root, arg, varRelid, &vardata);

      if (HeapTupleIsValid(vardata.statsTuple))
Index: src/include/utils/selfuncs.h
===================================================================
RCS file: /cvsroot/pgsql/src/include/utils/selfuncs.h,v
retrieving revision 1.36
diff -c -r1.36 selfuncs.h
*** src/include/utils/selfuncs.h    4 Oct 2006 00:30:11 -0000    1.36
--- src/include/utils/selfuncs.h    31 Aug 2007 23:29:02 -0000
***************
*** 149,155 ****
  extern Selectivity booltestsel(PlannerInfo *root, BoolTestType booltesttype,
              Node *arg, int varRelid, JoinType jointype);
  extern Selectivity nulltestsel(PlannerInfo *root, NullTestType nulltesttype,
!             Node *arg, int varRelid);
  extern Selectivity scalararraysel(PlannerInfo *root,
                 ScalarArrayOpExpr *clause,
                 bool is_join_clause,
--- 149,155 ----
  extern Selectivity booltestsel(PlannerInfo *root, BoolTestType booltesttype,
              Node *arg, int varRelid, JoinType jointype);
  extern Selectivity nulltestsel(PlannerInfo *root, NullTestType nulltesttype,
!             Node *arg, int varRelid, JoinType jointype);
  extern Selectivity scalararraysel(PlannerInfo *root,
                 ScalarArrayOpExpr *clause,
                 bool is_join_clause,

pgsql-performance by date:

Previous
From: Greg Smith
Date:
Subject: Re: 8.2 Autovacuum BUG ?
Next
From: Tom Lane
Date:
Subject: Re: 8.2 Autovacuum BUG ?