Re: Speed depending of Join Order. - Mailing list pgsql-sql

From Tom Lane
Subject Re: Speed depending of Join Order.
Date
Msg-id 23778.1043267097@sss.pgh.pa.us
Whole thread Raw
In response to Speed depending of Join Order.  (Raúl Gutiérrez Sánchez <raul@laeff.esa.es>)
List pgsql-sql
Raúl Gutiérrez Sánchez <raul@laeff.esa.es> writes:
> Note that the only difference is the order of the join elements. Using
> version 7.2.2, which I have been using untill now, the time expended in
> both of them was the same, using the right indexes. However, using
> version 7.3.1 which I have instaled recently, the results of the explain
> are the following:

It turns out that the problem was inaccuracy in some recently-added code
that tries to account for the possibility that a mergejoin won't scan
all the way to the end.  Your sample data had only one possible value
for the mis_id and mod_mis_id columns; this boundary case exposed the
fact that the code was testing for "x < max" where it should be testing
"x <= max".  Coupled with a lack of sanity-checking, the bogus
calculation affected the estimated costs in an asymmetrical way.  This
is why the choice of a bad plan only occurred in one case out of two.

I've applied the attached patch for 7.3.2.  Thanks for the report!
        regards, tom lane


*** src/backend/optimizer/path/costsize.c.orig    Wed Sep  4 16:31:20 2002
--- src/backend/optimizer/path/costsize.c    Wed Jan 22 15:10:20 2003
***************
*** 645,652 ****         innerscansel = firstclause->left_mergescansel;     } 
!     outer_rows = outer_path->parent->rows * outerscansel;
!     inner_rows = inner_path->parent->rows * innerscansel;      /* cost of source data */ 
--- 645,666 ----         innerscansel = firstclause->left_mergescansel;     } 
!     /* convert selectivity to row count; must scan at least one row */
! 
!     outer_rows = ceil(outer_path->parent->rows * outerscansel);
!     if (outer_rows < 1)
!         outer_rows = 1;
!     inner_rows = ceil(inner_path->parent->rows * innerscansel);
!     if (inner_rows < 1)
!         inner_rows = 1;
! 
!     /*
!      * Readjust scan selectivities to account for above rounding.  This is
!      * normally an insignificant effect, but when there are only a few rows
!      * in the inputs, failing to do this makes for a large percentage error.
!      */
!     outerscansel = outer_rows / outer_path->parent->rows;
!     innerscansel = inner_rows / inner_path->parent->rows;      /* cost of source data */ 
*** src/backend/utils/adt/selfuncs.c.orig    Fri Oct 18 22:56:16 2002
--- src/backend/utils/adt/selfuncs.c    Wed Jan 22 15:12:05 2003
***************
*** 1740,1746 ****                 rsortop,                 ltop,                 gtop,
!                 revltop;     Datum        leftmax,                 rightmax;     double        selec;
--- 1740,1748 ----                 rsortop,                 ltop,                 gtop,
!                 leop,
!                 revgtop,
!                 revleop;     Datum        leftmax,                 rightmax;     double        selec;
***************
*** 1779,1813 ****     /* Look up the "left < right" and "left > right" operators */     op_mergejoin_crossops(opno,
<op,>op, NULL, NULL); 
 
!     /* Look up the "right < left" operator */
!     revltop = get_commutator(gtop);
!     if (!OidIsValid(revltop))
!         return;                    /* shouldn't happen */      /*      * Now, the fraction of the left variable that
willbe scanned is the      * fraction that's <= the right-side maximum value.  But only believe      * non-default
estimates,else stick with our 1.0.      */
 
!     selec = scalarineqsel(root, ltop, false, left,                           rightmax, right->vartype);     if (selec
!=DEFAULT_INEQ_SEL)         *leftscan = selec;      /* And similarly for the right variable. */
 
!     selec = scalarineqsel(root, revltop, false, right,                           leftmax, left->vartype);     if
(selec!= DEFAULT_INEQ_SEL)         *rightscan = selec;      /*      * Only one of the two fractions can really be less
than1.0; believe
 
!      * the smaller estimate and reset the other one to exactly 1.0.      */     if (*leftscan > *rightscan)
*leftscan= 1.0;
 
!     else         *rightscan = 1.0; }  /*
--- 1781,1829 ----     /* Look up the "left < right" and "left > right" operators */     op_mergejoin_crossops(opno,
<op,>op, NULL, NULL); 
 
!     /* Look up the "left <= right" operator */
!     leop = get_negator(gtop);
!     if (!OidIsValid(leop))
!         return;                    /* insufficient info in catalogs */
! 
!     /* Look up the "right > left" operator */
!     revgtop = get_commutator(ltop);
!     if (!OidIsValid(revgtop))
!         return;                    /* insufficient info in catalogs */
! 
!     /* Look up the "right <= left" operator */
!     revleop = get_negator(revgtop);
!     if (!OidIsValid(revleop))
!         return;                    /* insufficient info in catalogs */      /*      * Now, the fraction of the left
variablethat will be scanned is the      * fraction that's <= the right-side maximum value.  But only believe      *
non-defaultestimates, else stick with our 1.0.      */
 
!     selec = scalarineqsel(root, leop, false, left,                           rightmax, right->vartype);     if (selec
!=DEFAULT_INEQ_SEL)         *leftscan = selec;      /* And similarly for the right variable. */
 
!     selec = scalarineqsel(root, revleop, false, right,                           leftmax, left->vartype);     if
(selec!= DEFAULT_INEQ_SEL)         *rightscan = selec;      /*      * Only one of the two fractions can really be less
than1.0; believe
 
!      * the smaller estimate and reset the other one to exactly 1.0.  If we
!      * get exactly equal estimates (as can easily happen with self-joins),
!      * believe neither.      */     if (*leftscan > *rightscan)         *leftscan = 1.0;
!     else if (*leftscan < *rightscan)         *rightscan = 1.0;
+     else
+         *leftscan = *rightscan = 1.0; }  /*


pgsql-sql by date:

Previous
From: Tomasz Myrta
Date:
Subject: Re: To use a VIEW or not to use a View.....
Next
From: Stephan Szabo
Date:
Subject: Re: To use a VIEW or not to use a View.....