Re: Proposal: new GUC paramter - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Proposal: new GUC paramter
Date
Msg-id 200110112115.f9BLFxQ13994@candle.pha.pa.us
Whole thread Raw
In response to Re: Proposal: new GUC paramter  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
> Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> > This is really annoying since:
> > o these code fragments actually controls the optimization efforts for
> >   subqueries and views, not related to GEQO at all. So using GEQO
> >   parameters for this kind of purpose seems abuse for me.
>
> But GEQO_RELS is directly related to the maximum number of FROM-clause
> entries that we want to try to handle by exhaustive search.  So I think
> it's not completely unreasonable to use it for this additional purpose.
>
> Still, if you want to do the work to create another GUC parameter,
> I won't object.

This is a tough call.  The GEQO value is used here to indicate a table
list that is very long and needs GEQO processing, so there is some
relationship.  If we get to a point where the number of tables is too
large, we do have problems.

However, the GEQO setting is set to the point where we want GEQO to take
over from the standard optimizer.  If GEQO was to be improved, this
value would be decreased but the point at which you would want to stop
increasing the target list probably would be the same.

The GEQO/2 is clearly just a ballpark estimate.  I can see the value as
a separate config parameter, but I can also see it as something that may
be confusing to users and <1% of people will want to change it.  In
fact, interestingly, even if GEQO is off, GEQO_THRESHHOLD can be changed
by users wishing to pull more of their subqueries into their target
list.

I started thinking of some more complex comparison we could do, such as
determining if:

    2 * (factorial(rels_in_upper_query) + factorial(rels_in_subquery)) <
    factorial(rels_in_upper_query + factorial(rels_in_subquery)

but this doesn't seem to generate good decisions.

I have applied the following documentation patch to at least document
the current behavior.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Index: doc/src/sgml/runtime.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/runtime.sgml,v
retrieving revision 1.89
diff -c -r1.89 runtime.sgml
*** doc/src/sgml/runtime.sgml    2001/10/09 18:46:00    1.89
--- doc/src/sgml/runtime.sgml    2001/10/11 21:08:59
***************
*** 719,725 ****
          this many FROM items involved.  (Note that a JOIN construct
      counts as only one FROM item.) The default is 11. For simpler
      queries it is usually best to use the
!         deterministic, exhaustive planner.
         </para>
        </listitem>
       </varlistentry>
--- 719,727 ----
          this many FROM items involved.  (Note that a JOIN construct
      counts as only one FROM item.) The default is 11. For simpler
      queries it is usually best to use the
!         deterministic, exhaustive planner.  This parameter also controls
!         how hard the optimizer will try to merge subquery
!         <literal>FROM</literal> clauses into the upper query.
         </para>
        </listitem>
       </varlistentry>

pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: FAQ error
Next
From: Tom Lane
Date:
Subject: Re: Glitch in handling of postmaster -o options