Thread: 8.2 Query 10 times slower than 8.1 (view-heavy)

8.2 Query 10 times slower than 8.1 (view-heavy)

From
"Evan Carroll"
Date:
Dearest dragon hunters and mortal wanna-bes,

I recently upgraded a system from Apache2/mod_perl2 to
Lighttpd/fastcgi. The upgrade went about as rough as can be. While in
the midst of a bad day, I decided to make it worse, and upgrade Pg 8.1
to 8.2. Most people I talk to seem to think 8.1 was a lemon release;
not I. It worked excellent for me for the longest time, and I had no
good reason to upgrade it, other than to just have done so. In the
process, A query that took a matter of 2minutes, started taking hours.
I broke that query up into something more atomic and used it as a
sample.

The following material is provided for your assisting-me-pleasure: the
original SQL; the \ds for all pertinent views and tables; the output
of Explain Analyze; and the original query.

The original query both trials was: SELECT * FROM test_view where U_ID = 8;

test_view.sql =   http://rafb.net/p/HhT9g489.html

8.1_explain_analyze =  http://rafb.net/p/uIyY1s44.html
8.2_explain_analzye =  http://rafb.net/p/mxHWi340.html

\d table/views = http://rafb.net/p/EPnyB229.html

Yes, I ran vacuum full after loading both dbs.

Thanks again, ask and I will provide anything else. I'm on freenode,
in #postgresql, and can be found at all times with the nick
EvanCarroll.

--
Evan Carroll
System Lord of the Internets
me@evancarroll.com
832-445-8877

Re: 8.2 Query 10 times slower than 8.1 (view-heavy)

From
"Scott Marlowe"
Date:
On 8/28/07, Evan Carroll <lists@evancarroll.com> wrote:

> the midst of a bad day, I decided to make it worse, and upgrade Pg 8.1
> to 8.2. Most people I talk to seem to think 8.1 was a lemon release;
> not I.

8.0 was the release that had more issues for me, as it was the first
version with all the backend work done to make it capable of running
windows.  for that reason I stayed on 7.4 until 8.1.4 or so was out.

8.2 was a nice incremental upgrade, and I migrated to it around 8.2.3
and have been happy every since.

> Yes, I ran vacuum full after loading both dbs.

Did you run analyze?  It's not built into vacuum.

Re: 8.2 Query 10 times slower than 8.1 (view-heavy)

From
Gregory Stark
Date:
"Evan Carroll" <lists@evancarroll.com> writes:

"Evan Carroll" <lists@evancarroll.com> writes:

> Dearest dragon hunters and mortal wanna-bes,
>
> I recently upgraded a system from Apache2/mod_perl2 to
> Lighttpd/fastcgi. The upgrade went about as rough as can be. While in
> the midst of a bad day, I decided to make it worse, and upgrade Pg 8.1
> to 8.2. Most people I talk to seem to think 8.1 was a lemon release;
> not I. It worked excellent for me for the longest time, and I had no
> good reason to upgrade it, other than to just have done so.

I assume you mean 8.1.9 and 8.2.4?

> The following material is provided for your assisting-me-pleasure: the
> original SQL; the \ds for all pertinent views and tables; the output
> of Explain Analyze; and the original query.

While I do in fact enjoy analyzing query plans I have to say that 75-line
plans push the bounds of my assisting-you-pleasure. Have you experimented with
simplifying this query?

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

Re: 8.2 Query 10 times slower than 8.1 (view-heavy)

From
"Joshua D. Drake"
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Gregory Stark wrote:
> "Evan Carroll" <lists@evancarroll.com> writes:
>
> "Evan Carroll" <lists@evancarroll.com> writes:
>
>> Dearest dragon hunters and mortal wanna-bes,
>>
>> I recently upgraded a system from Apache2/mod_perl2 to
>> Lighttpd/fastcgi. The upgrade went about as rough as can be. While in
>> the midst of a bad day, I decided to make it worse, and upgrade Pg 8.1
>> to 8.2. Most people I talk to seem to think 8.1 was a lemon release;
>> not I. It worked excellent for me for the longest time, and I had no
>> good reason to upgrade it, other than to just have done so.
>
> I assume you mean 8.1.9 and 8.2.4?
>
>> The following material is provided for your assisting-me-pleasure: the
>> original SQL; the \ds for all pertinent views and tables; the output
>> of Explain Analyze; and the original query.
>
> While I do in fact enjoy analyzing query plans I have to say that 75-line
> plans push the bounds of my assisting-you-pleasure. Have you experimented with
> simplifying this query?

Although simplifying the query is probably in order, doesn't it stand to
reason that there may be a problem here. 10x difference (in the worse)
from a lower version to a higher, is likely wrong :)

Joshua D. Drake

>


- --

      === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
            UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFG1EqYATb/zqfZUUQRAighAJ9g+Py+CRwsW7f5QWuA4uZ5G26a9gCcCXG2
0Le2KBGpdhDZyu4ZT30y8RA=
=MfQw
-----END PGP SIGNATURE-----

Re: 8.2 Query 10 times slower than 8.1 (view-heavy)

From
"Kevin Grittner"
Date:
>>> On Tue, Aug 28, 2007 at 10:22 AM, in message
<428b865e0708280822u522391ecr5366bb5a6647f1aa@mail.gmail.com>, "Evan Carroll"
<lists@evancarroll.com> wrote:
> Yes, I ran vacuum full after loading both dbs.

Have you run VACUUM ANALYZE or ANALYZE?

-Kevin




Fwd: 8.2 Query 10 times slower than 8.1 (view-heavy)

From
"Evan Carroll"
Date:
---------- Forwarded message ----------
From: Evan Carroll <me@evancarroll.com>
Date: Aug 28, 2007 11:23 AM
Subject: Re: [PERFORM] 8.2 Query 10 times slower than 8.1 (view-heavy)
To: Scott Marlowe <scott.marlowe@gmail.com>


On 8/28/07, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> I looked through your query plan, and this is what stood out in the 8.2 plan:
>
>          ->  Nested Loop Left Join  (cost=8830.30..10871.27 rows=1
> width=102) (actual time=2148.444..236018.971 rows=62 loops=1)
>                Join Filter: ((public.contact.pkid =
> public.contact.pkid) AND (public.event.ts_in > public.event.ts_in))
>                Filter: (public.event.pkid IS NULL)
>
> Notice the misestimation is by a factor of 62, and the actual time
> goes from 2149 to 236018 ms.
>
> Again, have you analyzed your tables  / databases?
>
contacts=# \o scott_marlowe_test
contacts=# VACUUM FULL ANALYZE;
contacts=# SELECT * FROM test_view WHERE U_ID = 8;
Cancel request sent
ERROR:  canceling statement due to user request
contacts=# EXPLAIN ANALYZE SELECT * FROM test_view WHERE U_ID = 8;

output found at http://rafb.net/p/EQouMI82.html

--
Evan Carroll
System Lord of the Internets
me@evancarroll.com
832-445-8877


--
Evan Carroll
System Lord of the Internets
me@evancarroll.com
832-445-8877

Re: 8.2 Query 10 times slower than 8.1 (view-heavy)

From
"Evan Carroll"
Date:
From: "Evan Carroll" <me@evancarroll.com>
To: "Kevin Grittner" <Kevin.Grittner@wicourts.gov>,
pgsql-performance@postgresql.org
Date: Tue, 28 Aug 2007 11:21:54 -0500
Subject: Re: [PERFORM] 8.2 Query 10 times slower than 8.1 (view-heavy)
On 8/28/07, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
> >>> On Tue, Aug 28, 2007 at 10:22 AM, in message
> <428b865e0708280822u522391ecr5366bb5a6647f1aa@mail.gmail.com>, "Evan Carroll"
> <lists@evancarroll.com> wrote:
> > Yes, I ran vacuum full after loading both dbs.
>
> Have you run VACUUM ANALYZE or ANALYZE?

VACUUM FULL ANALYZE on both tables, out of habit.
>
> -Kevin

--
Evan Carroll
System Lord of the Internets
me@evancarroll.com
832-445-8877

Re: Fwd: 8.2 Query 10 times slower than 8.1 (view-heavy)

From
Mark Lewis
Date:
It looks like your view is using a left join to look for rows in one
table without matching rows in the other, i.e. a SQL construct similar
in form to the query below:

SELECT ...
FROM A LEFT JOIN B ON (...)
WHERE B.primary_key IS NULL

Unfortunately there has been a planner regression in 8.2 in some cases
with these forms of queries.  This was discussed a few weeks (months?)
ago on this forum.  I haven't looked closely enough to confirm that this
is the problem in your case, but it seems likely.  Is it possible to
refactor the query to avoid using this construct to see if that helps?

We've been holding back from upgrading to 8.2 because this one is a
show-stopper for us.

-- Mark Lewis

On Tue, 2007-08-28 at 11:24 -0500, Evan Carroll wrote:
> ---------- Forwarded message ----------
> From: Evan Carroll <me@evancarroll.com>
> Date: Aug 28, 2007 11:23 AM
> Subject: Re: [PERFORM] 8.2 Query 10 times slower than 8.1 (view-heavy)
> To: Scott Marlowe <scott.marlowe@gmail.com>
>
>
> On 8/28/07, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> > I looked through your query plan, and this is what stood out in the 8.2 plan:
> >
> >          ->  Nested Loop Left Join  (cost=8830.30..10871.27 rows=1
> > width=102) (actual time=2148.444..236018.971 rows=62 loops=1)
> >                Join Filter: ((public.contact.pkid =
> > public.contact.pkid) AND (public.event.ts_in > public.event.ts_in))
> >                Filter: (public.event.pkid IS NULL)
> >
> > Notice the misestimation is by a factor of 62, and the actual time
> > goes from 2149 to 236018 ms.
> >
> > Again, have you analyzed your tables  / databases?
> >
> contacts=# \o scott_marlowe_test
> contacts=# VACUUM FULL ANALYZE;
> contacts=# SELECT * FROM test_view WHERE U_ID = 8;
> Cancel request sent
> ERROR:  canceling statement due to user request
> contacts=# EXPLAIN ANALYZE SELECT * FROM test_view WHERE U_ID = 8;
>
> output found at http://rafb.net/p/EQouMI82.html
>
> --
> Evan Carroll
> System Lord of the Internets
> me@evancarroll.com
> 832-445-8877
>
>

Re: Fwd: 8.2 Query 10 times slower than 8.1 (view-heavy)

From
Tom Lane
Date:
Mark Lewis <mark.lewis@mir3.com> writes:
> Unfortunately there has been a planner regression in 8.2 in some cases
> with these forms of queries.  This was discussed a few weeks (months?)
> ago on this forum.  I haven't looked closely enough to confirm that this
> is the problem in your case, but it seems likely.

Yeah, the EXPLAIN ANALYZE output clearly shows a drastic underestimate
of the number of rows out of a join like this, and a consequent choice
of a nestloop above it that performs terribly.

> We've been holding back from upgrading to 8.2 because this one is a
> show-stopper for us.

Well, you could always make your own version with this patch reverted:
http://archives.postgresql.org/pgsql-committers/2006-11/msg00066.php

I might end up doing that in the 8.2 branch if a better solution
seems too large to back-patch.

            regards, tom lane

Re: 8.2 Query 10 times slower than 8.1 (view-heavy)

From
"Evan Carroll"
Date:
On 8/28/07, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
> >>> On Tue, Aug 28, 2007 at 10:22 AM, in message
> <428b865e0708280822u522391ecr5366bb5a6647f1aa@mail.gmail.com>, "Evan Carroll"
> <lists@evancarroll.com> wrote:
> > Yes, I ran vacuum full after loading both dbs.
>
> Have you run VACUUM ANALYZE or ANALYZE?

VACUUM FULL ANALYZE on both tables, out of habit.
>
> -Kevin
>
>
>
>


--
Evan Carroll
System Lord of the Internets
me@evancarroll.com
832-445-8877

Re: Fwd: 8.2 Query 10 times slower than 8.1 (view-heavy)

From
Tom Lane
Date:
I wrote:
> Mark Lewis <mark.lewis@mir3.com> writes:
>> We've been holding back from upgrading to 8.2 because this one is a
>> show-stopper for us.

> Well, you could always make your own version with this patch reverted:
> http://archives.postgresql.org/pgsql-committers/2006-11/msg00066.php
> I might end up doing that in the 8.2 branch if a better solution
> seems too large to back-patch.

I thought of a suitably small hack that should cover at least the main
problem without going so far as to revert that patch entirely.  What we
can do is have the IS NULL estimator recognize when the clause is being
applied at an outer join, and not believe the table statistics in that
case.  I've applied the attached patch for this --- are you interested
in trying it out on your queries before 8.2.5 comes out?

            regards, tom lane

Index: src/backend/optimizer/path/clausesel.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/optimizer/path/clausesel.c,v
retrieving revision 1.82
diff -c -r1.82 clausesel.c
*** src/backend/optimizer/path/clausesel.c    4 Oct 2006 00:29:53 -0000    1.82
--- src/backend/optimizer/path/clausesel.c    31 Aug 2007 23:29:01 -0000
***************
*** 218,224 ****
                  s2 = rqlist->hibound + rqlist->lobound - 1.0;

                  /* Adjust for double-exclusion of NULLs */
!                 s2 += nulltestsel(root, IS_NULL, rqlist->var, varRelid);

                  /*
                   * A zero or slightly negative s2 should be converted into a
--- 218,226 ----
                  s2 = rqlist->hibound + rqlist->lobound - 1.0;

                  /* Adjust for double-exclusion of NULLs */
!                 /* HACK: disable nulltestsel's special outer-join logic */
!                 s2 += nulltestsel(root, IS_NULL, rqlist->var,
!                                   varRelid, JOIN_INNER);

                  /*
                   * A zero or slightly negative s2 should be converted into a
***************
*** 701,707 ****
          s1 = nulltestsel(root,
                           ((NullTest *) clause)->nulltesttype,
                           (Node *) ((NullTest *) clause)->arg,
!                          varRelid);
      }
      else if (IsA(clause, BooleanTest))
      {
--- 703,710 ----
          s1 = nulltestsel(root,
                           ((NullTest *) clause)->nulltesttype,
                           (Node *) ((NullTest *) clause)->arg,
!                          varRelid,
!                          jointype);
      }
      else if (IsA(clause, BooleanTest))
      {
Index: src/backend/utils/adt/selfuncs.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/adt/selfuncs.c,v
retrieving revision 1.214.2.5
diff -c -r1.214.2.5 selfuncs.c
*** src/backend/utils/adt/selfuncs.c    5 May 2007 17:05:55 -0000    1.214.2.5
--- src/backend/utils/adt/selfuncs.c    31 Aug 2007 23:29:02 -0000
***************
*** 1386,1396 ****
   */
  Selectivity
  nulltestsel(PlannerInfo *root, NullTestType nulltesttype,
!             Node *arg, int varRelid)
  {
      VariableStatData vardata;
      double        selec;

      examine_variable(root, arg, varRelid, &vardata);

      if (HeapTupleIsValid(vardata.statsTuple))
--- 1386,1409 ----
   */
  Selectivity
  nulltestsel(PlannerInfo *root, NullTestType nulltesttype,
!             Node *arg, int varRelid, JoinType jointype)
  {
      VariableStatData vardata;
      double        selec;

+     /*
+      * Special hack: an IS NULL test being applied at an outer join should not
+      * be taken at face value, since it's very likely being used to select the
+      * outer-side rows that don't have a match, and thus its selectivity has
+      * nothing whatever to do with the statistics of the original table
+      * column.  We do not have nearly enough context here to determine its
+      * true selectivity, so for the moment punt and guess at 0.5.  Eventually
+      * the planner should be made to provide enough info about the clause's
+      * context to let us do better.
+      */
+     if (IS_OUTER_JOIN(jointype) && nulltesttype == IS_NULL)
+         return (Selectivity) 0.5;
+
      examine_variable(root, arg, varRelid, &vardata);

      if (HeapTupleIsValid(vardata.statsTuple))
Index: src/include/utils/selfuncs.h
===================================================================
RCS file: /cvsroot/pgsql/src/include/utils/selfuncs.h,v
retrieving revision 1.36
diff -c -r1.36 selfuncs.h
*** src/include/utils/selfuncs.h    4 Oct 2006 00:30:11 -0000    1.36
--- src/include/utils/selfuncs.h    31 Aug 2007 23:29:02 -0000
***************
*** 149,155 ****
  extern Selectivity booltestsel(PlannerInfo *root, BoolTestType booltesttype,
              Node *arg, int varRelid, JoinType jointype);
  extern Selectivity nulltestsel(PlannerInfo *root, NullTestType nulltesttype,
!             Node *arg, int varRelid);
  extern Selectivity scalararraysel(PlannerInfo *root,
                 ScalarArrayOpExpr *clause,
                 bool is_join_clause,
--- 149,155 ----
  extern Selectivity booltestsel(PlannerInfo *root, BoolTestType booltesttype,
              Node *arg, int varRelid, JoinType jointype);
  extern Selectivity nulltestsel(PlannerInfo *root, NullTestType nulltesttype,
!             Node *arg, int varRelid, JoinType jointype);
  extern Selectivity scalararraysel(PlannerInfo *root,
                 ScalarArrayOpExpr *clause,
                 bool is_join_clause,

Re: Fwd: 8.2 Query 10 times slower than 8.1 (view-heavy)

From
Mark Lewis
Date:
On Fri, 2007-08-31 at 19:39 -0400, Tom Lane wrote:
> I wrote:
> > Mark Lewis <mark.lewis@mir3.com> writes:
> >> We've been holding back from upgrading to 8.2 because this one is a
> >> show-stopper for us.
>
> > Well, you could always make your own version with this patch reverted:
> > http://archives.postgresql.org/pgsql-committers/2006-11/msg00066.php
> > I might end up doing that in the 8.2 branch if a better solution
> > seems too large to back-patch.
>
> I thought of a suitably small hack that should cover at least the main
> problem without going so far as to revert that patch entirely.  What we
> can do is have the IS NULL estimator recognize when the clause is being
> applied at an outer join, and not believe the table statistics in that
> case.  I've applied the attached patch for this --- are you interested
> in trying it out on your queries before 8.2.5 comes out?

Wish I could, but I'm afraid that I'm not going to be in a position to
try out the patch on the application that exhibits the problem for at
least the next few weeks.

-- Mark