Thread: Poor Performance after Upgrade

Poor Performance after Upgrade

From
Ben Perrault
Date:
Hi,

I recently inherited a very old (PostgreSQL 7.0.3) database, and have
migrated it to 8.2.4 but have run into a performance issue.

Basically, I did a dump and import into the new database, vacuumed and
created fresh indexes and everything is work great except the following
type of query (and similar):

  SELECT tsr.stepId, tsr.testType, tsr.problemReportId, tsr.excpt, tcr.caseId
     FROM   TestCaseRun tcr, TestStepRun tsr
     WHERE  tcr.parentSN = 194813
     AND    (tsr.testType        <> ''
     OR      tsr.problemReportId <> ''
     OR      tsr.excpt           <> '')
     AND    tsr.parentSN =  tcr.recordSN

What used to take 250ms or so on the old database now takes between 55 and
60 Seconds.

On the old database, the query plan looks like this:

Unique  (cost=13074.30..13078.36 rows=32 width=68)
   ->  Sort  (cost=13074.30..13074.30 rows=324 width=68)
         ->  Nested Loop  (cost=0.00..13060.77 rows=324 width=68)
               ->  Index Scan using parentsn_tcr_indx on testcaserun tcr
(cost=0.00..444.83 rows=111 width=16)
               ->  Index Scan using parentsn_tsr_indx on teststeprun tsr
(cost=0.00..113.42 rows=27 width=52)

And on the new database it looks like this:

  Unique  (cost=206559152.10..206559157.14 rows=336 width=137)
    ->  Sort  (cost=206559152.10..206559152.94 rows=336 width=137)
          Sort Key: tsr.stepid, tsr.testtype, tsr.problemreportid,
tsr.excpt, tcr.caseid
          ->  Nested Loop  (cost=100000000.00..106559138.00 rows=336
width=137)
                ->  Index Scan using parentsn_tcr_indx on testcaserun tcr
(cost=0.00..17.00 rows=115 width=11)
                      Index Cond: (parentsn = 186726)
                ->  Index Scan using parentsn_tsr_indx on teststeprun tsr
(cost=0.00..56089.00 rows=75747 width=134)
                      Index Cond: (tsr.parentsn = tcr.recordsn)
                      Filter: ((testtype <> ''::text) OR
((problemreportid)::text <> ''::text) OR (excpt <> ''::text))
(9 rows)

I'm fairly familiar with PostgreSQL, but I have no idea where to start in
trying to trouble shoot this huge performance discrepancy. The hardware
and OS are the same.

And the data size is exactly the same between the two, and the total data
size is about 7.5GB, with the largest table (teststeprun mentioned above)
being about 15 million rows.

Any pointers to where to start troubleshooting this or how to change the
query to work better would be appreciated.

cheers and thanks,
Ben Perrault
Sr. Systems Consultant
Alcatel-Lucent Internetworking

Re: Poor Performance after Upgrade

From
"Steinar H. Gunderson"
Date:
On Mon, Aug 20, 2007 at 10:17:14PM -0700, Ben Perrault wrote:
>          ->  Nested Loop  (cost=100000000.00..106559138.00 rows=336
> width=137)

This sounds very much like you're trying to force the planner. Did you set
enable_nestloop=false or something? Are there any other non-default settings
that could negatively impact planner performance?

/* Steinar */
--
Homepage: http://www.sesse.net/

Re: Poor Performance after Upgrade

From
"vincent"
Date:
> Hi,
>
> I recently inherited a very old (PostgreSQL 7.0.3) database, and have
> migrated it to 8.2.4 but have run into a performance issue.
>

Did you configure the 8.2.4 server to match the memory requirements etc of
the old server? PostgreSQL's default settings are usually not aimed at
optimal performance.



Re: Poor Performance after Upgrade

From
Russell Smith
Date:
Ben Perrault wrote:
> Hi,
>
> I recently inherited a very old (PostgreSQL 7.0.3) database, and have
> migrated it to 8.2.4 but have run into a performance issue.
>
> Basically, I did a dump and import into the new database, vacuumed and
> created fresh indexes and everything is work great except the
> following type of query (and similar):
>
>  SELECT tsr.stepId, tsr.testType, tsr.problemReportId, tsr.excpt,
> tcr.caseId
>     FROM   TestCaseRun tcr, TestStepRun tsr
>     WHERE  tcr.parentSN = 194813
>     AND    (tsr.testType        <> ''
>     OR      tsr.problemReportId <> ''
>     OR      tsr.excpt           <> '')
>     AND    tsr.parentSN =  tcr.recordSN
This query is not "similar" to the plans listed below.  It will not
result in a sort/unique unless tcr or tsr are views.

Can we also see explain analyze instead of just explain, it's much more
helpful to see what's actually going on.  Especially since the row
estimates are quite different in the two plans.

You also mentioned above that you vacuumed, did you analyze with that?
vacuum doesn't do analyze in 8.2.4.  You have to say "vacuum analyze",
or just analyze.
>
> What used to take 250ms or so on the old database now takes between 55
> and 60 Seconds.
>
> On the old database, the query plan looks like this:
>
> Unique  (cost=13074.30..13078.36 rows=32 width=68)
>   ->  Sort  (cost=13074.30..13074.30 rows=324 width=68)
>         ->  Nested Loop  (cost=0.00..13060.77 rows=324 width=68)
>               ->  Index Scan using parentsn_tcr_indx on testcaserun
> tcr (cost=0.00..444.83 rows=111 width=16)
>               ->  Index Scan using parentsn_tsr_indx on teststeprun
> tsr (cost=0.00..113.42 rows=27 width=52)
>
> And on the new database it looks like this:
>
>  Unique  (cost=206559152.10..206559157.14 rows=336 width=137)
>    ->  Sort  (cost=206559152.10..206559152.94 rows=336 width=137)
>          Sort Key: tsr.stepid, tsr.testtype, tsr.problemreportid,
> tsr.excpt, tcr.caseid
>          ->  Nested Loop  (cost=100000000.00..106559138.00 rows=336
> width=137)
>                ->  Index Scan using parentsn_tcr_indx on testcaserun
> tcr (cost=0.00..17.00 rows=115 width=11)
>                      Index Cond: (parentsn = 186726)
>                ->  Index Scan using parentsn_tsr_indx on teststeprun
> tsr (cost=0.00..56089.00 rows=75747 width=134)
>                      Index Cond: (tsr.parentsn = tcr.recordsn)
>                      Filter: ((testtype <> ''::text) OR
> ((problemreportid)::text <> ''::text) OR (excpt <> ''::text))
> (9 rows)
>
> I'm fairly familiar with PostgreSQL, but I have no idea where to start
> in trying to trouble shoot this huge performance discrepancy. The
> hardware and OS are the same.
>
> And the data size is exactly the same between the two, and the total
> data size is about 7.5GB, with the largest table (teststeprun
> mentioned above) being about 15 million rows.
>
> Any pointers to where to start troubleshooting this or how to change
> the query to work better would be appreciated.
Look at row estimates vs reality.  They should be pretty close in the
new version.
Why are the costs so high in the new plan?  100000000 happens to be a
nice number that's used when you attempt to turn off a certain type of plan.
EXPLAIN ANALZE (query) is your friend.

Regards

Russell