Thread: How to debug performance problems

How to debug performance problems

From: Andreas Tille
Date: Mon, 19 Feb 2007 11:50 +0100
Hi,

I'm running a web application using Zope that obtains all data
from a PostgreSQL 7.4 database (Debian Sarge system with package
7.4.7-6sarge4 on an "older" Sparc machine, equipped with 2GB
memory and two processors, an E250 server).  Once I did some performance
tuning and found out that

      max_connections = 256
      shared_buffers = 131072
      sort_mem = 65536

would help for a certain application (which is no longer running on
this machine), but I left these parameters in
/etc/postgresql/postgresql.conf untouched.

My web application was running fine for years without any problem
and the performance was satisfying.  Some months ago I added a
table containing 4,500,000 data rows (all other tables in use are
smaller by orders of magnitude), so nothing very large, and this
table is not accessed directly by the web application (just some
generated caching tables are updated once a day).  Some functions
and small tables were added as well, but there has been a stable
core over several years.

For about two weeks now the application has been *drastically* slower
and I urgently need to bring back the old performance.  As I said,
I'm talking about functions accessing tables that have not grown
over several years and should behave more or less the same.

I wonder whether adding tables and functions could have an influence
on other, untouched parts, and how to find out what makes things slow
that worked reliably for years.  My first try was to switch back to
the default settings of the current Debian package maintainer's
/etc/postgresql/postgresql.conf while leaving the parameters above
untouched, but this did not change anything.

I'm quite at a loss even as to how to explain the problem correctly,
and I'm hoping you will at least find enough information here to ask
me "the right questions" so I can provide what you need to track
down the performance problem.

Kind regards and thanks for any help

            Andreas.

Re: How to debug performance problems

From: "Craig A. James"
Date: Mon, 19 Feb 2007 10:02:46 -0800
Andreas Tille wrote:
> My web application was running fine for years without any problem
> and the performance was satisfying.  Some months ago I added a
> table containing 4,500,000 data rows ...
>
> For about two weeks now the application has been *drastically* slower
> and I urgently need to bring back the old performance.  As I said,
> I'm talking about functions accessing tables that have not grown
> over several years and should behave more or less the same.

Don't assume that the big table you added is the source of the problem.
It might be, but more likely it's something else entirely.  You indicated
that the problem didn't coincide with creating the large table.

There are a number of recurring themes on this discussion group:

  * A long-running transaction keeps vacuum from working.

  * A table grows just enough to pass a threshold in the
    planner and a drastically different plan is generated.

  * An index has become bloated and/or corrupted, and you
    need to run the REINDEX command.

And several other common problems.

The first thing is to find out which query is taking a lot of time.  I'm
no expert, but there have been several explanations on this forum recently
of how to find your top time-consuming queries.  Once you find them,
EXPLAIN ANALYZE should get you started.
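
For example, once you have a suspect statement, a first pass could look
like this (the table and column names below are made up for illustration;
note that EXPLAIN ANALYZE actually executes the statement):

    -- Show the plan the planner chose, with actual row counts and timings.
    EXPLAIN ANALYZE
    SELECT c.name, count(*)
      FROM orders o
      JOIN customers c ON c.id = o.customer_id
     WHERE o.created > now() - interval '1 day'
     GROUP BY c.name;

Comparing the estimated row counts in the plan against the actual ones
is usually the quickest way to spot stale statistics or a mischosen plan.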

Craig

Re: How to debug performance problems

From: Jeff Davis
Date: Mon, 19 Feb 2007 12:18
On Mon, 2007-02-19 at 11:50 +0100, Andreas Tille wrote:
> Hi,
>
> I'm running a web application using Zope that obtains all data
> from a PostgreSQL 7.4 database (Debian Sarge system with package
> 7.4.7-6sarge4 on an "older" Sparc machine, equipped with 2GB

Upgrade to 8.2.3 if possible, or at least to 7.4.16.

This is a basic question, but do you VACUUM ANALYZE regularly? 7.4 is
before autovacuum was integrated in the core. If you don't do this you
could have a lot of wasted space in your tables causing unneeded I/O,
and the planner might be making bad plans.
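
A minimal way to cover this on a pre-autovacuum release is a nightly
cron job for the postgres user; the schedule and database name here are
only placeholders:

    # VACUUM reclaims dead-row space, ANALYZE refreshes planner statistics.
    # Run daily at 03:00 against the application database "mydb".
    0 3 * * *  vacuumdb --analyze mydb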

> memory and two processors, an E250 server).  Once I did some performance
> tuning and found out that
>
>       max_connections = 256
>       shared_buffers = 131072
>       sort_mem = 65536
>

You're allocating 50% of the physical memory to shared buffers. That's
not necessarily too much, but that's on the high side of the normal
range.

Does the total size of all of your tables and indexes add up to enough
to exhaust your physical memory? Check to see if you have any
exceptionally large tables or indexes. You can do that easily with
pg_relation_size('a_table_or_index') and pg_total_relation_size('a_table').
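
Note that those two functions only appeared in 8.1, so on the 7.4
installation above they are one more argument for upgrading.  On a new
enough server, a sketch like this lists the biggest offenders:

    -- Ten largest tables by total on-disk size (including indexes
    -- and TOAST data).  Requires 8.1 or later.
    SELECT relname,
           pg_size_pretty(pg_total_relation_size(oid)) AS total_size
      FROM pg_class
     WHERE relkind = 'r'
     ORDER BY pg_total_relation_size(oid) DESC
     LIMIT 10;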

> For about two weeks now the application has been *drastically* slower
> and I urgently need to bring back the old performance.  As I said,
> I'm talking about functions accessing tables that have not grown
> over several years and should behave more or less the same.
>
> I wonder whether adding tables and functions could have an influence
> on other, untouched parts, and how to find out what makes things slow
> that worked reliably for years.  My first try

You need to provide queries, and also define "slower". Set
log_min_duration_statement to some positive value (I often start with
1000) to try to catch the slow statements in the logs. Once you have
found the slow statements, do an EXPLAIN and an EXPLAIN ANALYZE on those
statements. That will tell you exactly what you need to know.
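
As a sketch, the relevant postgresql.conf line would be:

    # Log any statement that runs for 1000 ms or longer.
    # -1 disables the feature; 0 logs every statement with its duration.
    log_min_duration_statement = 1000

After changing it, reload the server (e.g. with pg_ctl reload) and watch
the logs for the statements that cross the threshold.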

Regards,
    Jeff Davis



Re: How to debug performance problems

From: Scott Marlowe
On Mon, 2007-02-19 at 12:18, Jeff Davis wrote:
> On Mon, 2007-02-19 at 11:50 +0100, Andreas Tille wrote:
> > Hi,
> >
> > I'm running a web application using Zope that obtains all data
> > from a PostgreSQL 7.4 database (Debian Sarge system with package
> > 7.4.7-6sarge4 on an "older" Sparc machine, equipped with 2GB
>
> Upgrade to 8.2.3 if possible, or at least to 7.4.16.

What Jeff said ++

>
> This is a basic question, but do you VACUUM ANALYZE regularly? 7.4 is
> before autovacuum was integrated in the core. If you don't do this you
> could have a lot of wasted space in your tables causing unneeded I/O,
> and the planner might be making bad plans.

Look into VACUUM FULL followed by REINDEX to fix the bloat.  Then
schedule regular vacuums (regular, not full).
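
A one-time cleanup might look like the following sketch (the table name
is a placeholder, and VACUUM FULL takes an exclusive lock, so do this in
a maintenance window):

    -- Rewrite the table to reclaim bloated space (exclusive lock!).
    VACUUM FULL VERBOSE some_big_table;
    -- Rebuild its indexes, which VACUUM FULL tends to leave bloated.
    REINDEX TABLE some_big_table;
    -- From then on, a scheduled plain VACUUM ANALYZE keeps it in shape.
    VACUUM ANALYZE some_big_table;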

> > memory and two processors, an E250 server).  Once I did some performance
> > tuning and found out that
> >
> >       max_connections = 256
> >       shared_buffers = 131072
> >       sort_mem = 65536
> >
>
> You're allocating 50% of the physical memory to shared buffers. That's
> not necessarily too much, but that's on the high side of the normal
> range.

For 7.4 that's far too much.  Very few installations running 7.4 will be
faster as you go past 10000 to 20000 buffers.  131072 is probably
slowing the machine down instead of speeding it up, as the buffer-cache
algorithm in 7.4 did not handle large amounts of memory well.
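
As a sketch, a more 7.4-friendly configuration for this box might look
like this (the exact values are only examples within the range above;
the rest of the RAM is better left to the OS filesystem cache):

    shared_buffers = 16384      # 16384 x 8kB buffers = 128MB
    sort_mem = 8192             # per-sort limit in kB; 65536 with up to
                                # 256 connections risks exhausting RAM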


Re: How to debug performance problems

From: Ray Stell
I'd like to have a toolbox prepared for when performance goes south.
I'm clueless.  Would someone mind providing some detail about how to
measure these four items Craig listed:

1. The first thing is to find out which query is taking a lot of time.

2. A long-running transaction keeps vacuum from working.

3. A table grows just enough to pass a threshold in the
   planner and a drastically different plan is generated.

4. An index has become bloated and/or corrupted, and you
   need to run the REINDEX command.

Thx.

On Wed, Aug 30, 2006 at 11:45:06AM -0700, Jeff Frost wrote:
> On Wed, 30 Aug 2006, Joe McClintock wrote:
>
> > I ran a vacuum, analyze and reindex on the database with no change in
> > performance; query time was still 37+ sec, a little worse.  On our test
> > system I found that a pg_dump from production and then a restore brought
> > the database back to full performance.  So in desperation I shut down the
> > production application, backed up the production database, renamed the
> > production db, created a new empty production db, and restored the
> > production backup to the empty db.  After a successful db restore and
> > restart of the web application, everything was then up and running like
> > a top.
>
> Joe,
>
> I would guess that since the dump/restore yielded good performance once
> again, a VACUUM FULL would have also fixed the problem.  How are your FSM
> settings in the conf file?  Can you run VACUUM VERBOSE and send us the last
> 10 or so lines of output?
>
> A good article on FSM settings can be found here:
>
> http://www.pervasive-postgres.com/instantkb13/article.aspx?id=10087&cNode=5K1C3W
>
> You probably should consider setting up autovacuum and definitely should
> upgrade to at least 8.0.8 if not 8.1.4 when you get the chance.
>
> When you loaded the new data did you delete or update old data or was it
> just a straight insert?
>
> --
> Jeff Frost, Owner     <jeff@frostconsultingllc.com>
> Frost Consulting, LLC     http://www.frostconsultingllc.com/
> Phone: 650-780-7908    FAX: 650-649-1954
>


--
You have no chance to survive make your time.

Re: How to debug performance problems

From: "Craig A. James"
Date: Wed, 21 Feb 2007 08:09:49 -0800
Ray,

> I'd like to have a toolbox prepared for when performance goes south.
> I'm clueless.  Would someone mind providing some detail about how to
> measure these four items Craig listed:

I hope I didn't give the impression that these were the only things to
look at ... those four items just popped into my head, because they've
come up repeatedly in this forum.  There are surely more things that
could be suspect; perhaps others could add to your list.

You can find the answers to each of the four topics I mentioned by
looking through the archives of this list.  It's a lot of work.  It
would be really nice if there were some full-time employee somewhere
whose job was to monitor this group and pull out common themes to be
put into a nice, tidy manual.  But this is open-source development,
and there is no such person, so you have to dig in and find it
yourself.

Craig

Re: How to debug performance problems

From: Ray Stell
On Wed, Feb 21, 2007 at 08:09:49AM -0800, Craig A. James wrote:
> I hope I didn't give the impression that these were the only things to look
> at ... those four items just popped into my head, because they've come up
> repeatedly in this forum.  There are surely more things that could be
> suspect; perhaps others could add to your list.

I'm only clueless about the details of pg, not db perf concepts.  Really,
a mechanism to determine where the system is spending the response
time is key.  As you pointed out, the added table may not be the issue.
In fact, if you can't measure where the db time is being spent, you
will be lucky to fix a performance issue, since you don't really know
which resources need to be addressed.
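
For example, a first probe along those lines (assuming the stats
collector is running with stats_command_string enabled, so that
current_query is populated) could be:

    -- Currently running statements, oldest first.  A very old
    -- query_start can also point at a long-running transaction
    -- that is keeping VACUUM from reclaiming dead rows.
    SELECT procpid, usename, query_start, current_query
      FROM pg_stat_activity
     WHERE current_query <> '<IDLE>'
     ORDER BY query_start;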


> so you have to dig in and find it yourself.

this afternoon, maybe.

Re: How to debug performance problems

From: Mark Stosberg
Ray Stell wrote:
> I'd like to have a toolbox prepared for when performance goes south.
> I'm clueless.  Would someone mind providing some detail about how to
> measure these four items Craig listed:
>
> 1. The first thing is to find out which query is taking a lot of time.
>
> 2. A long-running transaction keeps vacuum from working.
>
> 3. A table grows just enough to pass a threshold in the
>    planner and a drastically different plan is generated.

I just ran into a variation of this:

3.5 A table grows so large that VACUUMing it takes extremely long,
    interfering with the general performance of the system.

In our case, we think the table had about 36 million rows when it hit
that threshold.

I'm now working on a better solution for that table.
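
A rough way to spot such a table before it bites (reltuples and relpages
are planner estimates, refreshed by VACUUM and ANALYZE, so treat the
numbers as approximate):

    -- Largest tables by estimated page count (pages are 8kB each).
    SELECT relname, relpages, reltuples::bigint AS est_rows
      FROM pg_class
     WHERE relkind = 'r'
     ORDER BY relpages DESC
     LIMIT 10;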

  Mark