Thread: Postgresql performance degrading... how to diagnose the root cause
Hi, I have a postgresql database (8.4) running in production whose performance is degrading. There is no single query that underperforms, all queries do. Another interesting point is that a generic performance test (https://launchpad.net/tpc-b) gives mediocre peformance when run on the database, BUT the same test on a newly created database, on the same pg cluster, on the same tablespace, does perform good. So the problem seems to be limited to this database, even on newly created tables... What should I check to find the culprit of this degrading performance ? Franck
Attachment
On 03/29/2013 15:20, Franck Routier wrote: > Hi, > Hello, > I have a postgresql database (8.4) running in production whose > performance is degrading. > There is no single query that underperforms, all queries do. > Another interesting point is that a generic performance test > (https://launchpad.net/tpc-b) gives mediocre peformance when run on > the database, BUT the same test on a newly created database, on the > same pg cluster, on the same tablespace, does perform good. > > So the problem seems to be limited to this database, even on newly > created tables... > > What should I check to find the culprit of this degrading performance ? > Difficult to answer with so few details, but I would start by logging slow queries, and run an explain analyze on them (or use auto_explain). Check if you're CPU bound or I/O bound (top, iostats, vmstat, systat, gstat..), check your configuration (shared_buffers, effective_cache_size, work_mem, checkpoint_segments, cpu_tuple_cost, ...) > Franck > -- No trees were killed in the creation of this message. However, many electrons were terribly inconvenienced.
Re: Postgresql performance degrading... how to diagnose the root cause
Franck Routier <franck.routier 'at' axege.com> writes: > Hi, > > I have a postgresql database (8.4) running in production whose > performance is degrading. > There is no single query that underperforms, all queries do. > Another interesting point is that a generic performance test > (https://launchpad.net/tpc-b) gives mediocre peformance when run on > the database, BUT the same test on a newly created database, on the > same pg cluster, on the same tablespace, does perform good. > > So the problem seems to be limited to this database, even on newly > created tables... > > What should I check to find the culprit of this degrading performance ? I don't know that tcp-b does but it looks like bloat, provided your comparison with the newly created database is using the same amount of data in database. You may want to use this loose bloat estimate: http://wiki.postgresql.org/wiki/Show_database_bloat and then use any preferred unbloat mechanism (vacuum full, cluster, possibly also reindex), and in the long term better configure some parameters (activate autovacuum if not already the case, lower autovacuum_vacuum_cost_delay and raise autovacuum_vacuum_cost_limit, raise max_fsm_* on your 8.4 or upgrade to 9.x). -- Guillaume Cottenceau
Hi, > I don't know that tcp-b does tpcb.jar is a java implementation of the http://www.tpc.org/tpcb/ benchmark. It is not particularly representative of my workload, but gives a synthetic, db-agnostic, view of the system performance. We use it to have quick view to compare differents servers (different OS, different RDBMS, etc...). That said, the test wil create tables, load them with data, and perform some transactions on them. The point that makes me wonder what happens, is that the test run on my main database is slow, while the same test run on a database on its own is quick. This is the same postgresql cluster (same postgresql.conf), same tablespace (so same disks), same hardware obviously. Regarding the server activity, it seems quite flat : iostat shows that disks are not working much (less than 5%), top shows only one active core, and load average is well under 1... > > http://wiki.postgresql.org/wiki/Show_database_bloat How do I interpret the output of this query ? Is 1.1 bloat level on a table alarming, or quite ok ? Franck
Attachment
Hi,
I have a postgresql database (8.4) running in production whose performance is degrading.
There is no single query that underperforms, all queries do.
Another interesting point is that a generic performance test (https://launchpad.net/tpc-b) gives mediocre peformance when run on the database, BUT the same test on a newly created database, on the same pg cluster, on the same tablespace, does perform good.
Re: Postgresql performance degrading... how to diagnose the root cause
Franck Routier <franck.routier 'at' axege.com> writes: >> http://wiki.postgresql.org/wiki/Show_database_bloat > How do I interpret the output of this query ? Is 1.1 bloat level on a > table alarming, or quite ok ? I am not very used to this, but I'd start by comparing the top result in your established DB against the top result in your fresh DB. What does it say? The wiki page says it is a loose estimate, however, unusually larger tbloat and/or wastedbytes might be an indication. Of course, if you can afford it, a good old VACUUM FULL ANALYZE VERBOSE would tell you how many pages were reclaimed while rewriting the table. Otherwise, VACUUM VERBOSE on both the established DB and a backup/restore on a fresh DB also provide a helpful comparison of how many pages are used for suspected tables. -- Guillaume Cottenceau
> > I don't know that tcp-b does
>
> tpcb.jar is a java implementation of the http://www.tpc.org/tpcb/
> benchmark. It is not particularly representative of my workload, but
> gives a synthetic, db-agnostic, view of the system performance.
> We use it to have quick view to compare differents servers (different
> OS, different RDBMS, etc...).
For information, pgbench is a sort of limited TPC-B benchmark.
> That said, the test wil create tables, load them with data, and perform
> some transactions on them.
> The point that makes me wonder what happens, is that the test run on my
> main database is slow, while the same test run on a database on its own
> is quick.
Do you mean when you run it against already existing data vs its own TPC-B DB?
> This is the same postgresql cluster (same postgresql.conf), same
> tablespace (so same disks), same hardware obviously.
>
> Regarding the server activity, it seems quite flat : iostat shows that
> disks are not working much (less than 5%), top shows only one active
> core, and load average is well under 1...
>
> > http://wiki.postgresql.org/wiki/Show_database_bloat
>
> How do I interpret the output of this query ? Is 1.1 bloat level on a
> table alarming, or quite ok ?
quite ok. The threshold for maintenance task is around 20%.
I wonder about your system catalogs (pg_type, pg_attribute, ...)
You can use low level tool provided by PostgreSQL to help figure what's going wrong.
pg_buffercache, pg_stattuple come first to explore your cached data and the block content.
Or some weird database configuration ? (parameters in PostgreSQL can be set per DB, per role, etc...)
--
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation
Attachment
Hi,tpcb.jar is a java implementation of the http://www.tpc.org/tpcb/ benchmark. It is not particularly representative of my workload, but gives a synthetic, db-agnostic, view of the system performance.I don't know that tcp-b does
We use it to have quick view to compare differents servers (different OS, different RDBMS, etc...).
That said, the test wil create tables, load them with data, and perform some transactions on them.
The point that makes me wonder what happens, is that the test run on my main database is slow, while the same test run on a database on its own is quick.
This is the same postgresql cluster (same postgresql.conf), same tablespace (so same disks), same hardware obviously.
Regarding the server activity, it seems quite flat : iostat shows that disks are not working much (less than 5%),
top shows only one active core, and load average is well under 1...
Le 29/03/2013 15:20, Franck Routier a écrit : > Hi, > > I have a postgresql database (8.4) running in production whose > performance is degrading. > There is no single query that underperforms, all queries do. > Another interesting point is that a generic performance test > (https://launchpad.net/tpc-b) gives mediocre peformance when run on > the database, BUT the same test on a newly created database, on the > same pg cluster, on the same tablespace, does perform good. > > So the problem seems to be limited to this database, even on newly > created tables... > > What should I check to find the culprit of this degrading performance ? > > Franck > Just for the record, the problem turned out to be a too high default_statistics_target (4000) that was causing the planner to take up to seconds just to evaluate the better plan (probably in eqjoinsel() ). See thread titled "What happens between end of explain analyze and end of query execution ?" on this list for details. Thanks again to those who responded. Franck