Thread: Performance/Maintenance test result collection

Performance/Maintenance test result collection

From
Chris Mckenzie
Date:

Hi.

I'm trying to plan for a performance test session where a large database is subject to regular hits from my application while both regular and full database maintenance is being performed. The idea is to gain a better idea on the impact maintenance will have on regular usage, and when to reasonably schedule both regular and full maintenance.

Is the verbose option for the VACUUM command and physical disk space usage enough? What's a good rule of thumb for verifying that the space supposedly recovered from a FULL vacuum is real? I can turn on verbose for a FULL vacuum and watch the "Total free space (including removable row versions) is 7032 bytes." details, but can it be reasonably correlated against disk linux system tools? (like du) Or only as a guidance that some maintenance is being performed?

Any other stat collection points I should be watching?

Here's an example lazy vacuum verbose output from an empty schema table: (not that you guys haven't seen this stuff enough)

VACUUM VERBOSE app.queuedemails;
INFO:  vacuuming "app.queuedemails"
INFO:  index "queuedemails1" now contains 0 row versions in 1 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.

INFO:  index "queuedemails2" now contains 0 row versions in 1 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.

INFO:  "queuedemails": found 0 removable, 0 nonremovable row versions in 0 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.

INFO:  vacuuming "pg_toast.pg_toast_17595"
INFO:  index "pg_toast_17595_index" now contains 0 row versions in 1 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.

INFO:  "pg_toast_17595": found 0 removable, 0 nonremovable row versions in 0 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.

THANKS!!!

- Chris
 

Re: Performance/Maintenance test result collection

From
"Jim C. Nasby"
Date:
On Wed, May 17, 2006 at 01:50:22PM -0400, Chris Mckenzie wrote:
> Hi.
>
> I'm trying to plan for a performance test session where a large database is
> subject to regular hits from my application while both regular and full
> database maintenance is being performed. The idea is to gain a better idea
> on the impact maintenance will have on regular usage, and when to reasonably
> schedule both regular and full maintenance.

What do you mean by "regular and full maintenance"? Do you mean VACUUM
FULL?

If you're vacuuming appropriately you shouldn't have any need to ever
VACUUM FULL...
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: Performance/Maintenance test result collection

From
Chris Mckenzie
Date:

Yes, regular versus full vacuum. Thanks for the comment but I was hoping to come to that conclusion on my own by observing the affects of the different vacuums.

My original question was guidance on collecting data for confirmation on the impact that maintenance of a large database (as a result of my applications regular usage over a period of time) has.

I can du the various tables and compare their size before/after against the verbose output of a VACUUM FULL. I can use sar during all of this to monitor cpu and i/o activity. I can turn on transaction logging once I get a better idea of maintenance impact on my hardware so identify the biggest transactions that might statement timeout if a VACUUM was running at the same time.

Any suggestions or comments related to collection of this type of data would be helpful. I've already read the Postges 7.4 (yes, I'm stuck on 7.4) manual, I was hoping for this mail-list' wisdom to supply me with some tips that can only be learnt through painful experience. :-)

Thanks.

- Chris

-----Original Message-----
From: Jim C. Nasby [mailto:jnasby@pervasive.com]
Sent: Wednesday, May 17, 2006 3:25 PM
To: Chris Mckenzie
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Performance/Maintenance test result collection

On Wed, May 17, 2006 at 01:50:22PM -0400, Chris Mckenzie wrote:
> Hi.
>
> I'm trying to plan for a performance test session where a large
> database is subject to regular hits from my application while both
> regular and full database maintenance is being performed. The idea is
> to gain a better idea on the impact maintenance will have on regular
> usage, and when to reasonably schedule both regular and full
> maintenance.

What do you mean by "regular and full maintenance"? Do you mean VACUUM FULL?

If you're vacuuming appropriately you shouldn't have any need to ever VACUUM FULL...
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: Performance/Maintenance test result collection

From
"Jim C. Nasby"
Date:
On Thu, May 18, 2006 at 11:20:17AM -0400, Chris Mckenzie wrote:
> Yes, regular versus full vacuum. Thanks for the comment but I was hoping to
> come to that conclusion on my own by observing the affects of the different
> vacuums.
>
> My original question was guidance on collecting data for confirmation on the
> impact that maintenance of a large database (as a result of my applications
> regular usage over a period of time) has.
>
> I can du the various tables and compare their size before/after against the
> verbose output of a VACUUM FULL. I can use sar during all of this to monitor
> cpu and i/o activity. I can turn on transaction logging once I get a better
> idea of maintenance impact on my hardware so identify the biggest
> transactions that might statement timeout if a VACUUM was running at the
> same time.

Well, vacuum full re-writes the table completely from scratch. Lazy
vacuum reads the entire table (just like full), but only writes out
pages that have dead space on them.

But if you're wondering about the impact that will have on your
application, you can stop wondering, because vacuum full will
essentially shut your application down because it locks out use of the
table while it's being vacuumed.

> Any suggestions or comments related to collection of this type of data would
> be helpful. I've already read the Postges 7.4 (yes, I'm stuck on 7.4)
> manual, I was hoping for this mail-list' wisdom to supply me with some tips
> that can only be learnt through painful experience. :-)

If you're stuck on 7.4, at least make sure you're using the most recent
version. Otherwise you're exposing yourself to a number of data loss
bugs.

As for vacuuming, it depends a lot on what your application is doing. If
you have frequent-enough slow periods (like at night), you can probably
schedule a database-wide vacuum during that time, possibly supplimented
by vacuums on critical tables during the day. If you have something
closer to a 24x7 load then pg_autovacuum is probably your best bet,
along with vacuum_cost_delay (if that's available in 7.4; it's been so
long I don't remember).

There's a few articles in our knowledge base
(http://www.pervasivepostgres.com/kb/index.asp) that might be worth
reading as well (search for 'vacuum'). In particular, "Is PostgreSQL
remembering what I vacuumed" has some critical information about
managing the free space map.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461