Thread: Profiling

Profiling

From: "Josh Berkus"
Folks,

I had a request from one of the SF-PUG members that I found
interesting.  She suggested that we post the specs of some of the
PostgreSQL servers that we administrate, their statistics, and some
comments on how they perform.  I'll start it off with an example:

Small Cheap Workgroup Server
AMD Athlon 700MHz
256MB SDRAM
Dual 20GB 7200RPM IDE drives (1 IBM, 1 Quantum)
   with OS, Apache, XLog on 1st drive,
   Postgres Database, Swap on 2nd drive
Running SuSE Linux 7.3
  Apache 1.3.x
  PHP 4.0.x
  PostgreSQL 7.1.3
3-8 concurrent users on intranet application
with large transactions but low transaction frequency
(est. 20-300 reads and 5-80 writes per hour)
on a small database (< 20,000 records combined in main tables)

Performance assessment:  Adequate, reasonably fast
on selects except aggregates, commits taking 5-20 seconds
during medium activity.  Same system with a Celeron 500
previously demonstrated horrible performance (often > 45 seconds
on selects) on complex queries, such as one view with
custom aggregates.
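
For illustration, a custom aggregate of the sort I mean looks roughly
like this in 7.1 syntax (a toy example against a made-up database, not
the production code):

psql mydb <<'EOF'
-- toy custom aggregate: comma-separated concatenation of text values
CREATE FUNCTION comma_cat(text, text) RETURNS text AS
  'SELECT CASE WHEN $1 = '''' THEN $2
          ELSE $1 || '','' || $2 END'
  LANGUAGE 'sql';

CREATE AGGREGATE list (
  BASETYPE = text,
  SFUNC    = comma_cat,
  STYPE    = text,
  INITCOND = ''
);
EOF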


-Josh Berkus


______AGLIO DATABASE SOLUTIONS___________________________
                                       Josh Berkus
  Complete information technology      josh@agliodbs.com
   and data management solutions       (415) 565-7293
  for law firms, small businesses        fax 621-2533
    and non-profit organizations.      San Francisco

Re: Profiling

From: "Josh Berkus"
Folks,

> I had a request from one of the SF-PUG members that I found
> interesting.  She suggested that we post the specs of some of the
> PostgreSQL servers that we administrate, their statistics, and some
> comments on how they perform.  I'll start it off with an example:
>
> Small Cheap Workgroup Server
> AMD Athlon 700MHz
> 256MB SDRAM
> Dual 20GB 7200RPM IDE drives (1 IBM, 1 Quantum)
>    with OS, Apache, XLog on 1st drive,
>    Postgres Database, Swap on 2nd drive
> Running SuSE Linux 7.3
>   Apache 1.3.x
>   PHP 4.0.x
>   PostgreSQL 7.1.3
> 3-8 concurrent users on intranet application
> with large transactions but low transaction frequency
> (est. 20-300 reads and 5-80 writes per hour)
> on a small database (< 20,000 records combined in main tables)
>
> Performance assessment:  Adequate, reasonably fast
> on selects except aggregates, commits taking 5-20 seconds
> during medium activity.  Same system with a Celeron 500
> previously demonstrated horrible performance (often > 45 seconds
> on selects) on complex queries, such as one view with
> custom aggregates.

Oh, and I forgot:

shared_buffers = 4096
sort_mem = 2048
wal_files = 8
wal_sync_method = fdatasync
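
For scale: shared_buffers is counted in 8K pages and sort_mem in
kilobytes, so those settings work out to:

# quick math on the settings above (8 kB pages)
echo "shared_buffers: $((4096 * 8 / 1024)) MB"   # 32 MB of the 256 MB RAM
echo "sort_mem: 2048 kB per sort, per backend"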

-Josh

______AGLIO DATABASE SOLUTIONS___________________________
                                       Josh Berkus
  Complete information technology      josh@agliodbs.com
   and data management solutions       (415) 565-7293
  for law firms, small businesses        fax 621-2533
    and non-profit organizations.      San Francisco

Re: Profiling

From: Justin Clift
Josh Berkus wrote:
> Folks,
>
>
>>I had a request from one of the SF-PUG members that I found
>>interesting.  She suggested that we post the specs of some of the
>>PostgreSQL servers that we administrate, their statistics, and some
>>comments on how they perform.  I'll start it off with an example:
>>
>>Small Cheap Workgroup Server
>>AMD Athlon 700MHz
>>256MB SDRAM
>>Dual 20GB 7200RPM IDE drives (1 IBM, 1 Quantum)
>>   with OS, Apache, XLog on 1st drive,
>>   Postgres Database, Swap on 2nd drive
>>Running SuSE Linux 7.3
>>  Apache 1.3.x
>>  PHP 4.0.x
>>  PostgreSQL 7.1.3
>>3-8 concurrent users on intranet application
>>with large transactions but low transaction frequency
>>(est. 20-300 reads and 5-80 writes per hour)
>>on a small database (< 20,000 records combined in main tables)
>>
>>Performance assessment:  Adequate, reasonably fast
>>on selects except aggregates, commits taking 5-20 seconds
>>during medium activity.  Same system with a Celeron 500
>>previously demonstrated horrible performance (often > 45 seconds
>>on selects) on complex queries, such as one view with
>>custom aggregates.
>
>
> Oh, and I forgot:
>
> shared_buffers = 4096
> sort_mem = 2048
> wal_files = 8
> wal_sync_method = fdatasync

Hi Josh,

Want to CVS checkout the latest OSDB source code
(http://www.sf.net/projects/osdb), generate say a 100MB database, and
do a multiuser test of 20 or so users on it?

:-)
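
The anonymous checkout goes something like this (the module name is my
guess):

cvs -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/osdb login
# just press Enter at the password prompt
cvs -z3 -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/osdb co osdb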

Regards and best wishes,

Justin Clift


> -Josh
>
> ______AGLIO DATABASE SOLUTIONS___________________________
>                                        Josh Berkus
>   Complete information technology      josh@agliodbs.com
>    and data management solutions       (415) 565-7293
>   for law firms, small businesses        fax 621-2533
>     and non-profit organizations.      San Francisco


--
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
- Indira Gandhi


Re: Profiling

From: Shridhar Daithankar
On Tuesday 17 December 2002 09:33 am, you wrote:
> Folks,
>
> I had a request from one of the SF-PUG members that I found
> interesting.  She suggested that we post the specs of some of the
> PostgreSQL servers that we administrate, their statistics, and some
> comments on how they perform.  I'll start it off with an example:

OK my take.

P-II 450MHz / 256MB / 20GB IDE, Mandrake 8.2, PostgreSQL 7.2.x

pgbench, with 10M records / 10,000 transactions / 10 users.

Throughput: 25 tps.

The rest of the settings were defaults. I am not too sure of the
details, as this was more than 4 months ago, and that machine runs
Windows now.
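
If I remember right, the invocation was roughly this (database name
made up; pgbench loads 100,000 'accounts' rows per scale unit, so 10M
records is scale 100):

createdb bench
pgbench -i -s 100 bench         # initialize: about 10M rows in 'accounts'
pgbench -c 10 -t 1000 bench     # 10 clients x 1000 transactions = 10,000 total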

Same machine, another benchmark: a banking application simulation.

shared_buffers 14000
Number of records: 100 in one table, continuously updated, plus a log
table continuously inserted into.

Throughput: 200 tps.
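
The simulated transaction was along these lines (a sketch from memory;
table and column names are made up):

psql bank <<'EOF'
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 42;
INSERT INTO txn_log (account_id, delta, logged_at)
    VALUES (42, -100, now());
COMMIT;
EOF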

 HTH

 Shridhar


Re: Profiling

From: Cory 'G' Watson
On Tuesday, December 17, 2002, at 12:57 AM, Shridhar Daithankar wrote:

> On Tuesday 17 December 2002 09:33 am, you wrote:
>> Folks,
>>
>> I had a request from one of the SF-PUG members that I found
>> interesting.  She suggested that we post the specs of some of the
>> PostgreSQL servers that we administrate, their statistics, and some
>> comments on how they perform.  I'll start it off with an example:
>

My take:

Dual PIII 1.12GHz, 3GB RAM, 5 x 36GB drives in RAID 5 with a spare,
RedHat 7.2, PostgreSQL 7.3

pgbench, default settings: 252 tps including connections, 409 tps
excluding connections

Day to day, it runs a monitoring/historical analysis tool of my design
which gathers metrics from around 30 hosts (they report every 10
minutes, by their clock).  It has 3,689,652 rows as of right now in the
'metrics' table, which is indexed by timestamp.

My 'main' query is in the form of:

SELECT timestamp, data FROM metrics WHERE resgroupid=? and hostid=? AND
timestamp BETWEEN ? AND ? ORDER BY timestamp

Index is on timestamp.
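
For reference, the relevant bits look roughly like this (column types
are from memory, so don't hold me to them):

psql loggerithim <<'EOF'
-- sketch of the table and index described above
CREATE TABLE metrics (
    resgroupid   integer,
    hostid       integer,
    "timestamp"  timestamp,
    data         numeric
);
CREATE INDEX metrics_timestamp_idx ON metrics ("timestamp");
EOF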

This query generally takes about half a second for 24 hours' worth of
data.  I just ran a 240-hour query on a test database with about 20,000
rows, and the result took 2998 ms.

Things slowed to a crawl about 2 weeks ago, so I upgraded to 7.3 and
saw a huge improvement.  I believe part of this might have been due to
the recreation of the database, similar to a CLUSTER.  Query times are
not degrading, but CPU usage is steadily climbing: user time has risen
from 5% to 15% over the last 240 hours.  Attached is output from my
monitoring program (well, the new improved Java version) showing CPU
performance over the last 240 hours.

shared_buffers = 98304
sort_mem = 1600
fsync = false

Everything else is default, recommendations welcome. ;)
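
For scale, assuming 8 kB pages, that shared_buffers setting works out
to a quarter of the RAM (and fsync = false trades crash safety for
speed):

echo "shared_buffers: $((98304 * 8 / 1024)) MB"   # 768 MB of the 3 GB RAM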



Cory 'G' Watson

[Attachment: CPU usage graph over the last 240 hours]

Re: Profiling

From: Shridhar Daithankar
On Tuesday 17 December 2002 06:59 pm, you wrote:
> shared_buffers = 98304
> sort_mem = 1600
> fsync = false
>
> Everything else is default, recommendations welcome. ;)

What is the vacuum frequency?

 Shridhar


Re: Profiling

From: Cory 'G' Watson
On Tuesday, December 17, 2002, at 07:35 AM, Shridhar Daithankar wrote:

> On Tuesday 17 December 2002 06:59 pm, you wrote:
>> shared_buffers = 98304
>> sort_mem = 1600
>> fsync = false
>>
>> Everything else is default, recommendations welcome. ;)
>
> What is the vacuum frequency?

Every morning.  This db is almost exclusively INSERT and SELECT.  Well,
I take that back, a single table gets UPDATEs rather frequently.
Otherwise, INSERT only.

Cory 'G' Watson


Re: Profiling

From: Shridhar Daithankar
On Tuesday 17 December 2002 07:13 pm, you wrote:
> On Tuesday, December 17, 2002, at 07:35 AM, Shridhar Daithankar wrote:
> > What is the vacuum frequency?
>
> Every morning.  This db is almost exclusively INSERT and SELECT.  Well,
> I take that back, a single table gets UPDATEs rather frequently.
> Otherwise, INSERT only.

I recommend a VACUUM ANALYZE every 1000-2000 updated records for the
table that gets updated.  It should boost performance considerably.

 Shridhar

Re: Profiling

From: Cory 'G' Watson
On Tuesday, December 17, 2002, at 07:49 AM, Shridhar Daithankar wrote:

> On Tuesday 17 December 2002 07:13 pm, you wrote:
>> On Tuesday, December 17, 2002, at 07:35 AM, Shridhar Daithankar wrote:
>>> What is the vacuum frequency?
>>
>> Every morning.  This db is almost exclusively INSERT and SELECT.
>> Well,
>> I take that back, a single table gets UPDATEs rather frequently.
>> Otherwise, INSERT only.
>
> I recommend a VACUUM ANALYZE every 1000-2000 updated records for the
> table that gets updated.  It should boost performance considerably.

By my math, I'll need to vacuum once every hour or so.  Cron, here I
come.

vacuumdb --table cached_metrics loggerithim
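
As a crontab entry, that's something like (the vacuumdb path is a
guess):

# vacuum the hot table at the top of every hour
0 * * * * /usr/bin/vacuumdb --table cached_metrics loggerithim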

I assume I do not need a --analyze, since that table has no indexes.
Should I vacuum the entire DB?

Any other settings I should look at?  Note that I'm not necessarily
having any problems at present, but one can always tune.  This DB is
used with a web app (mod_perl/DBI) at the moment, but is moving to a
Java Swing client, which will give me much more data about performance.

Cory 'G' Watson


Re: Profiling

From: "Shridhar Daithankar"
On 17 Dec 2002 at 8:20, Cory 'G' Watson wrote:
> By my math, I'll need to vacuum once every hour or so.  Cron, here I
> come.
>
> vacuumdb --table cached_metrics loggerithim

http://gborg.postgresql.org/project/pgavd/projdisplay.php

Yeah, yeah, I wrote that.  Use CVS as usual; no release as yet.


> I assume I do not need a --analyze, since that table has no indexes.
> Should I vacuum the entire DB?

You need analyze to keep vacuum non-locking, I assume.  And there is no
need to vacuum the entire DB.

HTH

Bye
 Shridhar

--
paycheck:  The weekly $5.27 that remains after deductions for federal
withholding, state withholding, city withholding, FICA, medical/dental,
long-term disability, unemployment insurance, Christmas Club, and
payroll savings plan contributions.


Re: Profiling

From: Tom Lane
"Cory 'G' Watson" <gphat@cafes.net> writes:
> I assume I do not need a --analyze, since that table has no indexes.

Whether you need analyze or not has nothing to do with whether there
are indexes.  You probably don't need it once an hour, but maybe once
a day would be good.

> Should I vacuum the entire DB?

Overkill; just get the heavily-updated table(s).  A DB-wide vacuum must
be done occasionally, but again once-a-day would be plenty.

> Any other settings I should look at?

Free space map (fsm) settings must be adequate to keep track of the free
space in your tables.
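
In postgresql.conf terms, something like this (numbers are illustrative
only; size them to your update rate):

max_fsm_relations = 1000     # more than your total count of tables and indexes
max_fsm_pages = 100000       # roughly the pages that gain free space between vacuums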

However, all of this relates only to keeping performance good on the
table with lots of updates.  If you are seeing progressive degradation
on a table that only gets INSERTs, then there's something else going on.
AFAIR you didn't show us an EXPLAIN ANALYZE for the principal query?

            regards, tom lane

Re: Profiling

From: "scott.marlowe"
OK, my turn.

We have two main servers that are identical: one is the online server,
the other is the hot spare.  Their specs:

Dual PIII-750, 1.5GB RAM, and dual 18GB 10kRPM UW SCSI drives.
OS on one drive, postgresql on the other.

Interesting postgresql.conf entries:

max_connections = 128
shared_buffers = 32768
max_fsm_relations = 10000
sort_mem = 2048
vacuum_mem = 8192
cpu_tuple_cost = 0.01
cpu_index_tuple_cost = 0.0001
cpu_operator_cost = 0.05
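
The cpu_* entries above shift the planner's costing away from the
shipped defaults (cheaper index tuples, pricier operators).  A quick
per-session way to see whether they change a given plan, assuming a
pgbench 'accounts' table is handy:

psql bench <<'EOF'
SET cpu_index_tuple_cost = 0.001;    -- shipped default
EXPLAIN SELECT * FROM accounts WHERE aid < 1000;
SET cpu_index_tuple_cost = 0.0001;   -- our tuned value
EXPLAIN SELECT * FROM accounts WHERE aid < 1000;
EOF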

pgbench -c 4 -t 200 delivers about 240 tps.

Performance is outstanding.  This machine runs Apache, OpenLDAP,
RealServer 1, as well as PostgreSQL.  All non-database data is stored
on a NAS, so the local drives are only used for swap and PostgreSQL.
Average load is about 40 to 200 reads per minute, with only a handful
of writes per minute (1 to 5 max).  Most data is loaded in nightly runs
out of the mainframe and a few other systems, for things like the
company phonebook and LDAP.

My test servers:

Server A: Dual PPro 200 with 256MB RAM, 6 x 4GB 10kRPM UW SCSI drives
(3 Quantum, 3 Seagate), and 2 x 80GB 7200RPM IDE drives.

Data is generally stored on the pair of 80GB drives, because the 4GB
SCSIs just aren't big enough.  The 80GB IDEs are set up as two 40GB
mirrors (i.e., they're split in half), with the other halves used to
store backups and such.

shared_buffers = 5000

pgbench -c 4 -t 200 yields about 80 tps.

Performance is actually quite good, and this is a box we bought in 1997.

Server B (my workstation): Celeron 1.1GHz, with 512MB RAM, a 40GB IDE
@ 7200RPM, and a 17GB IDE @ 5400RPM.

shared_buffers = 4096

pgbench -c 4 -t 200 yields about 75 tps.  Yes, my dual PPro 200 outruns
this box.  But then again, my workstation has KDE up and running, with
Mozilla, the xmms MP3 player going, and a couple of other programs
running as well.

All of these boxes are / were heavily tested before deployment, and we
have never had a problem with postgresql on any of them.