Thread: Optimize

Optimize

From
Torsten Schulz
Date:
Yes, I know: very difficult question, but I don't know what to do now.

Our Server:
Dual-CPU with 1.2 GHz
1.5 GB RAM

Our Problem: We run a community site. Between 19:00 and 21:00 we have >350
users online. During that time the database is very slow, yet we still see
~20-30% idle time per CPU.

Does anyone have an idea what the best configuration for that server would be?

Many greetings,
T. Schulz (with very bad English, I know)


Re: Optimize

From
Gaetano Mendola
Date:
Torsten Schulz wrote:

> Yes, I know: very difficult question, but I don't know what to do now.
>
> Our Server:
> Dual-CPU with 1.2 GHz
> 1.5 GB RAM
>
> Our Problem: We run a community site. Between 19:00 and 21:00 we have >350
> users online. During that time the database is very slow, yet we still see
> ~20-30% idle time per CPU.

May we know which Postgres version you are running, and can we
see the queries that run slowly?
It is also useful to take a look at your postgresql.conf.
Doing   select * from pg_stat_activity   shows the
queries that are currently running on your server; you can then
run EXPLAIN ANALYZE on them to see which one is the
bottleneck. If you are running 7.4 you can also see the
total duration of each query in the log.
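
For example (just a sketch; current_query only shows up if
stats_command_string is enabled in postgresql.conf):

  select procpid, usename, current_query from pg_stat_activity;
  explain analyze ... ;    -- substitute one of the queries you see there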

Let us know.


Regards
Gaetano Mendola



Re: Optimize

From
Torsten Schulz
Date:
Gaetano Mendola wrote:

> Torsten Schulz wrote:
>
>> Yes, I know: very difficult question, but I don't know what to do now.
>>
>> Our Server:
>> Dual-CPU with 1.2 GHz
>> 1.5 GB RAM
>>
>> Our Problem: We run a community site. Between 19:00 and 21:00 we have
>> >350 users online. During that time the database is very slow, yet we
>> still see ~20-30% idle time per CPU.
>
>
> May we know which Postgres version you are running, and can we
> see the queries that run slowly?

Postgres: 7.3.2
Query: All queries

Configuration:
max_connections = 1000       # must be this high; below 500 we get connection errors
shared_buffers = 5000        # 2*max_connections, min 16
max_fsm_relations = 1000    # min 10, fsm is free space map
max_fsm_pages = 2000000      # min 1000, fsm is free space map
max_locks_per_transaction = 64 # min 10
wal_buffers = 2000            # min 4

sort_mem = 32768             # min 32
vacuum_mem = 32768           # min 1024

fsync = false

enable_seqscan = true
enable_indexscan = true
enable_tidscan = true
enable_sort = true
enable_nestloop = true
enable_mergejoin = true
enable_hashjoin = true

effective_cache_size = 96000  # default in 8k pages


Those are all the uncommented lines. I found the values on the internet and
tested them, but there is no difference in performance between the old
configuration and this one.

> It is also useful to take a look at your postgresql.conf.
> Doing   select * from pg_stat_activity   shows the
> queries that are currently running on your server; you can then
> run EXPLAIN ANALYZE on them to see which one is the
> bottleneck. If you are running 7.4 you can also see the
> total duration of each query in the log.
>
I'll look into this tomorrow; right now it is too late and the performance
is fine again. It is only slow during those two hours with so many users on
the server.

Oh, and I can't update to 7.4. The chat doesn't run with the 7.4 libraries.


Re: Optimize

From
Dennis Bjorklund
Date:
On Mon, 24 Nov 2003, Torsten Schulz wrote:

> sort_mem = 32768             # min 32

32 MB per sort can be a lot in total if you have many clients sorting
things. I assume you have checked that the machine is not being pushed into
swapping during the peak with lots of users. A swapping computer is
never fast.

Using some swap space is not bad, but a lot of paging in and out of swap is
not good.
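
As a rough worst-case estimate (sort_mem is per sort operation, and a single
query can run more than one sort):

  350 concurrent users x 32 MB per sort  =  ~11 GB

on a machine with 1.5 GB of RAM, so even a fraction of that is enough to
start swapping.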

--
/Dennis


Re: Optimize

From
Gaetano Mendola
Date:
Torsten Schulz wrote:

> Gaetano Mendola wrote:
>
>> Torsten Schulz wrote:
>>
>>> Yes, I know: very difficult question, but I don't know what to do now.
>>>
>>> Our Server:
>>> Dual-CPU with 1.2 GHz
>>> 1.5 GB RAM
>>>
>>> Our Problem: We run a community site. Between 19:00 and 21:00 we have
>>> >350 users online. During that time the database is very slow, yet we
>>> still see ~20-30% idle time per CPU.
>>
>>
>>
>> May we know which Postgres version you are running, and can we
>> see the queries that run slowly?
>
>
> Postgres: 7.3.2
> Query: All queries
>
> Configuration:
> max_connections = 1000       # must be this high; below 500 we get connection errors
> shared_buffers = 5000        # 2*max_connections, min 16
> max_fsm_relations = 1000    # min 10, fsm is free space map
> max_fsm_pages = 2000000      # min 1000, fsm is free space map
> max_locks_per_transaction = 64 # min 10
> wal_buffers = 2000            # min 4
>
> sort_mem = 32768             # min 32
> vacuum_mem = 32768           # min 1024
>
> fsync = false
>
> enable_seqscan = true
> enable_indexscan = true
> enable_tidscan = true
> enable_sort = true
> enable_nestloop = true
> enable_mergejoin = true
> enable_hashjoin = true
>
> effective_cache_size = 96000  # default in 8k pages

With 500 connections at the same time, 32 MB for sort_mem can be too much.
What do "iostat 1" and "vmstat 1" say?

Also try reducing these costs:

random_page_cost = 2.5
cpu_tuple_cost = 0.005
cpu_index_tuple_cost = 0.0005
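
You can try these per session before touching postgresql.conf, for example
(then re-run one of the slow queries and compare the plans):

  SET random_page_cost = 2.5;
  SET cpu_tuple_cost = 0.005;
  SET cpu_index_tuple_cost = 0.0005;
  EXPLAIN ANALYZE ... ;         -- your slow query here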


BTW, take one of the queries and show us the result of EXPLAIN ANALYZE.


Regards
Gaetano Mendola










Re: Optimize

From
Neil Conway
Date:
Torsten Schulz <comic@be-a-part.de> writes:
> Our Server:
> Dual-CPU with 1.2 GHz
> 1.5 GB RAM

What kind of I/O subsystem is in this machine? This is an x86 machine,
right?

> Does anyone have an idea what the best configuration for that server would be?

It is difficult to say until you provide some information on the
system's state during periods of heavy traffic.

BTW, in addition to the machine's hardware configuration, have you
looked at tuning the queries running on PostgreSQL? What about the OS
kernel?

-Neil


design question: general db performance

From
shane hill
Date:
Hi folks,

Disclaimer:  I am relatively new to RDBMSs, so please do not laugh at me
too loudly,  you can laugh, just not too loudly and please do not point. :)

I am working on an Automated Installer Testing System for Adobe Systems
and I am doing a DB redesign of the current postgres db:

1. We are testing a matrix of over 900 Acrobat installer configurations
and we are tracking every file and registry entry that is affected by an
installation.

2. A single file or registry entry that is affected by any test is
stored in the db as a record.

3. A typical record is about 12 columns of string data. The data is all
information about a file (Mac or Windows) or Windows registry entry [file
or regkey name, file size, modification date, checksum, permissions, owner,
group, and in the case of a Mac we also collect all the HFS attributes].

4. A typical test produces anywhere from 2000 - 5000 records.


Our db is getting to be a respectable size (about 10GB right now) and is
growing slower and slower. I have been charged with making it faster and
with a smaller footprint while retaining all of the current
functionality.  Here is one of my ideas.  Please tell me if I am crazy:

The strings that we are storing (mentioned in 3 above) are extremely
repetitive. For example, there are a limited number of permissions for
the files in the Acrobat installer and we are storing this information
over and over again in the tables. The same goes for filenames, registry
key names and almost all of the data we are storing.  So it seems to me
that to create a smaller and faster database we could assign an integer
to each string and just store the integer representation of the string
rather than the string itself.  Then we would just store the strings in
a separate table one time and do join queries against the tables that
are holding the strings and the main data tables.  For example,

a table that would hold unique permissions strings would look like

table:  perms_strs

string         |  id
---------------+-----
'drwxr-xr-x'   |   1
'-rw-------'   |   2
'drwxrwxr-x'   |   3
'-rw-r--r--'   |   4

Then in my data I would just store 1, 2, 3 or 4 instead of the whole
permissions string.
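
In SQL terms, roughly this (just a sketch; the table and column names are
made up):

  create table perms_strs (
      id      serial primary key,
      string  text not null unique
  );

  create table file_records (
      perms_id  integer references perms_strs (id)
      -- ... plus the other ~12 columns ...
  );

  -- getting the readable permission string back out:
  select f.*, p.string as permissions
  from file_records f join perms_strs p on p.id = f.perms_id;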

It seems to me that we would save lots of space and, over time, not see
the same performance degradation.

Anyways, please tell me if this makes sense and make any other
suggestions that you can think of.  I am just now starting this analysis,
so I cannot give specifics as to where we are seeing poor performance just
yet; just tell me if my concepts are correct.

Thanks for your time and for suffering through this email.

chao,

-Shane


Re: design question: general db performance

From
Jord Tanner
Date:
[small chuckle]

By George, I think he's got it!

You are on the right track. Have a look at this link on database
normalization for more info:

http://databases.about.com/library/weekly/aa080501a.htm



On Tue, 2003-11-25 at 10:42, shane hill wrote:
> Hi folks,
>
> Disclaimer:  I am relatively new to RDBMSs, so please do not laugh at me
> too loudly,  you can laugh, just not too loudly and please do not point. :)
>

[snip]

--
Jord Tanner <jord@indygecko.com>


Re: design question: general db performance

From
Josh Berkus
Date:
Shane,

> Disclaimer:  I am relatively new to RDBMSs, so please do not laugh at me
> too loudly,  you can laugh, just not too loudly and please do not point. :)

Hey, we all started somewhere.  Nobody was born knowing databases.  Except
maybe Neil Conway.

> I am working on an Automated Installer Testing System for Adobe Systems
> and I am doing a DB redesign of the current postgres db:

Cool!    We're going to want to talk to you about a case study later, if you
can get your boss to authorize it ....

> Our db is getting to be a respectable size (about 10GB right now) and is
> growing slower and slower.

Slower and slower?   Hmmm ... what's your VACUUM, ANALYZE & REINDEX schedule?
What PG version?  What are your postgresql.conf settings?   Progressive
performance loss may indicate a problem with one or more of these things ...

> then in my data I would just store 1,2,3 or 4 instead of the whole
> permissions string.
>
> it seems to me that we would save lots of space and over time not see
> the same performance degradation.

Yes, this is a good idea.   Abstracting other repetitive data is good too.
Also keep in mind that the permissions themselves can be represented as octal
numbers instead of strings, which takes less space.
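
For instance, '-rw-r--r--' is octal mode 0644, so a smallint column (storing
420, the decimal value of 0644) could replace the 10-character string
entirely.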

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: design question: general db performance

From
Jeff
Date:
On Tue, 25 Nov 2003 10:42:47 -0800
shane hill <shill@adobe.com> wrote:

> Our db is getting to be a respectable size (about 10GB right now) and
> is growing slower and slower. I have been charged with making it
> faster and with a smaller footprint while retaining all of the current
> functionality.  here is one of my ideas.  Please tell me if I am
> crazy:
>

What exactly is it getting slower doing?

Have you run through the usual gamut of things to check - shared
buffers, vacuum/analyze, etc.?

What ver of PG?

What OS?

Can you post any schema/queries?

Normalizing can help. But I don't think it is going to be a magic
bullet that will make the DB instantly fast.  It will reduce its size,
though.

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/

Re: design question: general db performance

From
Richard Huxton
Date:
On Tuesday 25 November 2003 18:42, shane hill wrote:
>
> Our db is getting to be a respectable size (about 10GB right now) and is
> growing slower and slower. I have been charged with making it faster and
> with a smaller footprint while retaining all of the current
> functionality.  here is one of my ideas.  Please tell me if I am crazy:

Your idea of using an integer makes sense - that's how it is stored on unix
anyway.

Are you familiar with VACUUM/VACUUM FULL/REINDEX and when you should use them?
If not, that's a good place to start. Try a VACUUM FULL on frequently updated
tables and see if that reduces your disk size.
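
For example (the table name is just a placeholder):

  VACUUM ANALYZE file_records;    -- routine: marks dead space reusable, updates stats
  VACUUM FULL file_records;       -- heavier, locks the table, but shrinks the files on disk
  REINDEX TABLE file_records;     -- rebuilds bloated indexes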

You'll probably want to check the performance notes too:
http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php



--
  Richard Huxton
  Archonet Ltd

Re: Optimize

From
Torsten Schulz
Date:
Chester Kustarz wrote:

> On Mon, 24 Nov 2003, Torsten Schulz wrote:
>
>
>> shared_buffers = 5000        # 2*max_connections, min 16
>>
>
>
> that looks pretty small. that would only be 40MBytes (8k/page *
> 5000pages).
>
> http://www.varlena.com/GeneralBits/Tidbits/perf.html
>
>
>
OK, that's it. I've set it to 51200, and now it seems to be very fast.

Thank you!




Re: Optimize

From
Shridhar Daithankar
Date:
Torsten Schulz wrote:

> Chester Kustarz wrote:

>> On Mon, 24 Nov 2003, Torsten Schulz wrote:
>>> shared_buffers = 5000        # 2*max_connections, min 16
>> that looks pretty small. that would only be 40MBytes (8k/page *
>> 5000pages).
>> http://www.varlena.com/GeneralBits/Tidbits/perf.html
> Ok, thats it. I've set it to 51200, now it seems to be very fast.

Whoa, that is too much. You can still get better performance with something
lower, like 10,000 or even 5,000.

Bumping up shared_buffers stops being useful after a point, and beyond that
it actually degrades performance.
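
Rough numbers: 51200 buffers x 8 KB = 400 MB of shared buffers, more than a
quarter of your 1.5 GB of RAM, whereas 5,000-10,000 buffers is only 40-80 MB
and leaves the rest of the memory to the kernel's disk cache, which Postgres
relies on heavily.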

  Shridhar