Thread: Optimization

Optimization

From
"Justin Long"
Date:
Greetings,
 
I am trying to understand the various factors used by Postgres to optimize. I presently have a dual-866 Dell server with 1GB of memory. I've done the following:
 
set /proc/sys/kernel/shmmax to 512000000
shared_buffers = 32000
sort_mem = 32000
max_connections=64
fsync=false
Can someone tell me what effective_cache_size should be set to? what kind of formula to use for this? (I got the other figures from phpbuilder.com, and modified for 512k memory).
 
The databases I'm using have about 200,000+ news headlines with full-text indexes (which range upwards of a few million records). They are updated about every 5 to 10 minutes, which means I also have to run a vacuum about once every 2 to 3 hours at least. As I get more updates obviously the efficiency goes down. I'm trying to make the most of this system but don't fully understand PG's optimization stuff.
 
Thanks in advance,
Justin Long
 

Re: Optimization

From
"Nick Fankhauser"
Date:
Justin-

It sounds like you're on a system similar to ours, so I'll pass along the
changes that I made, which seem to have increased performance, and most
importantly, haven't hurt anything. The main difference in our environment
is that we are less Update/Insert intensive than you are- in our
application, 90% of our information (court cases) is static (Closed) and 10%
are frequently being updated (Pending/Active). This means I only vacuum once
a week. I haven't had chance to churn out objective tests yet, but my
subjective judgment is that this set of params works well:

Set SHMMAX and SHMALL in the kernel to 134217728 (128MB)
Set shared_buffers to 8192 (64MB)
Set sort_mem to 16384 (16MB)
Set effective_cache_size to 65536 (1/2 GB)

The Hardware is a dual-processor Athlon 1.2 Ghz box with 1 GB of RAM and the
DB on SCSI RAID drives.

The database size is about 8GB, with the largest table 2.5 GB, and the two
most commonly queried tables at 1 GB each.

The OS is Debian Linux kernel 2.4.x (recompiled custom kernel for dual
processor support)

The PostgreSQL version is 7.3.2

My reasoning was to increase shared_buffers based on anecdotal
recommendations I've seen on this list to 64MB and boost the OS SHMMAX to
twice that value to allow adequate room for other shared memory needs, thus
reserving 128MB off the top. Of the remaining memory, 256MB goes to 16MB
sort space times a guesstimate of 16 simultaneous sorts at any given time.
If I leave about 128 MB for headroom, then 1/2 GB should be left available
for the effective cache size.

I've never been tempted to turn fsync off. That seems like a risky move.

Regards,
   -Nick
---------------------------------------------------------------------
Nick Fankhauser
nickf@doxpop.com Phone 1.765.965.7363 Fax 1.765.962.9788
doxpop - Court records at your fingertips - http://www.doxpop.com/


Re: Optimization

From
"Nick Fankhauser - Doxpop"
Date:
Justin-

It sounds like you're on a system similar to ours, so I'll pass along the
changes that I made, which seem to have increased performance, and most
importantly, haven't hurt anything. The main difference in our environment
is that we are less Update/Insert intensive than you are- in our
application, 90% of our information (court cases) is static (Closed) and 10%
are frequently being updated (Pending/Active). This means I only vacuum once
a week. I haven't had chance to churn out objective tests yet, but my
subjective judgment is that this set of params works well:

Set SHMMAX and SHMALL in the kernel to 134217728 (128MB)
Set shared_buffers to 8192 (64MB)
Set sort_mem to 16384 (16MB)
Set effective_cache_size to 65536 (1/2 GB)


The Hardware is a dual-processor Athlon 1.2 Ghz box with 1 GB of RAM and the
DB on SCSI RAID drives.

The database size is about 8GB, with the largest table 2.5 GB, and the two
most commonly queried tables at 1 GB each.

The OS is Debian Linux kernel 2.4.x (recompiled custom kernel for dual
processor support)
The PostgreSQL version is 7.3.2

My reasoning was to increase shared_buffers based on anecdotal
recommendations I've seen on this list to 64MB and boost the OS SHMMAX to
twice that value to allow adequate room for other shared memory needs, thus
reserving 128MB. Of the remaining memory, 256MB goes to 16MB sort space
times
a guesstimate of 16 simultaneous sorts at any given time. If I leave about
128 MB for headroom, then 1/2 GB should be left available for the effective
cache size.

I've never been tempted to turn fsync off. That seems like a risky move.

Regards,
     -Nick

---------------------------------------------------------------------
Nick Fankhauser

    nickf@doxpop.com  Phone 1.765.965.7363  Fax 1.765.962.9788
doxpop - Court records at your fingertips - http://www.doxpop.com/


Re: Optimization

From
"Nick Fankhauser"
Date:

>Can someone tell me what effective_cache_size should be set to?

You may be able to intuit this from my last post, but if I understand
correctly, what you should be doing is estimating how much memory is likely
to be "left over" for the OS to do disk caching with after all of the basic
needs of the OS, PostgreSQL & any other applications are taken care of. You
then tell postgresql what to expect in terms of caching resources by putting
this number into effective_cache_size, and this allows the query planner
come up with a strategy that is optimized for the expected cache size.

So the "formula" would be: Figure out how much memory is normally in use
allowing adequate margins, subtract this from your total RAM, and make the
remainder your effective_cache size.


-Nick


Re: Optimization

From
Josh Berkus
Date:
Justin,

> I am trying to understand the various factors used by Postgres to optimize.
I presently have a dual-866 Dell server with 1GB of memory. I've done the
following:

Please set the performance articles at:
http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: Optimization

From
Josh Berkus
Date:
Justin,

> I am trying to understand the various factors used by Postgres to optimize.
I presently have a dual-866 Dell server with 1GB of memory. I've done the
following:

see: http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php
which has articles on .conf files.
(feel free to link these articles at PHPbuilder.com and elsewhere!)

> The databases I'm using have about 200,000+ news headlines with full-text
indexes (which range upwards of a few million records). They are updated
about every 5 to 10 minutes, which means I also have to run a vacuum about
once every 2 to 3 hours at least. As I get more updates obviously the
efficiency goes down. I'm trying to make the most of this system but don't
fully understand PG's optimization stuff.

Unless you're running PostgreSQL 7.1 or earlier, you should be VACUUMing every
10-15 minutes, not every 2-3 hours.   Regular VACUUM does not lock your
database.  You will also want to increase your FSM_relations so that VACUUM
is more effective/efficient; again, see the articles.

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: Optimization

From
"Shridhar Daithankar"
Date:
On 28 Jul 2003 at 12:27, Josh Berkus wrote:
> Unless you're running PostgreSQL 7.1 or earlier, you should be VACUUMing every
> 10-15 minutes, not every 2-3 hours.   Regular VACUUM does not lock your
> database.  You will also want to increase your FSM_relations so that VACUUM
> is more effective/efficient; again, see the articles.

There is an auto-vacuum daemon in contrib and if I understand it correctly, it
is not getting much of a field testing. How about you guys installing it and
trying it?

Bye
 Shridhar

--
O'Reilly's Law of the Kitchen:    Cleanliness is next to impossible


Re: Optimization

From
Peter Childs
Date:
On Tue, 29 Jul 2003, Shridhar Daithankar wrote:

> On 28 Jul 2003 at 12:27, Josh Berkus wrote:
> > Unless you're running PostgreSQL 7.1 or earlier, you should be VACUUMing every
> > 10-15 minutes, not every 2-3 hours.   Regular VACUUM does not lock your
> > database.  You will also want to increase your FSM_relations so that VACUUM
> > is more effective/efficient; again, see the articles.
>
> There is an auto-vacuum daemon in contrib and if I understand it correctly, it
> is not getting much of a field testing. How about you guys installing it and
> trying it?

    If there is such a daemon, what is it called? As I can't see it.
Is it part of gborg?

Peter Childs


Re: Optimization

From
"Shridhar Daithankar"
Date:
On 29 Jul 2003 at 8:14, Peter Childs wrote:

> On Tue, 29 Jul 2003, Shridhar Daithankar wrote:
>
> > On 28 Jul 2003 at 12:27, Josh Berkus wrote:
> > > Unless you're running PostgreSQL 7.1 or earlier, you should be VACUUMing every
> > > 10-15 minutes, not every 2-3 hours.   Regular VACUUM does not lock your
> > > database.  You will also want to increase your FSM_relations so that VACUUM
> > > is more effective/efficient; again, see the articles.
> >
> > There is an auto-vacuum daemon in contrib and if I understand it correctly, it
> > is not getting much of a field testing. How about you guys installing it and
> > trying it?
>
>     If there is such a daemon, what is it called? As I can't see it.
> Is it part of gborg?

It is in sources. See contrib module in postgresql CVS, 7.4 beta if you prefer
to wait till announement.

It is called as pgavd..

Bye
 Shridhar

--
squatcho, n.:    The button at the top of a baseball cap.        -- "Sniglets", Rich
Hall & Friends