Thread: Clarification on some settings

Clarification on some settings

From

Doug Y

Date:

11 May 2004, 19:02:12

Hello,
   I've been having some performance issues with a DB I use. I'm trying to
come up with some performance recommendations to send to the "adminstrator".

Hardware:
CPU0: Pentium III (Coppermine) 1000MHz (256k cache)
CPU1: Pentium III (Coppermine) 1000MHz (256k cache)
Memory: 3863468 kB (4 GB)
OS: Red Hat Linux release 7.2 (Enigma)
Kernel: 2.4.9-31smp
I/O I believe is a 3-disk raid 5.

/proc/sys/kernel/shmmax and /proc/sys/kernel/shmall were set to 2G

Postgres version: 7.3.4

I know its a bit dated, and upgrades are planned, but several months out.
Load average seems to hover between 1.0 and 5.0-ish during peak hours. CPU
seems to be the limiting factor but I'm not positive (cpu utilization seems
to be 40-50%). We have 2 of those set up as the back end to 3 web-servers
each... supposedly load-balanced, but one of the 2 dbs consistently has
higher load. We have a home-grown replication system that keeps them in
sync with each other... peer to peer (master/master).

The DB schema is, well to put it nicely... not exactly normalized. No
constraints to speak of except for the requisite not-nulls on the primary
keys (many of which are compound). Keys are mostly varchar(256) fields.

Ok for what I'm uncertain of...
shared_buffers:
According to http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html
Its more of a staging area and more isn't necessarily better. That psql
relies on the OS to cache data for later use.
But according to
http://www.ca.postgresql.org/docs/momjian/hw_performance/node3.html its
where psql caches previous data for queries because the OS cache is slower,
and should be as big as possible without causing swap.
Those seem to be conflicting statements. In our case, the "administrator"
kept increasing this until performance seemed to increase, which means its
now 250000 (x 8k is 2G).
Is this just a staging area for data waiting to move to the OS cache, or is
this really the area that psql caches its data?

effective_cache_size:
Again, according to the Varlena guide this tells psql how much system
memory is available for it to do its work in.
until recently, this was set at the default value of 1000. It was just
recently increased to 180000 (1.5G)
according to
http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html it
should be about 25% of memory?

Finally sort_mem:
Was until recently left at the default of 1000. Is now 16000.

Increasing the effective cache and sort mem didn't seem to make much of a
difference. I'm guessing the eff cache was probably raised a bit too much,
and shared_buffers is way to high.

What can I do to help determine what the proper settings should be and/or
look at other possible choke points. What should I look for in iostat,
mpstat, or vmstat as red flags that cpu, memory, or i/o bound?

DB maintenance wise, I don't believe they were running vacuum full until I
told them a few months ago that regular vacuum analyze no longer cleans out
dead tuples. Now normal vac is run daily, vac full weekly (supposedly). How
can I tell from the output of vacuum if the vac fulls aren't being done, or
not done often enough? Or from the system tables, what can I read?

Is there anywhere else I can look for possible clues? I have access to the
DB super-user, but not the system root/user.

Thank you for your time. Please let me know any help or suggestions you may
have. Unfortunately upgrading postgres, OS, kernel, or re-writing schema is
most likely not an option.

Re: Clarification on some settings

From

Shridhar Daithankar

Date:

12 May 2004, 10:44:16

Doug Y wrote:

> Hello,
>   I've been having some performance issues with a DB I use. I'm trying
> to come up with some performance recommendations to send to the
> "adminstrator".
>
> Hardware:
> CPU0: Pentium III (Coppermine) 1000MHz (256k cache)
> CPU1: Pentium III (Coppermine) 1000MHz (256k cache)
> Memory: 3863468 kB (4 GB)
> OS: Red Hat Linux release 7.2 (Enigma)
> Kernel: 2.4.9-31smp
> I/O I believe is a 3-disk raid 5.
>
> /proc/sys/kernel/shmmax and /proc/sys/kernel/shmall were set to 2G
>
> Postgres version: 7.3.4
  > The DB schema is, well to put it nicely... not exactly normalized. No
> constraints to speak of except for the requisite not-nulls on the
> primary keys (many of which are compound). Keys are mostly varchar(256)
> fields.
>
> Ok for what I'm uncertain of...
> shared_buffers:
> According to http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html
> Its more of a staging area and more isn't necessarily better. That psql
> relies on the OS to cache data for later use.
> But according to
> http://www.ca.postgresql.org/docs/momjian/hw_performance/node3.html its
> where psql caches previous data for queries because the OS cache is
> slower, and should be as big as possible without causing swap.
> Those seem to be conflicting statements. In our case, the
> "administrator" kept increasing this until performance seemed to
> increase, which means its now 250000 (x 8k is 2G).
> Is this just a staging area for data waiting to move to the OS cache, or
> is this really the area that psql caches its data?

It is the area where postgresql works. It updates data in this area and pushes
it to OS cache for disk writes later.

By experience, larger does not mean better for this parameter. For multi-Gig RAM
machines, the best(on an average for wide variety of load) value found to be
around 10000-15000. May be even lower.

It is a well known fact that raising this parameter unnecessarily decreases the
performance. You indicate that best performance occurred at 250000. This is very
very large compared to other people's experience.
>
> effective_cache_size:
> Again, according to the Varlena guide this tells psql how much system
> memory is available for it to do its work in.
> until recently, this was set at the default value of 1000. It was just
> recently increased to 180000 (1.5G)
> according to
> http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html
> it should be about 25% of memory?

No rule of thumb. It is amount of memory OS will dedicate to psotgresql data
buffers. Depending uponn what else you run on machine, it could be
straight-forward or noodly value to calculate. For a 4GB machine, 1.5GB is quite
good but coupled with 2G of shared buffers it could push the machines to swap
storm. And swapping shared buffers is a big performance hit.

>
> Finally sort_mem:
> Was until recently left at the default of 1000. Is now 16000.

Sort memory is per sort not per query or per connection. So depending upon how
many concurrent connections you entertain, it could take quite a chuck of RAM.
>
> Increasing the effective cache and sort mem didn't seem to make much of
> a difference. I'm guessing the eff cache was probably raised a bit too
> much, and shared_buffers is way to high.

I agree. For shared buffers start with 5000 and increase in batches on 1000. Or
set it to a high value and check with ipcs for maximum shared memory usage. If
share memory usage peaks at 100MB, you don't need more than say 120MB of buffers.

>
> What can I do to help determine what the proper settings should be
> and/or look at other possible choke points. What should I look for in
> iostat, mpstat, or vmstat as red flags that cpu, memory, or i/o bound?

Yes. vmstat is usually a lot of help to locate the bottelneck.

> DB maintenance wise, I don't believe they were running vacuum full until
> I told them a few months ago that regular vacuum analyze no longer
> cleans out dead tuples. Now normal vac is run daily, vac full weekly
> (supposedly). How can I tell from the output of vacuum if the vac fulls
> aren't being done, or not done often enough? Or from the system tables,
> what can I read?

In 7.4 you can do vacuum full verbose and it will tell you the stats at the end.
For 7.3.x, its not there.

I suggest you vacuum full database once.(For large database, dumping restoring
might work faster. Dump/restore and vacuum full both lock the database
exclusively i.e. downtime. So I guess faster the better for you. But there is no
tool/guideline to determine which way to go.)

> Is there anywhere else I can look for possible clues? I have access to
> the DB super-user, but not the system root/user.

Other than hardware tuning, find out slow/frequent queries. Use explain analyze
to determine why they are so slow. Forgetting to typecast a where clause and
using sequential scan could cost you lot more than mistuned postgresql
configuration.

> Thank you for your time. Please let me know any help or suggestions you
> may have. Unfortunately upgrading postgres, OS, kernel, or re-writing
> schema is most likely not an option.

I hope you can change your queries.

HTH

  Shridhar

Re: Clarification on some settings

From

Neil Conway

Date:

13 May 2004, 10:02:39

On Wed, 2004-05-12 at 05:02, Shridhar Daithankar wrote:
> I agree. For shared buffers start with 5000 and increase in batches on 1000. Or
> set it to a high value and check with ipcs for maximum shared memory usage. If
> share memory usage peaks at 100MB, you don't need more than say 120MB of buffers.

If your DB touches more than 100MB worth of buffers over time, shared
memory consumption won't peak at 100MB. PG shared buffers are only
"recycled" when there are no unused buffers available, so this isn't a
really valid way to determine the right shared_buffers setting.

-Neil

Re: Clarification on some settings

From

Mark Kirkwood

Date:

13 May 2004, 10:57:17

Note that effective_cache_size is merely a hint to that planner to say
"I have this much os buffer cache to use" - it is not actually allocated.

It is shared_buffers that will hurt you if it is too high (10000 - 25000
is the usual sweet spot).

best wishes

Mark

Shridhar Daithankar wrote:

>
>>
>> Increasing the effective cache and sort mem didn't seem to make much
>> of a difference. I'm guessing the eff cache was probably raised a bit
>> too much, and shared_buffers is way to high.
>
>
> I agree. For shared buffers start with 5000 and increase in batches on
> 1000. Or set it to a high value and check with ipcs for maximum shared
> memory usage. If share memory usage peaks at 100MB, you don't need
> more than say 120MB of buffers.
>
>
>

Re: Clarification on some settings

From

Doug Y

Date:

13 May 2004, 16:42:52

(Sorry if this ends up being a duplicate post, I sent a reply yesterday,
but it doesn't appear to have gone through... I think I typo'd the address
but never got a bounce.)

Hi,
   Thanks for your initial help. I have some more questions below.

At 05:02 AM 5/12/2004, Shridhar Daithankar wrote:
>Doug Y wrote:
>
>>Hello,
>>   I've been having some performance issues with a DB I use. I'm trying
>> to come up with some performance recommendations to send to the "adminstrator".
>>
>>Ok for what I'm uncertain of...
>>shared_buffers:
>>According to http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html
>>Its more of a staging area and more isn't necessarily better. That psql
>>relies on the OS to cache data for later use.
>>But according to
>>http://www.ca.postgresql.org/docs/momjian/hw_performance/node3.html its
>>where psql caches previous data for queries because the OS cache is
>>slower, and should be as big as possible without causing swap.
>>Those seem to be conflicting statements. In our case, the "administrator"
>>kept increasing this until performance seemed to increase, which means
>>its now 250000 (x 8k is 2G).
>>Is this just a staging area for data waiting to move to the OS cache, or
>>is this really the area that psql caches its data?
>
>It is the area where postgresql works. It updates data in this area and
>pushes it to OS cache for disk writes later.
>
>By experience, larger does not mean better for this parameter. For
>multi-Gig RAM machines, the best(on an average for wide variety of load)
>value found to be around 10000-15000. May be even lower.
>
>It is a well known fact that raising this parameter unnecessarily
>decreases the performance. You indicate that best performance occurred at
>250000. This is very very large compared to other people's experience.

Ok. I think I understand a bit better now.

>>effective_cache_size:
>>Again, according to the Varlena guide this tells psql how much system
>>memory is available for it to do its work in.
>>until recently, this was set at the default value of 1000. It was just
>>recently increased to 180000 (1.5G)
>>according to
>>http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html
>>it should be about 25% of memory?
>
>No rule of thumb. It is amount of memory OS will dedicate to psotgresql
>data buffers. Depending uponn what else you run on machine, it could be
>straight-forward or noodly value to calculate. For a 4GB machine, 1.5GB is
>quite good but coupled with 2G of shared buffers it could push the
>machines to swap storm. And swapping shared buffers is a big performance hit.

We don't seem to be swapping much:

# top

   2:21pm  up 236 days, 19:12,  1 user,  load average: 1.45, 1.09, 1.00
53 processes: 51 sleeping, 2 running, 0 zombie, 0 stopped
CPU0 states: 30.3% user,  9.1% system,  0.0% nice, 60.0% idle
CPU1 states: 32.0% user,  9.3% system,  0.0% nice, 58.1% idle
Mem:  3863468K av, 3845844K used,   17624K free, 2035472K shrd,  198340K buff
Swap: 1052248K av,    1092K used, 1051156K free                 1465112K cached

looks like at some point it did swap a little, but from running vmstat, I
can't seem to catch it actively swapping.

>>Finally sort_mem:
>>Was until recently left at the default of 1000. Is now 16000.
>
>Sort memory is per sort not per query or per connection. So depending upon
>how many concurrent connections you entertain, it could take quite a chuck
>of RAM.

Right I understand that. How does one calculate the size of a sort? Rows *
width from an explain?

>>Increasing the effective cache and sort mem didn't seem to make much of a
>>difference. I'm guessing the eff cache was probably raised a bit too
>>much, and shared_buffers is way to high.
>
>I agree. For shared buffers start with 5000 and increase in batches on
>1000. Or set it to a high value and check with ipcs for maximum shared
>memory usage. If share memory usage peaks at 100MB, you don't need more
>than say 120MB of buffers.

My results from ipcs seems confusing... says its using the full 2G of
shared cache:

# ipcs

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x0052e2c1 6389760    postgres  600        2088370176 4

------ Semaphore Arrays --------
key        semid      owner      perms      nsems      status
0x0052e2c1 424378368  postgres  600        17
0x0052e2c2 424411137  postgres  600        17
0x0052e2c3 424443906  postgres  600        17
0x0052e2c4 424476675  postgres  600        17
0x0052e2c5 424509444  postgres  600        17
0x0052e2c6 424542213  postgres  600        17
0x0052e2c7 424574982  postgres  600        17
0x0052e2c8 424607751  postgres  600        17
0x0052e2c9 424640520  postgres  600        17
0x0052e2ca 424673289  postgres  600        17
0x0052e2cb 424706058  postgres  600        17
0x0052e2cc 424738827  postgres  600        17
0x0052e2cd 424771596  postgres  600        17
0x0052e2ce 424804365  postgres  600        17
0x0052e2cf 424837134  postgres  600        17
0x0052e2d0 424869903  postgres  600        17
0x0052e2d1 424902672  postgres  600        17
0x00018d45 505544721  root      777        1

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages


>>What can I do to help determine what the proper settings should be and/or
>>look at other possible choke points. What should I look for in iostat,
>>mpstat, or vmstat as red flags that cpu, memory, or i/o bound?
>
>Yes. vmstat is usually a lot of help to locate the bottelneck.

What would I be looking for here?

# vmstat 2 10
    procs                      memory    swap          io     system
  cpu
  r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us
sy  id
  0  0  0   1092  14780 198120 1467164   0   0     0    0   0     0   0  0   0
  0  0  0   1092  19488 198120
1467204   0   0     0    0  240   564  11   5  84
  0  0  0   1092  19520 198120
1467300   0   0     0   210  443  1094  29   8  63
  0  0  0   1092  15832 198120
1467356   0   0     4   110  368  1455  27   5  68
  3  0  0   1092  10956 198120
1467464   0   0     4   336  417  1679  33  10  57
  1  0  0   1092  17840 198124
1465980   0   0  200   334  581  1914  63  14  23
  1  0  0   1092  16556 198124
1466012   0   0     0   226  397  1069  30   4  66
  0  0  0   1092  19096 198124
1466028   0   0     0   160  230   314  12   2  86
  2  0  1   1092  16100 198128
1466748   0   0    28  1484  711  1578  23  12  65
  0  0  0   1092  20140 198128
1466780   0   0     0   414  291   746  15   8  77

I'm guessing what I should look at is the io: bi & bo ? when I run some
particularly large queries I see bo activity so I'm speculating that that
means its reading pages from disk, correct?

>>DB maintenance wise, I don't believe they were running vacuum full until
>>I told them a few months ago that regular vacuum analyze no longer cleans
>>out dead tuples. Now normal vac is run daily, vac full weekly
>>(supposedly). How can I tell from the output of vacuum if the vac fulls
>>aren't being done, or not done often enough? Or from the system tables,
>>what can I read?
>
>In 7.4 you can do vacuum full verbose and it will tell you the stats at
>the end. For 7.3.x, its not there.
>
>I suggest you vacuum full database once.(For large database, dumping
>restoring might work faster. Dump/restore and vacuum full both lock the
>database exclusively i.e. downtime. So I guess faster the better for you.
>But there is no tool/guideline to determine which way to go.)

Ok they had not done a full vacuum in a long time. I them run vacuumdb
--full --analyze --verbose and dump it into a file. What should I look for
to see if it was useful?

for example:
INFO:  Pages 118200: Changed 74, reaped 117525, Empty 0, New 0; Tup 575298:
Vac 11006, Keep/VTL 0/0, UnUsed 2454159, MinLen 68, MaxLen 1911; Re-using:
Free/Avai
l. Space 774122944/774122944; EndEmpty/Avail. Pages 0/118200.
         CPU 9.41s/1.33u sec elapsed 97.35 sec.

Is there any documentation on what those numbers represent?

Also do we need to use REINDEX on the indexes, or does vacuum full take
case of that?


>>Is there anywhere else I can look for possible clues? I have access to
>>the DB super-user, but not the system root/user.
>
>Other than hardware tuning, find out slow/frequent queries. Use explain
>analyze to determine why they are so slow. Forgetting to typecast a where
>clause and using sequential scan could cost you lot more than mistuned
>postgresql configuration.

Right. One example I can think of is one particular query takes about 120
seconds to run (explain analyze), but if I set enable_seqscan to off, it
takes about 10 seconds.

>>Thank you for your time. Please let me know any help or suggestions you
>>may have. Unfortunately upgrading postgres, OS, kernel, or re-writing
>>schema is most likely not an option.
>
>I hope you can change your queries.

For the most part we're not having too much trouble, just some newer
queries were building for some new features is what we're seeing trouble with.


>HTH
>
>  Shridhar

Re: Clarification on some settings

From

Greg Copeland

Date:

13 May 2004, 17:28:35

On Thu, 2004-05-13 at 14:42, Doug Y wrote:

> We don't seem to be swapping much:
>

Linux aggressively swaps.  If you have any process in memory which is
sleeping a lot, Linux may actively attempt to page it out.  This is true
even when you are not low on memory.  Just because you see some swap
space being used, does not mean that your actively running processes are
causing your system to swap.

I didn't catch what kernel version you are running, so I'm tossing this
out there.  Depending on the kernel (I believe 2.6+, but there may be
something like it in older kernels) that you are running, you can
attempt to tune this buy setting a value of 0-100 in
/proc/sys/vm/swappiness.  The higher the number, the more aggressive the
kernel will attempt to swap.  Some misc. kernel patches attempt to
dynamically tune this parameter.

For a dedicated DB server, a higher number will probably be better.
This is because it should result in the most cache being available to
the system.  This, of course means, you may have to wait an tad bit long
when you ssh into the system, assuming sshd was swapped out.  I think
you get the idea.

> Swap: 1052248K av,    1092K used, 1051156K free                 1465112K cached
>
> looks like at some point it did swap a little, but from running vmstat, I
> can't seem to catch it actively swapping.
>

Chances are, you have some dormant process which is partially or
completely paged out.

For an interesting read on Linux and swapping, you can find out more
here:  http://kerneltrap.org/node/view/3080.

Cheers!

--
Greg Copeland, Owner
greg@copelandconsulting.net
Copeland Computer Consulting
940.206.8004