Thread: shared_buffers/effective_cache_size on 96GB server

From: Strahinja Kustudić
Hi everyone,

I have a dedicated PostgreSQL 9.1 server with 16 cores, 96GB of RAM, and RAID10 15K SCSI drives, running CentOS 6.2 x64. This server is mainly used for inserting/updating large amounts of data via COPY/INSERT/UPDATE commands, and seldom for running SELECT queries.

Here are the relevant configuration parameters I changed:

shared_buffers = 10GB
effective_cache_size = 90GB
work_mem = 32MB
maintenance_work_mem = 512MB
checkpoint_segments = 64
checkpoint_completion_target = 0.8

My biggest concerns are shared_buffers and effective_cache_size: should I increase shared_buffers and decrease effective_cache_size? I read that values above 10GB for shared_buffers give lower performance than smaller amounts?

free is currently reporting (during the loading of data):

$ free -m
             total       used       free     shared    buffers     cached
Mem:         96730      96418        311          0         71      93120
-/+ buffers/cache:       3227      93502
Swap:        21000         51      20949

So it did a little swapping, but only a minor amount; still, I should probably decrease shared_buffers so there is no swapping at all.

Thanks in advance,
Strahinja

Re: shared_buffers/effective_cache_size on 96GB server

From: Strahinja Kustudić
Hm, I just noticed that shared_buffers + effective_cache_size = 100GB > 96GB, which can't be right. effective_cache_size should probably be 80GB.

Strahinja Kustudić | System Engineer | Nordeus




Re: shared_buffers/effective_cache_size on 96GB server

From: Tomas Vondra
On 10.10.2012 09:12, Strahinja Kustudić wrote:
> Hi everyone,
>
> I have a dedicated PostgreSQL 9.1 server with 16 cores, 96GB RAM and
> RAID10 15K SCSI drives, running CentOS 6.2 x64. This server is
> mainly used for inserting/updating large amounts of data via
> COPY/INSERT/UPDATE commands, and seldom for running SELECT queries.
>
> Here are the relevant configuration parameters I changed:
>
> shared_buffers = 10GB
> effective_cache_size = 90GB
> work_mem = 32MB
> maintenance_work_mem = 512MB
> checkpoint_segments = 64
> checkpoint_completion_target = 0.8
>
> My biggest concerns are shared_buffers and effective_cache_size, should I
> increase shared_buffers and decrease effective_cache_size? I read that
> values above 10GB for shared_buffers give lower performance than
> smaller amounts?
>
> free is currently reporting (during the loading of data):
>
> $ free -m
>              total       used       free     shared    buffers     cached
> Mem:         96730      96418        311          0         71      93120
> -/+ buffers/cache:       3227      93502
> Swap:        21000         51      20949
>
> So it did a little swapping, but only a minor amount; still, I should
> probably decrease shared_buffers so there is no swapping at all.

That's hardly caused by shared buffers. The main point is that
effective_cache_size is just a hint to the optimizer about how much cache
(shared buffers + page cache) to expect; PostgreSQL is not going to
allocate 100GB of RAM because of it.

What have you set the main /proc/sys/vm/ parameters to? Mainly these three:

/proc/sys/vm/swappiness
/proc/sys/vm/overcommit_memory
/proc/sys/vm/overcommit_ratio
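
They can be checked and changed with sysctl; to make a change stick across reboots, put it in /etc/sysctl.conf. A quick sketch (the value below is only an example, not a recommendation):

# sysctl vm.swappiness vm.overcommit_memory vm.overcommit_ratio
# echo "vm.swappiness = 0" >> /etc/sysctl.conf
# sysctl -p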


Tomas


Re: shared_buffers/effective_cache_size on 96GB server

From: Albe Laurenz
Strahinja Kustudić wrote:
>> I have a dedicated PostgreSQL 9.1 server with 16 cores, 96GB RAM and RAID10 15K SCSI drives,
>> running CentOS 6.2 x64. This server is mainly used for inserting/updating large amounts of
>> data via COPY/INSERT/UPDATE commands, and seldom for running SELECT queries.
>> 
>> Here are the relevant configuration parameters I changed:
>> 
>> shared_buffers = 10GB
>> effective_cache_size = 90GB
>> work_mem = 32MB
>> maintenance_work_mem = 512MB
>> checkpoint_segments = 64
>> checkpoint_completion_target = 0.8
>> 
>> My biggest concerns are shared_buffers and effective_cache_size, should I increase shared_buffers
>> and decrease effective_cache_size? I read that values above 10GB for shared_buffers give lower
>> performance than smaller amounts?
>> 
>> free is currently reporting (during the loading of data):
>> 
>> $ free -m
>>              total       used       free     shared    buffers     cached
>> Mem:         96730      96418        311          0         71      93120
>> -/+ buffers/cache:       3227      93502
>> Swap:        21000         51      20949
>> 
>> So it did a little swapping, but only a minor amount; still, I should probably decrease shared_buffers
>> so there is no swapping at all.

> Hm, I just noticed that shared_buffers + effective_cache_size = 100GB > 96GB, which can't be right.
> effective_cache_size should probably be 80GB.

I think you misunderstood effective_cache_size.
It does not influence memory usage, but query planning.
It gives the planner an idea of how much memory there is for caching
data, including the filesystem cache.

So a good value for effective_cache_size would be
total memory minus what the OS and others need minus what private
memory the PostgreSQL backends need.
The latter can be estimated as work_mem times max_connections.
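
For example, on this machine, assuming the OS and other software need about 4GB and max_connections is at a typical 100 (both numbers are assumptions you'd have to adjust):

96GB - 4GB - 100 * 32MB (work_mem) = ~89GB

so something in the 80-90GB range would be a sensible upper bound for effective_cache_size.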

To avoid swapping, consider setting vm.swappiness to 0 in
/etc/sysctl.conf.

10GB of shared_buffers is quite a lot.
If you can run realistic performance tests, start with a lower value
and increase until you cannot see a notable improvement.

Yours,
Laurenz Albe

Re: shared_buffers/effective_cache_size on 96GB server

From: Julien Cigar
On 10/10/2012 09:12, Strahinja Kustudić wrote:
> Hi everyone,

Hello,

>
> I have a dedicated PostgreSQL 9.1 server with 16 cores, 96GB RAM and
> RAID10 15K SCSI drives, running CentOS 6.2 x64. This server is
> mainly used for inserting/updating large amounts of data via
> COPY/INSERT/UPDATE commands, and seldom for running SELECT queries.
>
> Here are the relevant configuration parameters I changed:
>
> shared_buffers = 10GB

Generally, going over 4GB for shared_buffers doesn't help; some of the
overhead of the bgwriter and checkpoints is more or less linear in the size
of shared_buffers.

> effective_cache_size = 90GB

effective_cache_size should be ~75% of the RAM (if it's a dedicated server)

> work_mem = 32MB

With 96GB of RAM I would raise the default work_mem to something like 128MB.

> maintenance_work_mem = 512MB

Again, with 96GB of RAM you can raise maintenance_work_mem to something
like 4GB.

> checkpoint_segments = 64
> checkpoint_completion_target = 0.8
>
> My biggest concerns are shared_buffers and effective_cache_size, should
> I increase shared_buffers and decrease effective_cache_size? I read
> that values above 10GB for shared_buffers give lower performance than
> smaller amounts?
>
> free is currently reporting (during the loading of data):
>
> $ free -m
>              total       used       free     shared    buffers     cached
> Mem:         96730      96418        311          0         71      93120
> -/+ buffers/cache:       3227      93502
> Swap:        21000         51      20949
>
> So it did a little swapping, but only a minor amount; still, I should
> probably decrease shared_buffers so there is no swapping at all.
>
> Thanks in advance,
> Strahinja

Julien


--
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.



Re: shared_buffers/effective_cache_size on 96GB server

From: Strahinja Kustudić
Thanks for the very fast replies, everyone :)

@Laurenz I know that effective_cache_size is only used by the query planner. What I was saying is that if I tell it that it can have 90GB of cached items, that is not true, since the OS and the Postgres processes themselves can take more than 6GB, which would mean 90GB is not the correct value; but if effective_cache_size should be shared_buffers + page cache, as Tomas said, then 90GB won't be a problem.


@Tomas here are the values:

# cat /proc/sys/vm/swappiness
60
# cat  /proc/sys/vm/overcommit_memory
0
# cat  /proc/sys/vm/overcommit_ratio
50

I will turn off swappiness; I was meaning to do that. I don't know much about the overcommit settings, though, so I will read up on what they do.


@Julien thanks for the suggestions, I will tweak them like you suggested.

Strahinja Kustudić | System Engineer | Nordeus





Re: shared_buffers/effective_cache_size on 96GB server

From: Julien Cigar
On 10/10/2012 10:30, Strahinja Kustudić wrote:
> [...]
> @Julien thanks for the suggestions, I will tweak them like you suggested.

Also, with 15K SCSI drives you can reduce random_page_cost to 3.5 (instead of 4.0).
I also recommend raising cpu_tuple_cost to 0.05 (instead of 0.01), setting vm.swappiness to 0, vm.overcommit_memory to 2, and finally raising the read-ahead (to something like 8192).
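
For the read-ahead, something like this, assuming /dev/sda is the device backing your data directory (adjust to your layout); note the value is in 512-byte sectors, so 8192 means 4MB:

# blockdev --getra /dev/sda
# blockdev --setra 8192 /dev/sda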

-- 
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.

Re: shared_buffers/effective_cache_size on 96GB server

From: Strahinja Kustudić
Thanks for your help everyone.

I set:
shared_buffers = 4GB
effective_cache_size = 72GB
work_mem = 128MB
maintenance_work_mem = 4GB
checkpoint_segments = 64
checkpoint_completion_target = 0.9
random_page_cost = 3.5
cpu_tuple_cost = 0.05

Where can I find values for random_page_cost and cpu_tuple_cost, since they depend on the hardware? I know that for SSDs random_page_cost should be 1.0, but I have no idea what the value should be for different types of drives.

I also set:
vm.swappiness = 0
vm.overcommit_memory = 2
vm.overcommit_ratio = 50

But I don't understand why I need to set overcommit_memory, since I only have Postgres running; nothing else would allocate memory anyway.
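
If I understand the overcommit documentation correctly, with overcommit_memory = 2 the kernel caps allocations at swap + overcommit_ratio percent of RAM, which here would be roughly

21GB + 0.5 * 96GB = 69GB

so after changing it I should probably check CommitLimit in /proc/meminfo to make sure that cap isn't too low.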

I will set the read-ahead later; first I want to see how this works.

Strahinja Kustudić | System Engineer | Nordeus




Re: shared_buffers/effective_cache_size on 96GB server

From: Shaun Thomas
On 10/10/2012 02:12 AM, Strahinja Kustudić wrote:

>           total    used   free  shared   buffers     cached
> Mem:      96730   96418    311       0        71      93120

Wow, look at all that RAM. Something nobody has mentioned yet: you'll
want to set some additional kernel parameters for this, to avoid
occasional IO storms caused by dirty memory flushes.

vm.dirty_background_ratio = 1
vm.dirty_ratio = 5

Again, these would go in sysctl.conf, or /etc/sysctl.d/10-dbserver.conf
or something. If you have a newer kernel, look into
vm.dirty_background_bytes, and vm.dirty_bytes.

The why of this is brought up occasionally here, but it comes down to
your vast amount of memory. The defaults for even recent Linux kernels are
5% for dirty_background_ratio, and 10% for dirty_ratio. So if you
multiply it out, the kernel will allow about 4.8GB of dirty memory
before attempting to flush it to disk. If that number reaches 9.6GB, the
system goes synchronous, and no other disk writes can take place until
*all 9.6GB* is flushed. Even with a fast disk subsystem, that's a pretty
big gulp.

The idea here is to keep it writing in the background by setting a low
limit, so it never reaches a critical mass that causes it to snowball
into the more dangerous upper limit. If you have a newer kernel, the
ability to set "bytes" is a much more granular knob that can be used to
match RAID buffer sizes. You'll probably want to experiment with this a
bit before committing to a setting.
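
For example, assuming a controller with a 512MB cache (substitute your own), a starting sketch for sysctl.conf might be:

# assuming a 512MB controller cache; experiment before committing to this
vm.dirty_background_bytes = 134217728
vm.dirty_bytes = 536870912

Keep in mind that setting the "bytes" variants causes the corresponding "ratio" settings to be ignored.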

> So it did a little swapping, but only a minor amount; still, I should
> probably decrease shared_buffers so there is no swapping at all.

Don't worry about that amount of swapping. As others have said here, you
can reduce swappiness to 0, and even then, the OS will still swap something
occasionally. It's really just a hint to the kernel about how much swapping
you want to go on, and the kernel is free to ignore it in cases where it
knows some data won't be accessed after initialization or something, so it
swaps that data out anyway.


--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
sthomas@optionshouse.com



Re: shared_buffers/effective_cache_size on 96GB server

From: Strahinja Kustudić
Shaun,

running these commands:

# sysctl vm.dirty_ratio
vm.dirty_ratio = 40
# sysctl vm.dirty_background_ratio
vm.dirty_background_ratio = 10

shows that these values are even higher by default. When you said RAID buffer size, did you mean the controller's cache memory size?

Regards,
Strahinja



Re: shared_buffers/effective_cache_size on 96GB server

From: Shaun Thomas
On 10/10/2012 09:35 AM, Strahinja Kustudić wrote:

> #sysctl vm.dirty_ratio
> vm.dirty_ratio = 40
> # sysctl vm.dirty_background_ratio
> vm.dirty_background_ratio = 10

Ouuuuch. That looks a lot like an old RHEL or CentOS system. Change
those ASAP. Currently your system won't start writing dirty buffers
until it hits 9.6GB. :(

> shows that these values are even higher by default. When you said
> RAID buffer size, you meant the controllers cache memory size?

Yeah, that. :)

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
sthomas@optionshouse.com



Re: shared_buffers/effective_cache_size on 96GB server

From: Strahinja Kustudić
I will change those, but I don't think this is that big of an issue if most of the IO is done by Postgres, since Postgres has its own mechanism to tell the OS to sync the data to disk. For example, when it's writing a WAL file, or when it's writing a checkpoint, those writes do not get cached.

Regards,
Strahinja



Re: shared_buffers/effective_cache_size on 96GB server

From: Shaun Thomas
On 10/10/2012 09:49 AM, Strahinja Kustudić wrote:

> I will change those, but I don't think this is that big of an issue if
> most of the IO is done by Postgres, since Postgres has its own
> mechanism to tell the OS to sync the data to disk. For example, when it's
> writing a WAL file, or when it's writing a checkpoint, those writes do
> not get cached.

You'd be surprised. Greg Smith did a bunch of work a couple of years back
that supported these changes. Most DBAs with heavily utilized systems
can even see this in action by turning on checkpoint logging: there's the
occasional period where the sync time lags into the minutes
due to a synchronous IO switch.
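
If you want to watch for it on your own system, a simple sketch: set

log_checkpoints = on

in postgresql.conf and keep an eye on the "sync=" times in the "checkpoint complete" log lines; when those climb from milliseconds into minutes during heavy writes, you're seeing that synchronous IO switch.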

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
sthomas@optionshouse.com



Re: shared_buffers/effective_cache_size on 96GB server

From: Bruce Momjian
On Wed, Oct 10, 2012 at 09:12:20AM +0200, Strahinja Kustudić wrote:
> Hi everyone,
>
> I have a dedicated PostgreSQL 9.1 server with 16 cores, 96GB RAM and RAID10 15K
> SCSI drives, running CentOS 6.2 x64. This server is mainly used for
> inserting/updating large amounts of data via COPY/INSERT/UPDATE commands, and
> seldom for running SELECT queries.
>
> Here are the relevant configuration parameters I changed:
>
> shared_buffers = 10GB
> effective_cache_size = 90GB
> work_mem = 32MB
> maintenance_work_mem = 512MB
> checkpoint_segments = 64
> checkpoint_completion_target = 0.8
>
> My biggest concerns are shared_buffers and effective_cache_size, should I
> increase shared_buffers and decrease effective_cache_size? I read that values
> above 10GB for shared_buffers give lower performance than smaller amounts?
>
> free is currently reporting (during the loading of data):
>
> $ free -m
>              total       used       free     shared    buffers     cached
> Mem:         96730      96418        311          0         71      93120
> -/+ buffers/cache:       3227      93502
> Swap:        21000         51      20949
>
> So it did a little swapping, but only a minor amount; still, I should probably
> decrease shared_buffers so there is no swapping at all.

You might want to read my blog entry about swap space:

    http://momjian.us/main/blogs/pgblog/2012.html#July_25_2012

It is probably swapping unused memory _out_ to make more use of RAM for
cache.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +


Re: shared_buffers/effective_cache_size on 96GB server

From: Bruce Momjian
On Wed, Oct 10, 2012 at 10:11:30AM +0200, Julien Cigar wrote:
> On 10/10/2012 09:12, Strahinja Kustudić wrote:
> >Hi everyone,
>
> Hello,
>
> >
> >I have a dedicated PostgreSQL 9.1 server with 16 cores, 96GB RAM
> >and RAID10 15K SCSI drives, running CentOS 6.2 x64. This
> >server is mainly used for inserting/updating large amounts of data
> >via COPY/INSERT/UPDATE commands, and seldom for running SELECT
> >queries.
> >
> >Here are the relevant configuration parameters I changed:
> >
> >shared_buffers = 10GB
>
> Generally, going over 4GB for shared_buffers doesn't help; some of
> the overhead of the bgwriter and checkpoints is more or less linear in
> the size of shared_buffers.
>
> >effective_cache_size = 90GB
>
> effective_cache_size should be ~75% of the RAM (if it's a dedicated server)

Why guess?  Use 'free' to tell you the kernel cache size:

    http://momjian.us/main/blogs/pgblog/2012.html#May_4_2012
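
For instance, with the two-row free output shown earlier in this thread, the kernel cache size in MB is the last column of the Mem: line, so something like this pulls it out:

$ free -m | awk '/^Mem:/ {print $NF}'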

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +


Re: shared_buffers/effective_cache_size on 96GB server

From: Claudio Freire
On Wed, Oct 10, 2012 at 1:10 PM, Bruce Momjian <bruce@momjian.us> wrote:
>> >shared_buffers = 10GB
>>
>> Generally, going over 4GB for shared_buffers doesn't help; some of
>> the overhead of the bgwriter and checkpoints is more or less linear in
>> the size of shared_buffers.
>>
>> >effective_cache_size = 90GB
>>
>> effective_cache_size should be ~75% of the RAM (if it's a dedicated server)
>
> Why guess?  Use 'free' to tell you the kernel cache size:
>
>         http://momjian.us/main/blogs/pgblog/2012.html#May_4_2012

Why does nobody ever mention that concurrent access has to be taken
into account?

I.e.: if I expect concurrent access to 10 really big indices, I'll set
effective_cache_size = free RAM / 10


Re: shared_buffers/effective_cache_size on 96GB server

From: Bruce Momjian
On Wed, Oct 10, 2012 at 02:05:20PM -0300, Claudio Freire wrote:
> On Wed, Oct 10, 2012 at 1:10 PM, Bruce Momjian <bruce@momjian.us> wrote:
> >> >shared_buffers = 10GB
> >>
> >> Generally, going over 4GB for shared_buffers doesn't help; some of
> >> the overhead of the bgwriter and checkpoints is more or less linear in
> >> the size of shared_buffers.
> >>
> >> >effective_cache_size = 90GB
> >>
> >> effective_cache_size should be ~75% of the RAM (if it's a dedicated server)
> >
> > Why guess?  Use 'free' to tell you the kernel cache size:
> >
> >         http://momjian.us/main/blogs/pgblog/2012.html#May_4_2012
>
> Why does nobody ever mention that concurrent access has to be taken
> into account?
>
> I.e.: if I expect concurrent access to 10 really big indices, I'll set
> effective_cache_size = free RAM / 10

It is true that the estimate assumes a single session is using all the
cache, but I think that is based on the assumption that there is a
major overlap between the cache needs of multiple sessions.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +


Re: shared_buffers/effective_cache_size on 96GB server

From: Shaun Thomas
On 10/10/2012 12:05 PM, Claudio Freire wrote:

> Why does nobody ever mention that concurrent access has to be taken
> into account?

That's actually a good point. But if you have one giant database, the
overlap of which tables are being accessed by various sessions is going
to be immense.

There probably should be a point about this in the docs, though. There
are more and more shared-hosting setups or places that spread their data
horizontally across separate databases for various clients, and in those
cases, parallel usage does not imply overlap.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
sthomas@optionshouse.com



Re: shared_buffers/effective_cache_size on 96GB server

From: Claudio Freire
On Wed, Oct 10, 2012 at 3:18 PM, Shaun Thomas <sthomas@optionshouse.com> wrote:
>> Why does nobody ever mention that concurrent access has to be taken
>> into account?
>
>
> That's actually a good point. But if you have one giant database, the
> overlap of which tables are being accessed by various sessions is going to
> be immense.

That's why I said "several huge indices". If regularly accessed
indices are separate, and big, it means they don't overlap nor do they
fit in any cache.


Re: shared_buffers/effective_cache_size on 96GB server

From: Jeff Janes
On Wed, Oct 10, 2012 at 12:12 AM, Strahinja Kustudić
<strahinjak@nordeus.com> wrote:
> Hi everyone,
>
> I have a dedicated PostgreSQL 9.1 server with 16 cores, 96GB RAM and RAID10
> 15K SCSI drives, running CentOS 6.2 x64.

How many drives in the RAID?

> This server is mainly used
> for inserting/updating large amounts of data via copy/insert/update
> commands, and seldom for running select queries.

Are there a lot of indexes?

>
> Here are the relevant configuration parameters I changed:
>
> shared_buffers = 10GB
> effective_cache_size = 90GB
> work_mem = 32MB
> maintenance_work_mem = 512MB
> checkpoint_segments = 64
> checkpoint_completion_target = 0.8
>
> My biggest concerns are shared_buffers and effective_cache_size, should I
> increase shared_buffers and decrease effective_cache_size?

Are you experiencing performance problems?  If so, what are they?

> I read that
> values above 10GB for shared_buffers give lower performance than smaller
> amounts?

There are reports that large shared_buffers can lead to latency
spikes.  I don't know how sensitive your workload is to latency,
though, nor how much those reports apply to 9.1.

>
> free is currently reporting (during the loading of data):
>
> $ free -m
>              total       used       free     shared    buffers     cached
> Mem:         96730      96418        311          0         71      93120
> -/+ buffers/cache:       3227      93502
> Swap:        21000         51      20949
>
> So it did a little swapping, but only a minor amount,

The kernel has, over the entire time the server has been up, found 51
MB of process memory to swap.  That doesn't really mean anything.  Do
you see active swapping going on, like with vmstat?
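
For example:

$ vmstat 5

and watch the si/so columns (pages swapped in and out per second); anything persistently non-zero there is active swapping, as opposed to the historical leftovers free reports.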


Cheers,

Jeff


Re: shared_buffers/effective_cache_size on 96GB server

From: Strahinja Kustudić
@Bruce Thanks for your articles; after reading them all, I don't think disabling swap is a good idea now. Also, you said that to size effective_cache_size I should check with free. My question is: should I use the value that free is showing as cached, or a little lower one, since not everything in the cache is there because of Postgres?

@Claudio So you are basically saying that if I have set effective_cache_size to 10GB and I have 10 concurrent processes which are using 10 different indices which are, for example, 2GB each, it would be better to set effective_cache_size to 1GB? Since if I leave it at 10GB, the query planner of each running process will think the whole index is in cache, and that won't be true? Did I get that right?

@Jeff I have 4 drives in RAID10. The database has around 80GB of indices. I'm not experiencing any slowdowns; I would just like to increase the performance of updates/inserts, since it needs to insert a lot of data, and to make the select queries faster, since they are done on a lot of big tables. I am experiencing a lot of performance problems when autovacuum kicks in for a few big tables, since it slows things down a lot. I didn't notice any swapping, and I know those 51MB which were swapped were just staying there, so swap isn't an issue at all.

Strahinja Kustudić | System Engineer | Nordeus




Re: shared_buffers/effective_cache_size on 96GB server

From: Claudio Freire
On Wed, Oct 10, 2012 at 5:12 PM, Strahinja Kustudić
<strahinjak@nordeus.com> wrote:
> @Claudio So you are basically saying that if I have set effective_cache_size
> to 10GB and I have 10 concurrent processes which are using 10 different
> indices which are, for example, 2GB each, it would be better to set
> effective_cache_size to 1GB? Since if I leave it at 10GB, the query planner
> of each running process will think the whole index is in cache and that won't
> be true? Did I get that right?

Yep. You might get away with setting 2GB, if you're willing to bet
there won't be 100% concurrency. But the safest setting would be 1GB.


Re: shared_buffers/effective_cache_size on 96GB server

From: Bruce Momjian
On Wed, Oct 10, 2012 at 10:12:51PM +0200, Strahinja Kustudić wrote:
> @Bruce Thanks for your articles; after reading them all, I don't think disabling
> swap is a good idea now. Also, you said that to size effective_cache_size I should
> check with free. My question is: should I use the value that free is showing
> as cached, or a little lower one, since not everything in the cache is there
> because of Postgres?

Well, you are right that some of that might not be Postgres, so yeah,
you can lower it somewhat.

> @Claudio So you are basically saying that if I have set effective_cache_size to
> 10GB and I have 10 concurrent processes which are using 10 different indices
> which are, for example, 2GB each, it would be better to set effective_cache_size
> to 1GB? Since if I leave it at 10GB, the query planner of each running process will
> think the whole index is in cache and that won't be true? Did I get that right?

Well, the real question is whether, while traversing the index, some
of the pages are going to be removed from the cache by other processes'
cache usage.  effective_cache_size does not assume that the cache will
remain between queries.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +


Re: shared_buffers/effective_cache_size on 96GB server

From: Ondrej Ivanič
Hi,

On 10 October 2012 19:11, Julien Cigar <jcigar@ulb.ac.be> wrote:
>> shared_buffers = 10GB
>
>
> Generally, going over 4GB for shared_buffers doesn't help; some of the
> overhead of the bgwriter and checkpoints is more or less linear in the size of
> shared_buffers.

Nothing is black or white; it's all shades of grey :) It depends on the
workload. In my case external consultants recommended 8GB, and I was
able to increase it up to 10GB. This was a mostly read-only workload.
From my experience, a large buffer cache acts as a handbrake for
write-heavy workloads.

--
Ondrej Ivanic
(ondrej.ivanic@gmail.com)
(http://www.linkedin.com/in/ondrejivanic)


Re: shared_buffers/effective_cache_size on 96GB server

From: Claudio Freire
On Wed, Oct 10, 2012 at 7:06 PM, Ondrej Ivanič <ondrej.ivanic@gmail.com> wrote:
>> Generally, going over 4GB for shared_buffers doesn't help; some of the
>> overhead of the bgwriter and checkpoints is more or less linear in the size of
>> shared_buffers.
>
> Nothing is black or white; it's all shades of grey :) It depends on the
> workload. In my case external consultants recommended 8GB, and I was
> able to increase it up to 10GB. This was a mostly read-only workload.
> From my experience, a large buffer cache acts as a handbrake for
> write-heavy workloads.

Which makes me ask...

...why can't checkpoint_timeout be set above 1h? Mostly for the
checkpoint target thing.

I know, you'd need an unholy amount of WAL and recovery time, but
modern systems I think can handle that (especially if you don't care
much about recovery time).

I usually set checkpoint_timeout to approach the time between periodic
mass updates, and it works rather nicely. Except when those updates are
spaced more than 1h apart, my hands are tied.


Re: shared_buffers/effective_cache_size on 96GB server

From: Jeff Janes
On Wed, Oct 10, 2012 at 2:03 PM, Bruce Momjian <bruce@momjian.us> wrote:
> On Wed, Oct 10, 2012 at 10:12:51PM +0200, Strahinja Kustudić wrote:

>> @Claudio So you are basically saying that if I have set effective_cache_size to
>> 10GB and I have 10 concurrent processes which are using 10 different indices
>> which are, for example, 2GB each, it would be better to set effective_cache_size
>> to 1GB? Since if I leave it at 10GB, the query planner of each running process will
>> think the whole index is in cache and that won't be true? Did I get that right?
>
> Well, the real question is whether, while traversing the index, some
> of the pages are going to be removed from the cache by other processes'
> cache usage.  effective_cache_size does not assume that the cache will
> remain between queries.

Does anyone see effective_cache_size make a difference anyway?  If so,
in what circumstances?

In my hands, queries for which effective_cache_size might come into
play (for deciding between seq scan and index scan) are instead
planned as bitmap scans.

Cheers,

Jeff


Re: shared_buffers/effective_cache_size on 96GB server

From: Claudio Freire
On Wed, Oct 10, 2012 at 7:33 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>> Well, the real question is whether, while traversing the index, some
>> of the pages are going to be removed from the cache by other processes'
>> cache usage.  effective_cache_size does not assume that the cache will
>> remain between queries.
>
> Does anyone see effective_cache_size make a difference anyway?  If so,
> in what circumstances?

In my case, if I set it too high, I get impossibly suboptimal plans
when an index scan over millions of rows hits the disk way too often,
way too randomly. The difference is minutes for a seqscan vs. hours for
the index scan. In fact, I prefer setting it too low rather than too high.
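
An easy way to check how sensitive a given query is: flip the setting per session and compare plans (the table and column names below are made up):

mydb=# SET effective_cache_size = '1GB';
mydb=# EXPLAIN SELECT * FROM big_table WHERE indexed_col < 1000000;
mydb=# SET effective_cache_size = '90GB';
mydb=# EXPLAIN SELECT * FROM big_table WHERE indexed_col < 1000000;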


Re: shared_buffers/effective_cache_size on 96GB server

From: Scott Marlowe
On Wed, Oct 10, 2012 at 4:37 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
> On Wed, Oct 10, 2012 at 7:33 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>>> Well, the real question is whether, while traversing the index, some
>>> of the pages are going to be removed from the cache by other processes'
>>> cache usage.  effective_cache_size does not assume that the cache will
>>> remain between queries.
>>
>> Does anyone see effective_cache_size make a difference anyway?  If so,
>> in what circumstances?
>
> In my case, if I set it too high, I get impossibly suboptimal plans
> when an index scan over millions of rows hits the disk way too often,
> way too randomly. The difference is minutes for a seqscan vs. hours for
> the index scan. In fact, I prefer setting it too low rather than too high.

There's a corollary for very fast disk subsystems.  If you've got, say,
40 15krpm disks in a RAID-10 you can get sequential read speeds into
the gigabytes per second, so that sequential page access costs MUCH
less than random page access, to the point that if seq page access is
rated a 1, random page access should be much higher, sometimes on the
order of 100 or so.  I.e. sequential accesses are almost always
preferred, especially if you're getting more than a tiny portion of
the table at one time.

As for the arguments for / against having a swap space, no one has
mentioned the one I've run into on many older kernels, and that is
BUGs.  I have had to turn off swap on very large memory machines with
2.6.xx series kernels in the past.  These machines all had properly
set vm.* settings for dirty buffers and percentages etc.  It didn't
matter: after 2 to 4 weeks of hard-working uptime, I'd get an alert on
the db server for high load, log in, and see kswapd working its butt off
doing... NOTHING.  Load would be in the 50 to 150 range. iostat showed
NOTHING in terms of si/so/bi/bo and so on.  kswapd wasn't in a D
(iowait) state, but rather R, pegging a CPU core at 100% while
running, and apparently blocking a lot of other processes that wanted
to access memory, all of which were S(leeping) or R(unning).  Two
seconds after a sudo swapoff -a completed, my machine went back to
a load of 2 to 5, as was normal for it.  Honestly, if you're running out
of memory on a machine with 256G and needing swap, you've got other
very real memory usage issues you've been ignoring to get to that
point.

Are all those bugs fixed in the 3.0.latest kernels?  Not sure, but I
haven't had this issue on any big memory servers lately and they've
all had swap turned on.


Re: shared_buffers/effective_cache_size on 96GB server

From: Josh Berkus
Jeff,

> Does anyone see effective_cache_size make a difference anyway?  If so,
> in what circumstances?

E_C_S, together with random_page_cost, the table and index sizes, the
row estimates, and the cpu_* costs, forms an equation which estimates the
cost of doing various kinds of scans, particularly index scan vs. table
scan.  If you have an extremely small database (< shared_buffers) or a
very large database (> 50X RAM), the setting for E_C_S probably
doesn't matter, but in the fairly common case where some tables and
indexes fit in RAM and some don't, it matters.

> In my hands, queries for which effective_cache_size might come into
> play (for deciding between seq scan and index scan) are instead
> planned as bitmap scans.

You have a very unusual workload, or a very small database.


--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Re: shared_buffers/effective_cache_size on 96GB server

From: Jeff Janes
On Thu, Oct 11, 2012 at 11:17 AM, Josh Berkus <josh@agliodbs.com> wrote:
>
>> Does anyone see effective_cache_size make a difference anyway?  If so,
>> in what circumstances?
>
> E_C_S, together with random_page_cost, the table and index sizes, the
> row estimates and the cpu_* costs, form an equation which estimates the
> cost of doing various kinds of scans, particularly index scan vs. table
> scan.

E_C_S only comes into play when the same table pages are (predicted to
be) visited repeatedly during the index scan, but this is the same
situation in which a bitmap scan is generally preferred anyway.  In
fact the two seem to be conceptually very similar (either avoid
actually visiting the block repeatedly, or avoid the IO cost of
visiting the block repeatedly), and I'm not sure why bitmap scans
come out on top--there doesn't seem to be a CPU cost estimate of
visiting a block which is assumed to already be in memory, nor is
the bitmap scan given credit for the use of effective_io_concurrency.

But I found a simple case (over in "Unused index influencing
sequential scan plan") which is very sensitive to E_C_S.  When the
index scan is being done to avoid a costly sort or aggregation, then
it can't be usefully replaced with a bitmap scan since it won't
produce index-order sorted output.

>> In my hands, queries for which effective_cache_size might come into
>> play (for deciding between seq scan and index scan) are instead
>> planned as bitmap scans.
>
> You have a very unusual workload, or a very small database.

I think all real workloads are unusual, otherwise benchmarking would
be easy... But since with complex queries it is intractable to figure out
what the planner is thinking, I'm biased toward using simple ones when
trying to figure out general principles.  I can make the database look
as big or small as I want (relative to RAM) by feeding
effective_cache_size false information.

Anyway, it seems like the consequences of overestimating E_C_S (by
underestimating the number of processes that might expect to benefit
from it concurrently) are worse than the consequences of
underestimating it--assuming you have the types of queries for which
it makes much of a difference.

Cheers,

Jeff


Re: shared_buffers/effective_cache_size on 96GB server

From: Jeff Janes
On Wed, Oct 10, 2012 at 1:12 PM, Strahinja Kustudić
<strahinjak@nordeus.com> wrote:

> @Claudio So you are basically saying that if I have set effective_cache_size
> to 10GB and I have 10 concurrent processes which are using 10 different
> indices which are, for example, 2GB each,

It is the size of the table, not the index, which is primarily of
concern.  However, that mostly factors into how postgres uses
effective_cache_size, not how you set it.

> it would be better to set
> effective_cache_size to 1GB?

If 10GB were the correct setting for a system with only one process
trying to run that type of query at a time, then 1 GB would be the
correct setting for 10 concurrent processing running that type of
query concurrently.

But, I think there is little reason to think that 10GB actually would
be the correct setting for the first case, so little reason to think
1GB is the correct setting in the 2nd case.

Since you have 96GB of RAM, I would think that 10GB is an appropriate
setting *already taking concurrency into account*, and would be too
low if you were not expecting any concurrency.

In any case, the setting of effective_cache_size shouldn't affect
simple inserts or copies at all, since those operations don't use
large index range scans.


> Since if I leave it at 10GB, the query planner of each running
> process will think the whole index is in cache and that won't
> be true? Did I get that right?

It isn't mostly about how much of the index is in cache, but rather
how much of the table is in cache.

>
> @Jeff I have 4 drives in RAID10. The database has around 80GB of indices.

That seems like a pretty small disk set for a server of this size.

Do you know what percentage of that 80GB of indices gets dirtied
during any given round of batch loading/updating?  I think that that
could easily be your bottleneck, how fast you can write out dirtied
index pages, which are likely being written randomly rather than
sequentially.


> I'm not experiencing any slowdowns, I would just like to increase the
> performance of update/insert, since it needs to insert a lot of data and to
> make the select queries faster since they are done on a lot of big tables.

I think these two things are in tension.  The faster the inserts and
updates run, the more resources they will take away from the selects
during those periods.  If you are doing batch copies, then as long as
one batch has finished before the next one needs to start, isn't that
fast enough?  Maybe the goal should be to throttle the inserts so that
the selects see a more steady competition for IO.

> I
> am experiencing a lot of performance problems when autovacuum kicks in for a
> few big tables, since it slows things down a lot.

You can tune autovacuum to make it slower.  But it sounds like
maybe you should have put more money into spindles and less into CPU
cores.  (I think that is a very common situation to be in.)
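
The relevant knobs are the cost-based vacuum delay settings in postgresql.conf; as a sketch (values only illustrative; 20ms is already the default delay):

autovacuum_vacuum_cost_delay = 40ms
autovacuum_vacuum_cost_limit = 100

Raising the delay or lowering the cost limit slows autovacuum down, at the price of vacuums taking longer to finish.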

Cheers,

Jeff


Re: shared_buffers/effective_cache_size on 96GB server

From: Jeff Janes
On Wed, Oct 10, 2012 at 10:36 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Wed, Oct 10, 2012 at 4:37 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
>>
>> In my case, if I set it too high, I get impossibly suboptimal plans
>> when an index scan over millions of rows hits the disk way too often
>> way too randomly. The difference is minutes for a seqscan vs hours for
>> the index scan. In fact, I prefer setting it too low than too high.
>
> There's a corollary for very fast disk subsystems.  If you've got, say,
> 40 15krpm disks in a RAID-10 you can get sequential read speeds into
> the gigabytes per second, so that sequential page access costs MUCH
> less than random page access, to the point that if seq page access is
> rated a 1, random page access should be much higher, sometimes on the
> order of 100 or so.

On the other hand, if you have 40 very busy connections, then if they
are all doing sequential scans on different tables they will interfere
with each other and will have to divide up the RAID throughput, while
if they are doing random fetches they will get along nicely on that
RAID.  So you have to know how much concurrency of the relevant type
you expect to see.

The default page cost settings already assume that random fetches are
far more likely to be cache hits than sequential fetches are.  If that
is not true, then the default random page cost is way too low,
regardless of the number of spindles or the concurrency.

Cheers,

Jeff


Re: shared_buffers/effective_cache_size on 96GB server

From: Claudio Freire
On Thu, Oct 18, 2012 at 4:23 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>> @Claudio So you are basically saying that if I have set effective_cache_size
>> to 10GB and I have 10 concurrent processes which are using 10 different
>> indices which are, for example, 2GB each,
>
> It is the size of the table, not the index, which is primarily of
> concern.  However, that mostly factors into how postgres uses
> effective_cache_size, not how you set it.

You're right, I just noticed that a few minutes ago (talk about telepathy).


Re: shared_buffers/effective_cache_size on 96GB server

From: Scott Marlowe
On Thu, Oct 18, 2012 at 1:50 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> On Wed, Oct 10, 2012 at 10:36 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>> On Wed, Oct 10, 2012 at 4:37 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
>>>
>>> In my case, if I set it too high, I get impossibly suboptimal plans
>>> when an index scan over millions of rows hits the disk way too often,
>>> way too randomly. The difference is minutes for a seqscan vs. hours for
>>> the index scan. In fact, I prefer setting it too low rather than too high.
>>
>> There's a corollary for very fast disk subsystems.  If you've got, say,
>> 40 15krpm disks in a RAID-10 you can get sequential read speeds into
>> the gigabytes per second, so that sequential page access costs MUCH
>> less than random page access, to the point that if seq page access is
>> rated a 1, random page access should be much higher, sometimes on the
>> order of 100 or so.
>
> On the other hand, if you have 40 very busy connections, then if they
> are all doing sequential scans on different tables they will interfere
> with each other and will have to divide up the RAID throughput, while
> if they are doing random fetches they will get along nicely on that
> RAID.  So you have to know how much concurrency of the relevant type
> you expect to see.

My experience is that both read-ahead and caching will make it more
resilient than that, to the point that it can take several times as many
read clients as spindles before things get as slow as random access.
While it's a performance knee to be aware of, often by the time you
have enough clients for it to matter you've already maxed out either
memory bandwidth or all the CPU cores.  But again, this is with dozens
to hundreds of disk drives.  Not four or eight etc.

And it's very access-pattern dependent.  On the last machine where we
cranked up random versus sequential costs, it was a reporting server
with 5 or 6TB of data that regularly got trundled through for
aggregation.  Daily slices of data, in partitions, were 1GB to 10GB.
For this machine the access patterns were almost never random.  Just a
handful of queries running in a random access pattern could interfere
with a 5-minute-long sequential reporting query and make it suddenly
take hours.  It had 16x7200rpm 2TB drives and would regularly handle
three or four queries at a time, all running plenty fast.  But crank up
one thread of pgbench and it would slow to a crawl.