Thread: memory allocation ; postgresql-8.0

memory allocation ; postgresql-8.0

From

"Kavan, Dan (IMS)"

Date:

06 May 2005, 15:28:59

      Hi,

I am reserving 2.1 GB of memory for postgres to use with SuSE linux 9.0
(SLES) in sysctl.conf.
I have configured postgres to use 1.9 GB of memory so it won't go over
the 2.1 cap (postgresql.conf).
I have a total of 4GB of memory.  Somewhere I read to keep half of the
memory for other things.  This server is dedicated to postgresql.  Does
anyone agree with just using 2GB?  Does anyone know where I might have
read that?

Thanks,
Dj

Re: memory allocation ; postgresql-8.0

From

Scott Marlowe

Date:

06 May 2005, 15:38:14

On Fri, 2005-05-06 at 13:28, Kavan, Dan (IMS) wrote:
>       Hi,
>
> I am reserving 2.1 GB of memory for postgres to use with SuSE linux 9.0
> (SLES) in sysctl.conf.
> I have configured postgres to use 1.9 GB of memory so it won't go over
> the 2.1 cap (postgresql.conf).
> I have a total of 4GB of memory.  Somewhere I read to keep half of the
> memory for other things.  This server is dedicated to postgresql.  Does
> anyone agree with just using 2GB?  Does anyone know where I might have
> read that?

Unless you routinely actually handle data sets that are 1.9 gigabytes in
size, it's probably not a great idea.

PostgreSQL doesn't really cache data in the classical sense.  All the
data stored in its internal buffers is tossed away when the last backend
referencing said material stops referencing it.

Plus, there's a fair bit of overhead to managing such a large data set,
and, until 8.0, the algorithms for doing so with efficiency weren't a
part of PostgreSQL.

While some future version might incorporate genuine caching that could
utilize all that memory, for now, one is still better off giving
postgresql about 250 megabytes max and letting the kernel use the rest
for caching.

Re: memory allocation ; postgresql-8.0

From

"Kavan, Dan (IMS)"

Date:

12 May 2005, 12:10:20

Hi Scott,

Thanks again for all your tips.

If I knock the buffer size down to 65,536 (still higher than what you
are recommending)  then my shmmax becomes:
256,000 + 550,292,685 (65536*8396.8) + 1,454,100 = 552,002,785

That will leave me with 3.5 GB of free memory for the system & work
memory to use.
Will those free system resources ever get used with a 10 million record,
10 GB database?

If I go with 65,536 as my buffer size, Would having the SHMMAX set to 1
GB on my sysctl.conf system parameters allow me to run two seperate
instances of postgresql on 2 seperate ports?

~DjK

-----Original Message-----
From: Scott Marlowe [mailto:smarlowe@g2switchworks.com]
Sent: Friday, May 06, 2005 3:31 PM
To: Kavan, Dan (IMS)
Subject: RE: [ADMIN] memory allocation ; postgresql-8.0

On Fri, 2005-05-06 at 14:12, Kavan, Dan (IMS) wrote:
> <Scott>Unless you routinely actually handle data sets that are 1.9
> gigabytes in size, it's probably not a great idea. <Dj>I knew I
> wouldn't be that easy. ;) We have a database going on here(in a couple

> weeks) that will handle 10
> million records.   I'm not sure how to measure if that means I will or
> won't have 1.9 GB in datasets.  Right now, most of my memory shows up
> as free (top) and I have 4GB of swap, virtual memory, that hasn't been

> tapped, yet.  I put a copy of what I have in sysctl.conf and
> postgresql.conf below ( the memory section )

Well, the only way you'll really know is when you have 10,000,000
records and start working with them.  You might want to create a test
data set and see how it runs on your setup with varying amounts of
buffers allocated to postgresql. I'd bet that in most circumstances,
you'll find 10,000 buffers, at most 20,000 buffers working best for you.

> <Scott>PostgreSQL doesn't really cache data in the classical sense.
> All the data stored in its internal buffers is tossed away when the
> last backend referencing said material stops referencing it.
> <Dj>That's the reason why I leave the remaining 2GB of memory to the
> system.

But, keep in mind, that reduces the amount of space the kernel now has
to play with, as well as the memory left for sorts (sort_mem in <8.0,
working_mem in 8.0).  It's all about tradeoffs.  1.9gig for pgsql is
usually a bit much, but not always.

> <Scott>Plus, there's a fair bit of overhead to managing such a large
> data set, and, until 8.0, the algorithms for doing so with efficiency
> weren't a part of PostgreSQL. <Dj>Besides full vacuuming and backups
> what kind of maintenance is involved in managing a  database with 10
> million records?

That's not the ovrehead I'm talking about. I mean the CPU overhead from
maintaining that large list of buffers and searching them each time you
access a buffer.  Again, 8.0 is better at it than <8.0

> <Scott>While some future version might incorporate genuine caching
> that could utilize all that memory, for now, one is still better off
> giving postgresql about 250 megabytes max and letting the kernel use
> the rest for caching. <Dj>Does that logic scale to a 10 GB database
> with 10 million records and 2 other small databases all with 10-20
> simultaneous users doing queries?

It does for 7.4 and before, which really didn't handle large internal
buffers all that efficiently.  8.0 is an unknown to me in that area.

>
> <Dj>
> #my startup script includes
> sysctl -w vm.overcommit_memory=2

Keep in mind that according to recent postings I've seen in the kernel
mailing list and the pgsql mailing lists, the vm.overcommit settings are
sometimes ignored, and the oom killer still stalks the night and kills
processes that are memory hogs.  IF postgresql is set to use 1.9 gig
shared memory, it's likely to be the first victim should the oom killer
come out to play.

> shared_buffers = 230000        # min 16, at least
max_connections*2,
> 8KB each was 65536
> work_mem = 1024        # min 64, size in KB

I'd increase our work mem to 16384 or so.  Depends on how many parallel
clients you're likely to have.   You want to allow sorts to happen in
memory when possible, but you don't want to starve the other processes
for memory or you'll get a swap storm.

> max_fsm_pages = 30000        # min max_fsm_relations*16, 6 bytes each
> max_fsm_relations = 1500    # min 100, ~50 bytes each

You'll likely want these larger too.  With a hard working large
database, 100,000 / 10000 are not unusual settings for the fsm settings.

Re: memory allocation ; postgresql-8.0

From

Scott Marlowe

Date:

12 May 2005, 12:21:05

On Thu, 2005-05-12 at 10:10, Kavan, Dan (IMS) wrote:
> Hi Scott,
>
> Thanks again for all your tips.
>
> If I knock the buffer size down to 65,536 (still higher than what you
> are recommending)  then my shmmax becomes:
> 256,000 + 550,292,685 (65536*8396.8) + 1,454,100 = 552,002,785
>
> That will leave me with 3.5 GB of free memory for the system & work
> memory to use.
> Will those free system resources ever get used with a 10 million record,
> 10 GB database?

Certainly.  As you access the data the kernel will cache all the data
sent through it.  Once the machine's been up and processing for a while
you should see a top output that shows "free" memory at a few megs (8 to
30 meg is typical) and all the rest of the memory being used as kernel
cache.

> If I go with 65,536 as my buffer size, Would having the SHMMAX set to 1
> GB on my sysctl.conf system parameters allow me to run two seperate
> instances of postgresql on 2 seperate ports?

Yes, but you may want to set it just a tad higher for things like fsm
and whatnot.

Definitely benchmark both the 64k setting of shared_buffers and lower
settings, looking for a knee with your data set.  It may well be that
the best performance happens at a lower number, and doesn't really
increase as you bump up the shared_buffers.  Be sure to test things as
realistically as possible, i.e. the right amount of parallel users and
all that.

Re: memory allocation ; postgresql-8.0

From

"Kavan, Dan (IMS)"

Date:

12 May 2005, 14:46:37

Do psql calls/procedures access resources reserved from the
kernel.shmmax?
How about the tar or copy sysadmin commands?  I would guess they don't
use kernel.shmmax resources.  Finally, work memory alos does not access
resources reserved from kernel.shmmax, correct?  Thanks for clearing
things up.

-----Original Message-----
From: Scott Marlowe [mailto:smarlowe@g2switchworks.com]
Sent: Thursday, May 12, 2005 11:21 AM
To: Kavan, Dan (IMS)
Cc: postgres
Subject: RE: [ADMIN] memory allocation ; postgresql-8.0

On Thu, 2005-05-12 at 10:10, Kavan, Dan (IMS) wrote:
> Hi Scott,
>
> Thanks again for all your tips.
>
> If I knock the buffer size down to 65,536 (still higher than what you
> are recommending)  then my shmmax becomes: 256,000 + 550,292,685
> (65536*8396.8) + 1,454,100 = 552,002,785
>
> That will leave me with 3.5 GB of free memory for the system & work
> memory to use. Will those free system resources ever get used with a
> 10 million record, 10 GB database?

Certainly.  As you access the data the kernel will cache all the data
sent through it.  Once the machine's been up and processing for a while
you should see a top output that shows "free" memory at a few megs (8 to
30 meg is typical) and all the rest of the memory being used as kernel
cache.

> If I go with 65,536 as my buffer size, Would having the SHMMAX set to
> 1 GB on my sysctl.conf system parameters allow me to run two seperate
> instances of postgresql on 2 seperate ports?

Yes, but you may want to set it just a tad higher for things like fsm
and whatnot.

Definitely benchmark both the 64k setting of shared_buffers and lower
settings, looking for a knee with your data set.  It may well be that
the best performance happens at a lower number, and doesn't really
increase as you bump up the shared_buffers.  Be sure to test things as
realistically as possible, i.e. the right amount of parallel users and
all that.

Re: memory allocation ; postgresql-8.0

From

Scott Marlowe

Date:

12 May 2005, 16:07:35

On Thu, 2005-05-12 at 12:46, Kavan, Dan (IMS) wrote:
> Do psql calls/procedures access resources reserved from the
> kernel.shmmax?

Only in the sense that it can execute a query, which in the backend
could therefore use shared memory.  psql, itself, doesn't use shared
memory.

> How about the tar or copy sysadmin commands?  I would guess they don't
> use kernel.shmmax resources.

Correct.  Actually, very few types of programs use shared memory.

>   Finally, work memory alos does not access
> resources reserved from kernel.shmmax, correct?

Correct.  The only parts of postgresql.conf that affect shared memory
usage are max connections, shared buffers, and the fsm.