Thread: Potential memory usage issue

Potential memory usage issue

From
David Brain
Date:
Hi,

I recently migrated one of our large (multi-hundred GB) dbs from an
Intel 32bit platform (Dell 1650 - running 8.1.3) to a 64bit platform
(Dell 1950 - running 8.1.5).  However I am not seeing the performance
gains I would expect - I am suspecting that some of this is due to
differences I am seeing in reported memory usage.

On the 1650 - a 'typical' postmaster process looks like this in top:

5267 postgres  16   0  439m 427m 386m S  3.0 21.1   3:31.73 postmaster

On the 1950 - a 'typical' postmaster process looks like:

10304 postgres  16   0 41896  13m  11m D    4  0.3   0:11.73 postmaster

I currently have both systems running in parallel so the workloads will
be approximately equal.  The configurations of the two systems in terms
of postgresql.conf is pretty much identical between the two systems, I
did make some changes to logging, but nothing to buffers/shared memory
config.

I have never seen a postmaster process on the new system consume
anywhere near as much RAM as the old system - I am wondering if there is
something up with the shared memory config/usage that is causing my
performance issues.  Any thoughts as to where I should go from here?

Thanks,

David.


--
David Brain - bandwidth.com
dbrain@bandwidth.com

Re: Potential memory usage issue

From
Bill Moran
Date:
In response to David Brain <dbrain@bandwidth.com>:
>
> I recently migrated one of our large (multi-hundred GB) dbs from an
> Intel 32bit platform (Dell 1650 - running 8.1.3) to a 64bit platform
> (Dell 1950 - running 8.1.5).  However I am not seeing the performance
> gains I would expect

What were you expecting?  It's possible that your expectations are
unreasonable.

In our testing, we found that 64bit on the same hardware as 32bit only
gave us a 5% gain, in the best case.  In many cases the gain was near
0, and in some there was a small performance loss.  These findings seemed
to jibe with what others have been reporting.

> - I am suspecting that some of this is due to
> differences I am seeing in reported memory usage.
>
> On the 1650 - a 'typical' postmaster process looks like this in top:
>
> 5267 postgres  16   0  439m 427m 386m S  3.0 21.1   3:31.73 postmaster
>
> On the 1950 - a 'typical' postmaster process looks like:
>
> 10304 postgres  16   0 41896  13m  11m D    4  0.3   0:11.73 postmaster
>
> I currently have both systems running in parallel so the workloads will
> be approximately equal.  The configurations of the two systems in terms
> of postgresql.conf is pretty much identical between the two systems, I
> did make some changes to logging, but nothing to buffers/shared memory
> config.
>
> I have never seen a postmaster process on the new system consume
> anywhere near as much RAM as the old system - I am wondering if there is
> something up with the shared memory config/usage that is causing my
> performance issues.  Any thoughts as to where I should go from here?

Provide more information, for one thing.  I'm assuming from the top output
that this is some version of Linux, but more details on that are liable
to elicit more helpful feedback.

We run everything on FreeBSD here, but I haven't seen any difference in
the way PostgreSQL uses memory on ia32 FreeBSD vs. amd64 FreeBSD.  Without
more details on your setup, my only suggestion would be to double-verify
that your postgresql.conf settings are correct on the 64 bit system.

--
Bill Moran
Collaborative Fusion Inc.

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023

****************************************************************
IMPORTANT: This message contains confidential information and is
intended only for the individual named. If the reader of this
message is not an intended recipient (or the individual
responsible for the delivery of this message to an intended
recipient), please be advised that any re-use, dissemination,
distribution or copying of this message is prohibited. Please
notify the sender immediately by e-mail if you have received
this e-mail by mistake and delete this e-mail from your system.
E-mail transmission cannot be guaranteed to be secure or
error-free as information could be intercepted, corrupted, lost,
destroyed, arrive late or incomplete, or contain viruses. The
sender therefore does not accept liability for any errors or
omissions in the contents of this message, which arise as a
result of e-mail transmission.
****************************************************************

Re: Potential memory usage issue

From
David Brain
Date:
Hi,

Thanks for the response.
Bill Moran wrote:
> In response to David Brain <dbrain@bandwidth.com>:
>> I recently migrated one of our large (multi-hundred GB) dbs from an
>> Intel 32bit platform (Dell 1650 - running 8.1.3) to a 64bit platform
>> (Dell 1950 - running 8.1.5).  However I am not seeing the performance
>> gains I would expect
>
> What were you expecting?  It's possible that your expectations are
> unreasonable.
>

Possibly - but there is a fair step up hardware performance wise from a
1650 (Dual 1.4 Ghz PIII with U160 SCSI) to a 1950 (Dual, Dual Core 2.3
Ghz Xeons with SAS) - so I wasn't necessarily expecting much from the
32->64 transition (except maybe the option to go > 4GB easily - although
currently we only have 4GB in the box), but was from the hardware
standpoint.

I am curious as to why 'top' gives such different output on the two
systems - the datasets are large and so I know I benefit from having
high shared_buffers and effective_cache_size settings.

> Provide more information, for one thing.  I'm assuming from the top output
> that this is some version of Linux, but more details on that are liable
> to elicit more helpful feedback.
>
Yes the OS is Linux - on the 1650 version 2.6.14, on the 1950 version 2.6.18

Thanks,

David.



--
David Brain - bandwidth.com
dbrain@bandwidth.com
919.297.1078

Re: Potential memory usage issue

From
Bill Moran
Date:
In response to David Brain <dbrain@bandwidth.com>:
>
> Thanks for the response.
> Bill Moran wrote:
> > In response to David Brain <dbrain@bandwidth.com>:
> >> I recently migrated one of our large (multi-hundred GB) dbs from an
> >> Intel 32bit platform (Dell 1650 - running 8.1.3) to a 64bit platform
> >> (Dell 1950 - running 8.1.5).  However I am not seeing the performance
> >> gains I would expect
> >
> > What were you expecting?  It's possible that your expectations are
> > unreasonable.
>
> Possibly - but there is a fair step up hardware performance wise from a
> 1650 (Dual 1.4 Ghz PIII with U160 SCSI) to a 1950 (Dual, Dual Core 2.3
> Ghz Xeons with SAS) - so I wasn't necessarily expecting much from the
> 32->64 transition (except maybe the option to go > 4GB easily - although
> currently we only have 4GB in the box), but was from the hardware
> standpoint.

Ahh ... I didn't get that from your original message.

> I am curious as to why 'top' gives such different output on the two
> systems - the datasets are large and so I know I benefit from having
> high shared_buffers and effective_cache_size settings.

Have you done any actual queries on the new system?  PG won't use the
shm until it needs it -- and that doesn't occur until it gets a request
for data via a query.

Install the pg_buffercache contrib module and take a look at how shared
memory is being used.  I like to use MRTG to graph shared buffer usage
over time, but you can just do a SELECT count(*) FROM pg_buffercache
WHERE reldatabase IS NOT NULL to see how many buffers are actually in use.

> > Provide more information, for one thing.  I'm assuming from the top output
> > that this is some version of Linux, but more details on that are liable
> > to elicit more helpful feedback.
> >
> Yes the OS is Linux - on the 1650 version 2.6.14, on the 1950 version 2.6.18

--
Bill Moran
Collaborative Fusion Inc.

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023


Re: Potential memory usage issue

From
Tom Lane
Date:
Bill Moran <wmoran@collaborativefusion.com> writes:
> In response to David Brain <dbrain@bandwidth.com>:
>> I am curious as to why 'top' gives such different output on the two
>> systems - the datasets are large and so I know I benefit from having
>> high shared_buffers and effective_cache_size settings.

> Have you done any actual queries on the new system?  PG won't use the
> shm until it needs it -- and that doesn't occur until it gets a request
> for data via a query.

More accurately, top won't consider shared mem to be part of the process
address space until it's actually touched by that process.
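To compare the two top lines from the original post on the same footing, here is a minimal Python sketch. It assumes the default top column order (PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND) and that a memory field without a suffix is in KiB. The helper names are made up for illustration.

```python
# Column order assumed from the default top layout:
# PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
UNIT_KIB = {"k": 1, "m": 1024, "g": 1024 * 1024}


def to_kib(field):
    """Convert a top memory field such as '427m' or '41896' to KiB."""
    suffix = field[-1].lower()
    if suffix in UNIT_KIB:
        return float(field[:-1]) * UNIT_KIB[suffix]
    return float(field)  # assumption: no suffix means plain KiB


def parse_top_line(line):
    """Return VIRT, RES and SHR in KiB, plus private memory (RES - SHR)."""
    cols = line.split()
    virt, res, shr = (to_kib(c) for c in cols[4:7])
    return {"virt": virt, "res": res, "shr": shr, "private": res - shr}


old = parse_top_line(
    "5267 postgres  16   0  439m 427m 386m S  3.0 21.1   3:31.73 postmaster")
new = parse_top_line(
    "10304 postgres  16   0 41896  13m  11m D    4  0.3   0:11.73 postmaster")

# SHR only counts shared pages this backend has actually touched.
print("old SHR MiB:", old["shr"] / 1024, "new SHR MiB:", new["shr"] / 1024)
```

Almost all of the difference between the two processes is in the SHR column, so the thing to look at is how much shared memory each backend can touch, not how much private memory it uses.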

            regards, tom lane

Re: Potential memory usage issue

From
David Brain
Date:
Bill Moran wrote:

>
> Install the pg_buffercache contrib module and take a look at how shared
> memory is being used.  I like to use MRTG to graph shared buffer usage
> over time, but you can just do a SELECT count(*) FROM pg_buffercache
> WHERE reldatabase IS NOT NULL to see how many buffers are actually in use.
>

Can you explain what you'd use as a diagnostic on this - I just
installed the module - but I'm not entirely clear as to what the output
is actually showing me and/or what would be considered good or bad.

Thanks,

David.
--
David Brain - bandwidth.com
dbrain@bandwidth.com

Re: Potential memory usage issue

From
Bill Moran
Date:
In response to David Brain <dbrain@bandwidth.com>:

> Bill Moran wrote:
>
> >
> > Install the pg_buffercache contrib module and take a look at how shared
> > memory is being used.  I like to use MRTG to graph shared buffer usage
> > over time, but you can just do a SELECT count(*) FROM pg_buffercache
> > WHERE reldatabase IS NOT NULL to see how many buffers are actually in use.
> >
>
> Can you explain what you'd use as a diagnostic on this - I just
> installed the module - but I'm not entirely clear as to what the output
> is actually showing me and/or what would be considered good or bad.

Well, there are different things you can do with it.  See the README, which
I found pretty comprehensive.

What I was referring to was the ability to track how many shared_buffers
were actually in use, which can easily be seen at a cluster-wide view
with two queries:
select count(*) from pg_buffercache;
select count(*) from pg_buffercache where reldatabase is not null;

The first gives you the total number of buffers available (you could get
this from your postgresql.conf as well, but with automated collection and
graphing via mrtg, doing it this way guarantees that we'll always know
what the _real_ value is).  The second gives you the number of buffers
that are actually holding data.

If #2 is smaller than #1, that indicates that the entire working set of
your database is able to fit in shared memory.  This might not be your
entire database, as some tables might never be queried from (i.e. log
tables that are only queried when stuff goes wrong ...)  This means
that Postgres is usually able to execute queries without going to the
disk for data, which usually equates to fast queries.  If it's
consistently _much_ lower, it may indicate that your shared_buffers
value is too high, and the system may benefit from re-balancing memory
usage.

If #2 is equal to #1, it probably means that your working set is larger
than the available shared buffers.  This _may_ mean that your queries are
using the disk a lot, and that you _may_ benefit from increasing
shared_buffers, adding more RAM, sacrificing a 15000 RPM SCSI drive to
the gods of performance, etc ...
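The comparison of the two counts can be sketched in Python as below. This is only an illustration: the function name, the wording of the verdicts, and the 0.5 "mostly empty" threshold are assumptions, not anything PostgreSQL defines.

```python
def buffer_usage(total, in_use, low_water=0.5):
    """Classify shared buffer usage from the two pg_buffercache counts.

    total   -- select count(*) from pg_buffercache;
    in_use  -- select count(*) from pg_buffercache
               where reldatabase is not null;
    low_water is an arbitrary illustrative threshold, not a PostgreSQL value.
    """
    if total <= 0 or not 0 <= in_use <= total:
        raise ValueError("expected total > 0 and 0 <= in_use <= total")
    fraction = in_use / total
    if in_use == total:
        verdict = "full: working set may exceed shared_buffers"
    elif fraction < low_water:
        verdict = "mostly empty: shared_buffers may be oversized"
    else:
        verdict = "partly used: working set appears to fit"
    return fraction, verdict


# Hypothetical sample values, not measurements from this thread.
print(buffer_usage(1024, 1024))
print(buffer_usage(50000, 40000))
```

A single sample says little. The verdict only means something if it stays the same across many samples taken under normal load.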

Another great thing to track is read activity.  I do this via the
pg_stat_database view:
select sum(blks_hit) from pg_stat_database;
select sum(blks_read) from pg_stat_database;

(Note that you need block-level stats collection enabled, i.e.
stats_block_level in 8.1, to make these usable)

If the second one is increasing particularly fast, that's a strong
indication that more shared_buffers might improve performance.  If
neither of them is increasing, that indicates that nobody's really
doing much with the database ;)
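Both counters are cumulative, so what matters is how fast they change between two samples. A minimal Python sketch, assuming you collect (sum(blks_hit), sum(blks_read)) pairs yourself at a known interval; the function name and sample numbers are made up:

```python
def read_activity(prev, curr, seconds):
    """Compare two (blks_hit, blks_read) samples taken `seconds` apart."""
    hit_delta = curr[0] - prev[0]
    read_delta = curr[1] - prev[1]
    total = hit_delta + read_delta
    return {
        "hits_per_sec": hit_delta / seconds,
        "reads_per_sec": read_delta / seconds,
        # None when nothing happened between the samples (idle database).
        "hit_ratio": hit_delta / total if total else None,
    }


# Hypothetical samples taken 10 seconds apart.
busy = read_activity((1000, 100), (1900, 200), 10)
idle = read_activity((5, 5), (5, 5), 60)
print(busy)
print(idle)
```

Keep in mind that blks_read counts blocks that were not found in shared buffers. Some of those reads may still be served from the OS cache rather than from disk.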

I strongly recommend that you graph these values using mrtg or cacti
or one of the many other programs designed to do that.  It makes life
nice when someone says, "hey, the DB system was really slow yesterday
while you were busy in meetings, can you speed it up?"

--
Bill Moran
Collaborative Fusion Inc.

Re: Potential memory usage issue [resolved]

From
David Brain
Date:
Thanks Bill for the explanation - that really helped me out considerably.

What this showed me was that there were only 1024 buffers configured.
I'm not quite clear as to how this happened as the postgresql.conf files
on both systems have the shared_buffers set to ~50000.  However it looks
as though the system start script was passing in -B 1024 to postmaster
which was overriding the postgresql.conf settings.
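Command-line options override postgresql.conf, so it can be worth checking the command line of the running postmaster (from ps, for example) for a -B switch. A minimal Python sketch; the function name and the paths in the examples are hypothetical:

```python
import shlex


def find_buffer_override(command_line):
    """Return the value passed via -B on a postmaster command line, or None."""
    args = shlex.split(command_line)
    for i, arg in enumerate(args):
        # Form "-B 1024": the value is the next argument.
        if arg == "-B" and i + 1 < len(args):
            return int(args[i + 1])
        # Form "-B1024": the value is attached to the switch.
        if arg.startswith("-B") and arg[2:].isdigit():
            return int(arg[2:])
    return None


# Hypothetical command lines, e.g. as copied from ps output.
print(find_buffer_override("/usr/bin/postmaster -D /var/lib/pgsql/data -B 1024"))
print(find_buffer_override("/usr/bin/postmaster -D /var/lib/pgsql/data"))
```

If this returns a number, that value is what the server is using for shared_buffers, whatever the config file says.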

The really odd thing is that the db start script is also the same
on both systems, so there is some other difference there that I need to
track down.  However removing the -B 1024 allowed the settings to revert
to the file specified values.

So now I'm back to using ~50k buffers again and things are running a
little more swiftly, and according to pg_buffercache I'm using 49151 of
them (-:
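As a rough cross-check, the buffer counts can be converted to memory sizes. This Python sketch assumes the default 8 kB block size, which is not stated anywhere in this thread:

```python
BLOCK_SIZE = 8192  # bytes; assumed default PostgreSQL block size


def buffers_to_mib(nbuffers, block_size=BLOCK_SIZE):
    """Size in MiB of a shared buffer pool holding `nbuffers` blocks."""
    return nbuffers * block_size / (1024 * 1024)


print(buffers_to_mib(1024))   # 8.0
print(buffers_to_mib(50000))  # 390.625
```

Under that assumption, 1024 buffers is only about 8 MB, while ~50000 buffers is roughly 390 MB. That is at least consistent with the 11m versus 386m SHR figures top showed on the two machines.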

Thanks again to those who helped me track this down.

David.



--
David Brain - bandwidth.com
dbrain@bandwidth.com