Thread: Is this way of testing a bad idea?

Is this way of testing a bad idea?

From: "Fredrik Israelsson"
I am evaluating PostgreSQL as a candidate to cooperate with a Java
application.

Performance test set up:
Only one table in the database schema.
The table contains a bytea column plus some other columns.
The PostgreSQL server runs on Linux.

Test execution:
The Java application connects through TCP/IP (JDBC) and performs 50000
inserts.
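
Roughly, the test loop looks like the sketch below (table name, column
names, payload size, and the one-insert-per-statement style are
simplified placeholders rather than the exact code):

    import java.sql.*;

    public class InsertTest {
        public static void main(String[] args) throws Exception {
            Class.forName("org.postgresql.Driver");
            Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/testdb", "user", "password");
            byte[] blob = new byte[8192];               // dummy bytea payload
            PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO payload (id, data) VALUES (?, ?)");
            for (int i = 0; i < 50000; i++) {
                ps.setInt(1, i);
                ps.setBytes(2, blob);
                ps.executeUpdate();                     // one insert per statement
            }
            ps.close();
            conn.close();
        }
    }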

Result:
Monitoring the processes using top reveals that the total amount of
memory used slowly increases during the test. When reaching insert
number 40000, or somewhere around that, memory is exhausted and the
system begins to swap. Each of the postmaster processes seems to use a
constant amount of memory, but the total memory usage increases all the
same.

Questions:
Is this way of testing the performance a bad idea? Actual database usage
will be a mixture of inserts and queries. Maybe the test should behave
like that instead, but I wanted to keep things simple.
Why is the memory usage slowly increasing during the whole test?
Is there a way of keeping PostgreSQL from exhausting memory during the
test? I have looked for some fitting parameters to use, but I am
probably too much of a novice to understand which to choose.

Thanks in advance,
Fredrik Israelsson

Re: Is this way of testing a bad idea?

From: Tom Lane
"Fredrik Israelsson" <fredrik.israelsson@eu.biotage.com> writes:
> Monitoring the processes using top reveals that the total amount of
> memory used slowly increases during the test. When reaching insert
> number 40000, or somewhere around that, memory is exhausted and the
> system begins to swap. Each of the postmaster processes seems to use a
> constant amount of memory, but the total memory usage increases all the
> same.

That statement is basically nonsense.   If there is a memory leak then
you should be able to pin it on some specific process.

What's your test case exactly, and what's your basis for asserting that
the system starts to swap?  We've seen people fooled by the fact that
some versions of ps report a process's total memory size as including
whatever pages of Postgres' shared memory area the process has actually
chanced to touch.  So as a backend randomly happens to use different
shared buffers its reported memory size grows ... but there's no actual
leak, and no reason why the system would start to swap.  (Unless maybe
you've set an unreasonably high shared_buffers setting?)

Another theory is that you're watching free memory go to zero because
the kernel is filling free memory with copies of disk pages.  This is
not a leak either.  Zero free memory is the normal, expected state of
a Unix system that's been up for any length of time.

            regards, tom lane

Re: Is this way of testing a bad idea?

From: Mark Lewis
> Monitoring the processes using top reveals that the total amount of
> memory used slowly increases during the test. When reaching insert
> number 40000, or somewhere around that, memory is exhausted and the
> system begins to swap. Each of the postmaster processes seems to use a
> constant amount of memory, but the total memory usage increases all the
> same.

So . . . . what's using the memory?  It doesn't sound like PG is using
it, so is it your Java app?

If it's the Java app, then it could be that your code isn't remembering
to do things like close statements, or perhaps the max heap size is set
too large for your hardware.  With early RHEL3 kernels there was also a
quirky interaction with Sun's JVM where the system would swap itself to
death even when less than half the physical memory was in use.
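
For example (a purely hypothetical sketch, not necessarily what your app
does), a loop that prepares a fresh statement for every insert and never
closes it will steadily accumulate memory on the client side, whereas
preparing once and closing in a finally block will not:

    // Leaky pattern: new PreparedStatement per insert, never closed.
    for (int i = 0; i < 50000; i++) {
        PreparedStatement ps = conn.prepareStatement(
            "INSERT INTO payload (id, data) VALUES (?, ?)");
        ps.setInt(1, i);
        ps.setBytes(2, blob);
        ps.executeUpdate();
        // missing ps.close(): driver-side statement state piles up
    }

    // Better: prepare once, reuse, and close when done.
    PreparedStatement ps = conn.prepareStatement(
        "INSERT INTO payload (id, data) VALUES (?, ?)");
    try {
        for (int i = 0; i < 50000; i++) {
            ps.setInt(1, i);
            ps.setBytes(2, blob);
            ps.executeUpdate();
        }
    } finally {
        ps.close();
    }

(Here conn is an open java.sql.Connection and blob is the byte[] being
inserted; the table and column names are just placeholders.)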

If it's neither PG nor Java, then perhaps you're misinterpreting the
results of top.  Remember that the "free" memory on a properly running
Unix box that's been running for a while should hover just a bit above
zero due to normal caching; read up on the 'free' command to see the
actual memory utilization.

-- Mark

Re: Is this way of testing a bad idea?

From: "Bucky Jordan"
Also, as Tom stated, defining your test cases is a good idea before you
start benchmarking. Our application has a load data phase, then a
query/active use phase. So, we benchmark both (data loads, and then
transactions) since they're quite different workloads, and there are
different ways to optimize for each.

For bulk loads, I would look into either batching several inserts into
one transaction or using the COPY command. Do some testing here to figure out
what works best for your hardware/setup (for example, we usually batch
several thousand inserts together for a pretty dramatic increase in
performance). There's usually a sweet spot in there depending on how
your WAL is configured and other concurrent activity.
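
A rough sketch of the batching approach with the JDBC driver (table and
column names are placeholders, and the batch size of 1000 is only a
starting point to tune):

    // conn is an open java.sql.Connection, blob is the bytea payload.
    conn.setAutoCommit(false);                  // group inserts per transaction
    PreparedStatement ps = conn.prepareStatement(
        "INSERT INTO payload (id, data) VALUES (?, ?)");
    try {
        int batchSize = 1000;                   // tune for your WAL/hardware
        for (int i = 0; i < 50000; i++) {
            ps.setInt(1, i);
            ps.setBytes(2, blob);
            ps.addBatch();
            if ((i + 1) % batchSize == 0) {
                ps.executeBatch();
                conn.commit();
            }
        }
        ps.executeBatch();                      // flush any remainder
        conn.commit();
    } finally {
        ps.close();
    }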

Also, when testing bulk loads, be careful to setup a realistic test. If
your application requires foreign keys and indexes, these can
significantly slow down bulk inserts. There are several optimizations;
check the mailing lists and the manual.
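
For example (hypothetical index and table names), dropping an index
before the load and rebuilding it afterwards is often much faster than
maintaining it row by row:

    Statement st = conn.createStatement();
    try {
        st.execute("DROP INDEX payload_id_idx");       // drop before loading
        // ... run the bulk load here ...
        st.execute("CREATE INDEX payload_id_idx ON payload (id)");
    } finally {
        st.close();
    }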

And lastly, when you're loading tons of data, as previously pointed out,
the normal state of the system is to be heavily utilized (in fact, I
would think this is ideal since you know you're making full use of your
hardware).

HTH,

Bucky

