Thread: Is this way of testing a bad idea?
I am evaluating PostgreSQL as a candidate to cooperate with a Java application.

Performance test setup: Only one table in the database schema. The table contains a bytea column plus some other columns. The PostgreSQL server runs on Linux.

Test execution: The Java application connects through TCP/IP (JDBC) and performs 50000 inserts.

Result: Monitoring the processes using top reveals that the total amount of memory used slowly increases during the test. When reaching insert number 40000, or somewhere around that, memory is exhausted, and the system begins to swap. Each of the postmaster processes seems to use a constant amount of memory, but the total memory usage increases all the same.

Questions: Is this way of testing the performance a bad idea? Actual database usage will be a mixture of inserts and queries. Maybe the test should behave like that instead, but I wanted to keep things simple. Why is the memory usage slowly increasing during the whole test? Is there a way of keeping PostgreSQL from exhausting memory during the test? I have looked for some fitting parameters to use, but I am probably too much of a novice to understand which to choose.

Thanks in advance,
Fredrik Israelsson
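For reference, the kind of test loop described above might look roughly like this in JDBC. This is only a sketch of the setup as I understand it; the connection URL, credentials, table name (test_table), and column names are assumptions, not details from the original post:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class InsertTest {
    public static void main(String[] args) throws Exception {
        // URL and credentials are placeholders for the poster's setup.
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/testdb", "test", "test");
        PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO test_table (id, payload) VALUES (?, ?)");
        byte[] payload = new byte[1024]; // dummy bytea content
        try {
            for (int i = 0; i < 50000; i++) {
                ps.setInt(1, i);
                ps.setBytes(2, payload);
                ps.executeUpdate(); // autocommit: one transaction per row
            }
        } finally {
            ps.close();
            conn.close();
        }
    }
}
```

Note that with autocommit on (the JDBC default), each insert is its own transaction and its own network round trip, which matters for the later discussion of batching.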
"Fredrik Israelsson" <fredrik.israelsson@eu.biotage.com> writes:
> Monitoring the processes using top reveals that the total amount of
> memory used slowly increases during the test. When reaching insert
> number 40000, or somewhere around that, memory is exhausted, and the
> system begins to swap. Each of the postmaster processes seems to use
> a constant amount of memory, but the total memory usage increases all
> the same.

That statement is basically nonsense. If there is a memory leak then you should be able to pin it on some specific process. What's your test case exactly, and what's your basis for asserting that the system starts to swap?

We've seen people fooled by the fact that some versions of ps report a process's total memory size as including whatever pages of Postgres' shared memory area the process has actually chanced to touch. So as a backend randomly happens to use different shared buffers, its reported memory size grows ... but there's no actual leak, and no reason why the system would start to swap. (Unless maybe you've set an unreasonably high shared_buffers setting?)

Another theory is that you're watching free memory go to zero because the kernel is filling free memory with copies of disk pages. This is not a leak either. Zero free memory is the normal, expected state of a Unix system that's been up for any length of time.

regards, tom lane
> Monitoring the processes using top reveals that the total amount of
> memory used slowly increases during the test. When reaching insert
> number 40000, or somewhere around that, memory is exhausted, and the
> system begins to swap. Each of the postmaster processes seems to use
> a constant amount of memory, but the total memory usage increases all
> the same.

So ... what's using the memory? It doesn't sound like PG is using it, so is it your Java app?

If it's the Java app, then it could be that your code isn't remembering to do things like close statements, or perhaps the max heap size is set too large for your hardware. With early RHEL3 kernels there was also a quirky interaction with Sun's JVM where the system swaps itself to death even when less than half the physical memory is in use.

If it's neither PG nor Java, then perhaps you're misinterpreting the results of top. Remember that the "free" memory on a properly running Unix box that's been running for a while should hover just a bit above zero due to normal caching; read up on the 'free' command to see the actual memory utilization.

-- Mark
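The "isn't remembering to close statements" point above is a common JDBC leak. A minimal sketch of the usual try/finally pattern, with illustrative table and column names (not taken from the original post):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;

public class SafeInsert {
    // Closes the statement even if executeUpdate() throws, so the
    // driver can release its client- and server-side resources.
    static void insertRow(Connection conn, int id, byte[] data)
            throws Exception {
        PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO test_table (id, payload) VALUES (?, ?)");
        try {
            ps.setInt(1, id);
            ps.setBytes(2, data);
            ps.executeUpdate();
        } finally {
            ps.close(); // without this, each call leaks a statement
        }
    }
}
```

If the application instead creates a new PreparedStatement per insert and never closes it, the JVM heap (and the driver's resources) grow steadily over 50000 inserts, which would match the symptom described.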
Also, as Tom stated, defining your test cases is a good idea before you start benchmarking. Our application has a load-data phase, then a query/active-use phase, so we benchmark both (data loads, and then transactions), since they're quite different workloads and there are different ways to optimize for each.

For bulk loads, I would look into either batching several inserts into one transaction or the COPY command. Do some testing here to figure out what works best for your hardware/setup (for example, we usually batch several thousand inserts together for a pretty dramatic increase in performance). There's usually a sweet spot in there depending on how your WAL is configured and other concurrent activity.

Also, when testing bulk loads, be careful to set up a realistic test. If your application requires foreign keys and indexes, these can significantly slow down bulk inserts. There are several optimizations; check the mailing lists and the manual.

And lastly, when you're loading tons of data, as previously pointed out, the normal state of the system is to be heavily utilized (in fact, I would think this is ideal, since you know you're making full use of your hardware).

HTH,
Bucky

-----Original Message-----
From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Mark Lewis
Sent: Thursday, August 24, 2006 9:40 AM
To: Fredrik Israelsson
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Is this way of testing a bad idea?
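The batching suggestion above looks roughly like this with JDBC: turn off autocommit, accumulate rows with addBatch(), and flush every few thousand rows. The batch size, table name, and column are assumptions for illustration; as noted, the sweet spot depends on WAL configuration and should be measured:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;

public class BatchLoad {
    static void load(Connection conn, byte[][] rows) throws Exception {
        conn.setAutoCommit(false); // one transaction per batch, not per row
        PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO test_table (payload) VALUES (?)");
        try {
            int batchSize = 5000; // tune against your WAL settings
            for (int i = 0; i < rows.length; i++) {
                ps.setBytes(1, rows[i]);
                ps.addBatch();
                if ((i + 1) % batchSize == 0) {
                    ps.executeBatch();
                    conn.commit();
                }
            }
            ps.executeBatch(); // flush the final partial batch
            conn.commit();
        } finally {
            ps.close();
        }
    }
}
```

For even faster loads, PostgreSQL's COPY command bypasses per-row INSERT overhead entirely, though plain JDBC of this era had no portable way to drive it, so batched inserts are the simplest first step.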