Thread: "Mysterious" issues with newly installed 8.3
Ok, I know that such an open and vague question like this one is... well, open and vague... But still. The short story: Just finished an 8.3.4 installation on a new machine, to replace an existing one; the new machine is superior (i.e., higher performance) in virtually every way --- twice as much memory, faster processor, faster drives, etc. I made an exact copy of the existing database on the new machine, and the exact same queries run on both reveal that the old machine beats the new one by a factor of close to 2 !!!! (i.e., the same queries run close to twice as fast on the old machine!!!) To make things worse: the old machine is in operation, under normal workload (and right now the system may be around peak time), and the new machine is there sitting doing nothing; just one user logged in using psql to run the queries --- *no-one and nothing* is connecting to the new server. So... What's going on??? The details: CPU: New: Opteron DC 1218HE (1MB cache per core) @2.6GHz Old: Athlon64 X2 (512K cache per core) @2.2GHz RAM: New: 4GB Old: 2GB HD: Doesn't matter the capacity, but I have every reason to believe the new one is faster --- hdparm reports 105MB/sec transfer rate; the measurement for the old server is meaningless, since it is in operation (i.e., there is actual database activity), so it measures between 50MB/sec and 70MB/sec. Given its age, I would estimate 70 to 80 MB/sec OS: New: CentOS 5.2 (gcc 4.1.2) Old: FC6 (gcc 4.1.2) PG: New: 8.3.4 installed from source Old: 8.2.4 installed from source Presumably relevant configuration parameters --- shared_buffers was set to 250MB on the old one; I set it to 500MB on the new one (kinda makes sense, no? 1/8 of the physical memory in both cases). I set max_fsm_pages a little bit higher on the new one (409600 instead of 307200 on the old one). The rest is pretty much identical (except for the autovacuum --- I left the defaults in the new one) The old machine is vacuum-analyzed once a day (around 4AM); on the new one, I ran a vacuumdb -z -f after populating it. Some interesting outputs: explain analyze select count(*) from users; New: Aggregate (cost=8507.11..8507.12 rows=1 width=0) (actual time=867.582..867.584 rows=1 loops=1) -> Seq Scan on users (cost=0.00..7964.49 rows=217049 width=0) (actual time=0.016..450.560 rows=217049 loops=1) Total runtime: 867.744 ms Old: Aggregate (cost=17171.22..17171.22 rows=1 width=0) (actual time=559.475..559.476 rows=1 loops=1) -> Seq Scan on users (cost=0.00..16628.57 rows=217057 width=0) (actual time=0.009..303.026 rows=217107 loops=1) Total runtime: 559.536 ms Running the same command again several times practically does not change anything. explain analyze select count(*) from users where username like 'A%'; New: Aggregate (cost=6361.28..6361.29 rows=1 width=0) (actual time=87.528..87.530 rows=1 loops=1) -> Bitmap Heap Scan on users (cost=351.63..6325.33 rows=14376 width=0) (actual time=6.444..53.426 rows=17739 loops=1) Filter: ((username)::text ~~ 'a%'::text) -> Bitmap Index Scan on c_username_unique (cost=0.00..348.04 rows=14376 width=0) (actual time=5.383..5.383 rows=17739 loops=1) Index Cond: (((username)::text >= 'a'::text) AND ((username)::text < 'b'::text)) Total runtime: 87.638 ms Old: Aggregate (cost=13188.91..13188.92 rows=1 width=0) (actual time=61.743..61.745 rows=1 loops=1) -> Bitmap Heap Scan on users (cost=392.07..13157.75 rows=12466 width=0) (actual time=7.433..40.847 rows=17747 loops=1) Filter: ((username)::text ~~ 'a%'::text) -> Bitmap Index Scan on c_username_unique (cost=0.00..388.96 rows=12466 width=0) (actual time=5.652..5.652 rows=17861 loops=1) Index Cond: (((username)::text >= 'a'::character varying) AND ((username)::text < 'b'::character varying)) Total runtime: 61.824 ms Any ideas?
On Thu, Oct 9, 2008 at 4:51 PM, Carlos Moreno <morenopg@mochima.com> wrote:
First, use iostat or another tool to view the disk usage on the new machine during these queries and validate that it is not using the disk at all. This is most likely the case.
Then, to be sure, set its config parameters to be equal to the old one, and turn off auto-vacuum. This will also most likely have no effect.
Once this is confirmed, we can be pretty sure that the issue is restricted to:
CPU / RAM / Motherboard on the hardware side. There may still be some software effects in the OS or drivers, or PostgreSQL to account for, but lets drill into the hardware and try and eliminate that first.
Sure, the processor should be faster, but Athlon64s / Opterons are very sensitive to the RAM used and its performance and tuning.
So, you should find some basic CPU benchmarks and RAM benchmarks -- you'll want to measure latency as well as bandwidth.
Athlon64 and Opteron both typically have two memory busses per processor, and it is possible to populate the memory banks in such a way that the system has half the bandwidth.
In any event, you'll first want to identify if simple benchmark software is able to prove a disparity between the systems independant of postgres. This may be a bit difficult to do on the live system however.
But it is my suspicion that Postgres performance is often more dependant on the memory subsystem performance than the CPU Mhz (as are most databases) and poor components, configuration, or tuning on that side would show up in queries like the examples here.
Ok, I know that such an open and vague question like this one
is... well, open and vague... But still.
The short story:
Just finished an 8.3.4 installation on a new machine, to replace
an existing one; the new machine is superior (i.e., higher
performance) in virtually every way --- twice as much memory,
faster processor, faster drives, etc.
I made an exact copy of the existing database on the new
machine, and the exact same queries run on both reveal that
the old machine beats the new one by a factor of close to 2 !!!!
(i.e., the same queries run close to twice as fast on the old
machine!!!)
To make things worse: the old machine is in operation, under
normal workload (and right now the system may be around
peak time), and the new machine is there sitting doing nothing;
just one user logged in using psql to run the queries --- *no-one
and nothing* is connecting to the new server.
So... What's going on???
The details:
CPU:
New: Opteron DC 1218HE (1MB cache per core) @2.6GHz
Old: Athlon64 X2 (512K cache per core) @2.2GHz
RAM:
New: 4GB
Old: 2GB
HD:
Doesn't matter the capacity, but I have every reason to believe
the new one is faster --- hdparm reports 105MB/sec transfer
rate; the measurement for the old server is meaningless, since
it is in operation (i.e., there is actual database activity), so it
measures between 50MB/sec and 70MB/sec. Given its age, I
would estimate 70 to 80 MB/sec
OS:
New: CentOS 5.2 (gcc 4.1.2)
Old: FC6 (gcc 4.1.2)
PG:
New: 8.3.4 installed from source
Old: 8.2.4 installed from source
Presumably relevant configuration parameters --- shared_buffers
was set to 250MB on the old one; I set it to 500MB on the new
one (kinda makes sense, no? 1/8 of the physical memory in both
cases).
I set max_fsm_pages a little bit higher on the new one (409600
instead of 307200 on the old one). The rest is pretty much
identical (except for the autovacuum --- I left the defaults in the
new one)
The old machine is vacuum-analyzed once a day (around 4AM);
on the new one, I ran a vacuumdb -z -f after populating it.
Some interesting outputs:
explain analyze select count(*) from users;
New:
Aggregate (cost=8507.11..8507.12 rows=1 width=0) (actual
time=867.582..867.584 rows=1 loops=1)
-> Seq Scan on users (cost=0.00..7964.49 rows=217049 width=0)
(actual time=0.016..450.560 rows=217049 loops=1)
Total runtime: 867.744 ms
Old:
Aggregate (cost=17171.22..17171.22 rows=1 width=0) (actual
time=559.475..559.476 rows=1 loops=1)
-> Seq Scan on users (cost=0.00..16628.57 rows=217057 width=0)
(actual time=0.009..303.026 rows=217107 loops=1)
Total runtime: 559.536 ms
Running the same command again several times practically
does not change anything.
explain analyze select count(*) from users where username like 'A%';
New:
Aggregate (cost=6361.28..6361.29 rows=1 width=0) (actual
time=87.528..87.530 rows=1 loops=1)
-> Bitmap Heap Scan on users (cost=351.63..6325.33 rows=14376
width=0) (actual time=6.444..53.426 rows=17739 loops=1)
Filter: ((username)::text ~~ 'a%'::text)
-> Bitmap Index Scan on c_username_unique (cost=0.00..348.04
rows=14376 width=0) (actual time=5.383..5.383 rows=17739 loops=1)
Index Cond: (((username)::text >= 'a'::text) AND
((username)::text < 'b'::text))
Total runtime: 87.638 ms
Old:
Aggregate (cost=13188.91..13188.92 rows=1 width=0) (actual
time=61.743..61.745 rows=1 loops=1)
-> Bitmap Heap Scan on users (cost=392.07..13157.75 rows=12466
width=0) (actual time=7.433..40.847 rows=17747 loops=1)
Filter: ((username)::text ~~ 'a%'::text)
-> Bitmap Index Scan on c_username_unique (cost=0.00..388.96
rows=12466 width=0) (actual time=5.652..5.652 rows=17861 loops=1)
Index Cond: (((username)::text >= 'a'::character varying)
AND ((username)::text < 'b'::character varying))
Total runtime: 61.824 ms
Any ideas?
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
First, use iostat or another tool to view the disk usage on the new machine during these queries and validate that it is not using the disk at all. This is most likely the case.
Then, to be sure, set its config parameters to be equal to the old one, and turn off auto-vacuum. This will also most likely have no effect.
Once this is confirmed, we can be pretty sure that the issue is restricted to:
CPU / RAM / Motherboard on the hardware side. There may still be some software effects in the OS or drivers, or PostgreSQL to account for, but lets drill into the hardware and try and eliminate that first.
Sure, the processor should be faster, but Athlon64s / Opterons are very sensitive to the RAM used and its performance and tuning.
So, you should find some basic CPU benchmarks and RAM benchmarks -- you'll want to measure latency as well as bandwidth.
Athlon64 and Opteron both typically have two memory busses per processor, and it is possible to populate the memory banks in such a way that the system has half the bandwidth.
In any event, you'll first want to identify if simple benchmark software is able to prove a disparity between the systems independant of postgres. This may be a bit difficult to do on the live system however.
But it is my suspicion that Postgres performance is often more dependant on the memory subsystem performance than the CPU Mhz (as are most databases) and poor components, configuration, or tuning on that side would show up in queries like the examples here.
Scott Carey wrote: > On Thu, Oct 9, 2008 at 4:51 PM, Carlos Moreno <morenopg@mochima.com > <mailto:morenopg@mochima.com>> wrote: > > > Ok, I know that such an open and vague question like this one > is... well, open and vague... But still. > > The short story: > > Just finished an 8.3.4 installation on a new machine, to replace > an existing one; the new machine is superior (i.e., higher > performance) in virtually every way --- twice as much memory, > faster processor, faster drives, etc. > > I made an exact copy of the existing database on the new > machine, and the exact same queries run on both reveal that > the old machine beats the new one by a factor of close to 2 !!!! > (i.e., the same queries run close to twice as fast on the old > machine!!!) > > To make things worse: the old machine is in operation, under > normal workload (and right now the system may be around > peak time), and the new machine is there sitting doing nothing; > just one user logged in using psql to run the queries --- *no-one > and nothing* is connecting to the new server. > > So... What's going on??? Did you do an ANALYZE on the new database after you cloned it? I was suprised by this too, that after doing a pg_dump/pg_restore,the performance sucked. But it was simply because the new database had no statistics yet. Craig
The first thing I'd try is installing 8.2 on the new server to see if the problem is the server or postgresql. Set up the new server and new pgsql install the same and see how it runs.
On Thu, 9 Oct 2008, Craig James wrote: > Did you do an ANALYZE on the new database after you cloned it? He ran "vacuumdb -f -z", the -z does an analyze. Also, he's getting nearly identical explain plans out of the two systems, which suggests the stats are similar enough in the two cases. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
On Thu, 9 Oct 2008, Scott Carey wrote: > Sure, the processor should be faster, but Athlon64s / Opterons are very > sensitive to the RAM used and its performance and tuning. So, you should > find some basic CPU benchmarks and RAM benchmarks -- you'll want to > measure latency as well as bandwidth. Athlon64 and Opteron both > typically have two memory busses per processor, and it is possible to > populate the memory banks in such a way that the system has half the > bandwidth. This is really something to watch out for. One quick thing first though: what frequency does the CPU on the new server show when you look at /proc/cpuinfo? If you see "cpu MHz: 1000.00" you probably are throttling the CPU down hard with power management which was cause the slowness you describe. In that case I'd suggest editing /etc/sysconfig/cpuspeed and changing "GOVERNER=performance". Back to memory. What I do with any new, untrusted system is boot with a memtest86+ CD is take a look at the memory speed information it shows, with the most important number being the uncached RAM speed. If you're not running in dual-channel mode and at the maximum frequency the RAM supports, that can run seriously slow things down. You probably can't take down the production server for comparison. I can tell you that on my little Athlon X2@2.4GHz server, I've seen the memtest86+ reported raw memory speed run anywhere from 2093MB/s (with crummy DDR2 667 that doesn't match the CPU bus frequency very well) to 3367MB/s (using good DDR2 800). You should see even better from your Opteron system. Another really handy way to gauge memory speed on Linux, if there are similar kernels installed on each system like your case, is to use "hdparm -T". That cached read figure is highly correlated with overall memory performance. The nice part about that is you can probably get a useful comparison result from the old server if you run that a bunch of times even with other activity (just take the highest number you ever see), whereas memtest86+ requires some downtime. Those numbers are lower than the I'd expect around 2500MB/s out of your new server here (that's what I got when I just tested an Opteron 2220 system @2.8GHz using the RHEL5 hdparm -T). -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
Thanks Greg and others for your replies, > This is really something to watch out for. One quick thing first > though: what frequency does the CPU on the new server show when you > look at /proc/cpuinfo? If you see "cpu MHz: 1000.00" It was like that in the initial setup --- I just disabled the cpuspeed service (service cpuspeed stop; chkconfig cpuspeed off ), and now it shows the full 2600MHz at all times (the installation of PG was done after this change) > Another really handy way to gauge memory speed on Linux, if there are > similar kernels installed on each system like your case, is to use > "hdparm -T". Great tip! I was familiar with the -T switch, but was not clear on the notion that the figure tells you that much about the overall memory performance! Anyway, I checked on both, and the new system is slightly superior (around 2200 for the new, around 1900 for the old one) --- a bit below the figure you mention you'd expect (2500 --- though that was for a 2.8GHz Opteron, presumably with faster FSB and faster memory??) I guess my logical next step is what was suggested by Scott --- install 8.2.4 and repeat the same tests with this one; that should give me interesting information. Anyway, if I find something interesting or puzzling, I would post again with the results of those tests. Thanks again for the valuable advice and comments! Carlos --
On Mon, Oct 13, 2008 at 8:55 AM, Carlos Moreno <morenopg@mochima.com> wrote: > I guess my logical next step is what was suggested by Scott --- install > 8.2.4 and repeat the same tests with this one; that should give me > interesting information. I'd suggest updating to the latest 8.2.x update as well. Not for performance tuning reasons but to make sure you're data's not at risk etc... I think there's a good year or more of updates missing from the 8.2.4 branch.
Scott Marlowe wrote: > On Mon, Oct 13, 2008 at 8:55 AM, Carlos Moreno <morenopg@mochima.com> wrote: > >> I guess my logical next step is what was suggested by Scott --- install >> 8.2.4 and repeat the same tests with this one; that should give me >> interesting information. >> > > I'd suggest updating to the latest 8.2.x update as well. Not for > performance tuning reasons but to make sure you're data's not at risk > etc... I think there's a good year or more of updates missing from the > 8.2.4 branch. > Of course --- but do keep in mind that the reason for this was to do a meanigful comparison; SQLs being run and clicked on an 8.2.4 system vs. the same SQLs being run on a different hardware with the exact same software. If for some reason I conclude that the 8.2 seems to offer better performance with the given hardware, then of course I would go with the latest 8.2.x ... If you're referring to the existing installation, well, yeah, I've been meaning to upgrade it, but I guess now that we are going with a hardware upgrade as well, then the software upgrade will be a side-effect of the maneuver. Thanks, Carlos --
On Mon, 13 Oct 2008, Carlos Moreno wrote: >> Another really handy way to gauge memory speed on Linux, if there are >> similar kernels installed on each system like your case, is to use >> "hdparm -T". > > Great tip! I was familiar with the -T switch, but was not clear on the > notion that the figure tells you that much about the overall memory > performance! I wouldn't go so far as to say it tells you *much* about it, but it does give a fairly useful comparison figure if the kernels are basically the same and can help spot gross errors. I used it a bunch when I was tinkering with DDR speeds and such earlier this year, it correlated fairly well with other memory bandwidth measurements within the same processor family (so far my tests suggestion results are much higher per clock on Intel CPUs). Certainly of no use for comparison if one system has a 32-bit kernel and the other 64, and results will vary depending on general kernel configuration (I get very different results from seemingly similar RedHat and Ubuntu kernels on the same system for example). It sounds like your CPU and memory setup are all fine, which leaves your mystery open. Please let us know if you find anything interesting out, normally an 8.3 upgrade would run faster so your situation is a bit curious. The only other benchmark I'd suggest just as a sanity check is bonnie++, that's a bit more thorough than what hdparm -t reports. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD