Thread: Fun little performance IMPROVEMENT...
I was doing a little testing to see how machine load affected the performance of different types of queries, index range scans, hash joins, full scans, a mix, etc.

In order to do this, I isolated different performance hits, spinning only CPU, loading the disk to create high I/O wait states, and using most of the physical memory. This was on a 4 CPU Xen virtual machine running 8.1.22 on CENTOS.

Here is the fun part. When running 8 threads spinning calculating square roots (using the stress package), the full scan returned consistently 60% faster than the machine with no load. It was returning 44,000 out of 5,000,000 rows. Here is the explain analyze. I am hoping that this triggers something (I can run more tests as needed) that can help us make it always better.

Idling:
                                  QUERY PLAN
------------------------------------------------------------------------------
 Seq Scan on schedule_details  (cost=0.00..219437.90 rows=81386 width=187) (actual time=0.053..2915.966 rows=44320 loops=1)
   Filter: (schedule_type = '5X'::bpchar)
 Total runtime: 2986.764 ms

Loaded:
                                  QUERY PLAN
------------------------------------------------------------------------------
 Seq Scan on schedule_details  (cost=0.00..219437.90 rows=81386 width=187) (actual time=0.034..1698.068 rows=44320 loops=1)
   Filter: (schedule_type = '5X'::bpchar)
 Total runtime: 1733.084 ms
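For anyone who wants to put a number on the difference between the two plans, here is a minimal Python sketch. It only pulls the "Total runtime" line out of EXPLAIN ANALYZE text and compares the two figures; the function names (total_runtime_ms, compare) are made up for illustration, and the plan text is copied from the post above.

```python
import re

# Matches the "Total runtime: <number> ms" line that EXPLAIN ANALYZE
# prints in the plans quoted in this thread.
RUNTIME_RE = re.compile(r"Total runtime:\s*([0-9.]+)\s*ms")


def total_runtime_ms(plan_text):
    """Return the total runtime in milliseconds found in EXPLAIN ANALYZE text."""
    match = RUNTIME_RE.search(plan_text)
    if match is None:
        raise ValueError("no 'Total runtime' line found")
    return float(match.group(1))


def compare(idle_plan, loaded_plan):
    """Compare two plans; the names 'idle' and 'loaded' follow the post."""
    idle = total_runtime_ms(idle_plan)
    loaded = total_runtime_ms(loaded_plan)
    return {
        "idle_ms": idle,
        "loaded_ms": loaded,
        # Share of the idle runtime that disappears under load.
        "time_saved_pct": 100.0 * (idle - loaded) / idle,
        # How many times faster the loaded run is.
        "speedup_ratio": idle / loaded,
    }


# Plan text as posted (hypothetical variable names).
IDLE_PLAN = """
Seq Scan on schedule_details (cost=0.00..219437.90 rows=81386 width=187)
  (actual time=0.053..2915.966 rows=44320 loops=1)
  Filter: (schedule_type = '5X'::bpchar)
Total runtime: 2986.764 ms
"""

LOADED_PLAN = """
Seq Scan on schedule_details (cost=0.00..219437.90 rows=81386 width=187)
  (actual time=0.034..1698.068 rows=44320 loops=1)
  Filter: (schedule_type = '5X'::bpchar)
Total runtime: 1733.084 ms
"""

for key, value in compare(IDLE_PLAN, LOADED_PLAN).items():
    print(f"{key}: {value:.3f}")
```

For this single pair of runs the loaded case takes roughly 42% less wall-clock time, which is about 1.7 times as fast. One pair says nothing about variance, so the same comparison would need to be repeated over several runs before reading much into it.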
On 1/21/2011 12:12 PM, grant@amadensor.com wrote:
> I was doing a little testing to see how machine load affected the
> performance of different types of queries, index range scans, hash joins,
> full scans, a mix, etc.
>
> In order to do this, I isolated different performance hits, spinning only
> CPU, loading the disk to create high I/O wait states, and using most of
> the physical memory. This was on a 4 CPU Xen virtual machine running
> 8.1.22 on CENTOS.
>
> Here is the fun part. When running 8 threads spinning calculating square
> roots (using the stress package), the full scan returned consistently 60%
> faster than the machine with no load. It was returning 44,000 out of
> 5,000,000 rows. Here is the explain analyze. I am hoping that this
> triggers something (I can run more tests as needed) that can help us make
> it always better.
>
> Idling:
> QUERY PLAN
> ------------------------------------------------------------------------------
> Seq Scan on schedule_details (cost=0.00..219437.90 rows=81386 width=187)
> (actual time=0.053..2915.966 rows=44320 loops=1)
> Filter: (schedule_type = '5X'::bpchar)
> Total runtime: 2986.764 ms
>
> Loaded:
> QUERY PLAN
> ------------------------------------------------------------------------------
> Seq Scan on schedule_details (cost=0.00..219437.90 rows=81386 width=187)
> (actual time=0.034..1698.068 rows=44320 loops=1)
> Filter: (schedule_type = '5X'::bpchar)
> Total runtime: 1733.084 ms

Odd. Did'ja by chance run the select more than once... maybe three or four times, and always get the same (or close) results?

Is the stress package running niced?

-Andy
grant@amadensor.com writes:
> Here is the fun part. When running 8 threads spinning calculating square
> roots (using the stress package), the full scan returned consistently 60%
> faster than the machine with no load.

Possibly the synchronized-seqscans logic kicking in, resulting in this guy not having to do all his own I/Os. It would be difficult to make any trustworthy conclusions about performance in such cases from a view of only one process's results --- you'd need to look at the aggregate behavior to understand what's happening.

regards, tom lane
grant@amadensor.com wrote:
> This was on a 4 CPU Xen virtual machine running
> 8.1.22 on CENTOS.

You're not going to get anyone to spend a minute trying to figure out what's happening on virtual hardware with an ancient version of PostgreSQL. If this was an actual full test case against PostgreSQL 8.4 or later on a physical machine, it might be possible to draw some conclusions about it that impact current PostgreSQL development. Note where 8.1 is on http://wiki.postgresql.org/wiki/PostgreSQL_Release_Support_Policy for example.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
> Odd. Did'ja by chance run the select more than once... maybe three or
> four times, and always get the same (or close) results?
>
> Is the stress package running niced?

The stress package is not running niced. I ran it initially 5 times each. It was very consistent. Initially, I just ran everything to files. Later, when I looked over it, I was confused, so I tried it again, several times on each, with very little deviation, and the process with the CPU stressed was always faster.

The only deviation, which is understandable, was that the first run of anything after memory stress (using 7G of the available 8G) was slow as it swapped back in, so I did a swapoff/swapon to clear up swap, and still got the same results.
> grant@amadensor.com writes:
>> Here is the fun part. When running 8 threads spinning calculating square
>> roots (using the stress package), the full scan returned consistently 60%
>> faster than the machine with no load.
>
> Possibly the synchronized-seqscans logic kicking in, resulting in this
> guy not having to do all his own I/Os. It would be difficult to make
> any trustworthy conclusions about performance in such cases from a view
> of only one process's results --- you'd need to look at the aggregate
> behavior to understand what's happening.
>
> regards, tom lane

My thought was that either:

1) It was preventing some other I/O or memory intensive process from happening, opening the resources up.
2) It was keeping the machine busy from the hypervisor's point of view, preventing it from waiting for a slot on the host machine.
3) The square roots happen quickly, resulting in more yields, and therefore more time slices for my process than if the system was in its idle loop.

Any way you look at it, it is fun and interesting that a load can make something unrelated happen more quickly. I will continue to try to find out why it is the case.
On 1/21/11 12:23 PM, "grant@amadensor.com" <grant@amadensor.com> wrote:
>> grant@amadensor.com writes:
>>> Here is the fun part. When running 8 threads spinning calculating square
>>> roots (using the stress package), the full scan returned consistently 60%
>>> faster than the machine with no load.
>>
>> Possibly the synchronized-seqscans logic kicking in, resulting in this
>> guy not having to do all his own I/Os. It would be difficult to make
>> any trustworthy conclusions about performance in such cases from a view
>> of only one process's results --- you'd need to look at the aggregate
>> behavior to understand what's happening.
>>
>> regards, tom lane
>
>My thought was that either:
>
>1) It was preventing some other I/O or memory intensive process from
>happening, opening the resources up.
>2) It was keeping the machine busy from the hypervisor's point of view,
>preventing it from waiting for a slot on the host machine.

My guess is it's something hypervisor related. If this happened on direct hardware I'd be more surprised. Hypervisors have all sorts of stuff going on, like throttling the number of CPU cycles a VM gets. In your idle case, your VM might effectively occupy 1 GHz of a CPU, but 2 GHz in the loaded case.

>3) The square roots happen quickly, resulting in more yields, and
>therefore more time slices for my process than if the system was in its
>idle loop.
>
>Any way you look at it, it is fun and interesting that a load can make
>something unrelated happen more quickly. I will continue to try to find
>out why it is the case.
> Odd. Did'ja by chance run the select more than once... maybe three or
> four times, and always get the same (or close) results?
>
> Is the stress package running niced?
>
> -Andy

I got a little crazy, and upgraded the DB to 8.4.5. It still reacts the same. I am hoping someone has an idea of a metric I can run to see why it is different.
> My guess is it's something hypervisor related. If this happened on direct
> hardware I'd be more surprised. Hypervisors have all sorts of stuff going
> on, like throttling the number of CPU cycles a VM gets. In your idle
> case, your VM might effectively occupy 1 GHz of a CPU, but 2 GHz in the
> loaded case.

I will be building a new machine this weekend on bare hardware. It won't be very big on specs, but this is only 5 million rows, so it should be fine. I will try it there.
On 21/01/2011 19:12, grant@amadensor.com wrote:
> I was doing a little testing to see how machine load affected the
> performance of different types of queries, index range scans, hash joins,
> full scans, a mix, etc.
>
> In order to do this, I isolated different performance hits, spinning only
> CPU, loading the disk to create high I/O wait states, and using most of
> the physical memory. This was on a 4 CPU Xen virtual machine running
> 8.1.22 on CENTOS.
>
> Here is the fun part. When running 8 threads spinning calculating square
> roots (using the stress package), the full scan returned consistently 60%
> faster than the machine with no load. It was returning 44,000 out of
> 5,000,000 rows. Here is the explain analyze. I am hoping that this
> triggers something (I can run more tests as needed) that can help us make
> it always better.

Looks like a virtualization artifact. Here's a list of some such noticed artifacts:

http://wiki.freebsd.org/WhyNotBenchmarkUnderVMWare

> Idling:
> QUERY PLAN
> ------------------------------------------------------------------------------
> Seq Scan on schedule_details (cost=0.00..219437.90 rows=81386 width=187)
> (actual time=0.053..2915.966 rows=44320 loops=1)
> Filter: (schedule_type = '5X'::bpchar)
> Total runtime: 2986.764 ms
>
> Loaded:
> QUERY PLAN
> ------------------------------------------------------------------------------
> Seq Scan on schedule_details (cost=0.00..219437.90 rows=81386 width=187)
> (actual time=0.034..1698.068 rows=44320 loops=1)
> Filter: (schedule_type = '5X'::bpchar)
> Total runtime: 1733.084 ms

In this case it looks like the IO generated by the VM is causing the Hypervisor to frequently "sleep" the machine while waiting for the IO, but if the machine is also generating CPU load, it is not put to sleep as often.
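As for a metric that might help test the hypervisor theory: on a Linux guest, the aggregate "cpu" line of /proc/stat breaks CPU time down into user, nice, system, idle, iowait, irq, softirq and steal, where steal is time the hypervisor spent running something else. Below is a rough Python sketch that samples that line twice and reports each share as a percentage. The function names are invented for illustration, and whether a given Xen guest kernel actually reports steal time is an assumption that would need checking on the machine in question.

```python
import time

# Column order of the aggregate "cpu" line in /proc/stat on Linux.
FIELDS = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a dict of tick counts."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    # Older kernels report fewer columns; treat the missing ones as zero.
    values += [0] * (len(FIELDS) - len(values))
    return dict(zip(FIELDS, values))


def usage_between(before, after):
    """Return each field's share (percent) of the ticks elapsed between samples."""
    delta = {name: after[name] - before[name] for name in FIELDS}
    total = sum(delta.values())
    if total <= 0:
        raise ValueError("samples are identical or out of order")
    return {name: 100.0 * ticks / total for name, ticks in delta.items()}


def sample_cpu_percentages(interval_s=5.0, path="/proc/stat"):
    """Take two samples interval_s seconds apart (Linux only)."""
    with open(path) as f:
        before = parse_cpu_line(f.readline())
    time.sleep(interval_s)
    with open(path) as f:
        after = parse_cpu_line(f.readline())
    return usage_between(before, after)
```

Calling sample_cpu_percentages() while the query runs, once on the idle guest and once with the stress load active, would show whether the idle, iowait and steal shares differ between the two cases. That only hints at what the hypervisor is doing; measurements taken on the host side would be more trustworthy.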