Thread: Having I/O problems in simple virtualized environment

Having I/O problems in simple virtualized environment

From: Ron Arts

Hi list,

I am running PostgreSQL 8.1 (CentOS 5.7) on a VM on a single XCP (Xenserver) host.
This is an HP server with 8GB of RAM, dual quad-core CPUs, and 2 SATA disks in RAID-1.

The problem is: it's running very slowly compared to bare metal, and
the VM is starving for I/O bandwidth, so other processes slow to a crawl.
This does not happen on bare metal.

I had to replace the server with a bare-metal one, as I could not troubleshoot in production.
It was also hard to emulate that VM's workload in a test environment, so I
concentrated on PostgreSQL and why it apparently generated so much I/O.

Before I start I should confess having only spotty experience with Xen and PostgreSQL
performance testing.

I set up a test Xen server, created a CentOS 5.7 VM with out-of-the-box PostgreSQL, and ran:
pgbench -i  pgbench ; time pgbench -t 100000 pgbench
This ran for 3:28. Then I replaced the SATA HD with an SSD disk, and reran the test.
It ran for 2:46. This seemed strange as I expected the run to finish much faster.
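
For reference, the two timings can be converted into throughput and time per transaction. This is a minimal sketch, assuming the 3:28 and 2:46 figures are wall-clock minutes and seconds and that pgbench ran with its default single client; the rate helper is a hypothetical name, not part of pgbench.

```shell
#!/bin/sh
# Convert a transaction count and an elapsed time (M:SS) into
# transactions per second and milliseconds per transaction.
# awk does the floating-point arithmetic.
rate() {
    # $1 = number of transactions, $2 = elapsed time as M:SS
    echo "$2" | awk -F: -v n="$1" '{
        secs = $1 * 60 + $2
        printf "%d s total, %.0f tps, %.2f ms/transaction\n", secs, n / secs, secs * 1000 / n
    }'
}

rate 100000 3:28   # SATA run: 208 s total, 481 tps, 2.08 ms/transaction
rate 100000 2:46   # SSD run:  166 s total, 602 tps, 1.66 ms/transaction
```

Under these assumptions the SSD run is only about 1.25 times faster, and with a single client each transaction takes roughly 2 ms on either disk.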

I reran the first test on the SATA disk and looked at CPU and I/O use. CPU use was low
in both the VM (30%) and dom0 (10%). I/O use was low as well,
around 8MB/sec in the VM. (I couldn't use iotop in dom0 because of missing kernel support
in XCP 1.1.)

I reran the second test on the SSD and saw almost the same CPU and I/O load.

(I now probably need to run the same test on bare metal, but didn't get to that yet,
all this already ruined my weekend.)

Now that I have come this far, can anybody give me some pointers? Why doesn't pgbench saturate
either the CPU or the I/O? Why does using an SSD change the performance so little?

Thanks,
Ron





Re: Having I/O problems in simple virtualized environment

From: Claudio Freire

On Sun, Jan 29, 2012 at 7:48 PM, Ron Arts <ron.arts@gmail.com> wrote:
> Hi list,
>
> I am running PostgreSQL 8.1 (CentOS 5.7) on a VM on a single XCP (Xenserver) host.
> This is an HP server with 8GB of RAM, dual quad-core CPUs, and 2 SATA disks in RAID-1.
>
> The problem is: it's running very slowly compared to bare metal, and
> the VM is starving for I/O bandwidth, so other processes slow to a crawl.
> This does not happen on bare metal.

My experience with Xen and Postgres, which we use for testing upgrades
before doing them on production servers, never in production per se,
is that I/O is very costly in CPU cycles because of the necessary
communication between domU and dom0.

It is worthwhile to pin at least one core for exclusive use by
dom0, or at least to let only low-load VMs use that core. That frees up
cycles in dom0, which is the one handling all I/O.

You'll still have lousy I/O. But it will suck a little less.
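
The pinning suggested above could be sketched roughly as follows. This is a dry run that only prints commands, under several assumptions: the host has the xl toolstack (xm takes the same vcpu-pin arguments), physical CPU 0 is the core reserved for dom0, and the host has 8 cores as described in the thread. VM_NAME, guest_cpus and pin_plan are hypothetical names. XCP's xapi toolstack may need a different mechanism, such as the VCPUs-params:mask VM parameter, so check what the host actually provides.

```shell
#!/bin/sh
# Dry run: print the pinning commands instead of executing them.
# Assumption: physical CPU 0 is reserved for dom0.

# Build a comma-separated list of CPUs 1..N-1, leaving CPU 0 for dom0.
guest_cpus() {
    total=$1
    i=1
    list=""
    while [ "$i" -lt "$total" ]; do
        list="${list:+$list,}$i"
        i=$((i + 1))
    done
    echo "$list"
}

# Print the commands for one guest.
# $1 = guest name (placeholder), $2 = number of physical CPUs on the host
pin_plan() {
    cpus=$(guest_cpus "$2")
    echo "xl vcpu-pin Domain-0 0 0"      # dom0's vCPU 0 onto physical CPU 0
    echo "xl vcpu-pin $1 all $cpus"      # all guest vCPUs onto the other CPUs
}

pin_plan VM_NAME 8
```

Only dom0's first vCPU is pinned here; if dom0 has more vCPUs, they would need the same treatment. Running xl vcpu-list afterwards shows the resulting affinities.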

Re: Having I/O problems in simple virtualized environment

From: Jose Ildefonso Camargo Tolosa

On Sun, Jan 29, 2012 at 6:18 PM, Ron Arts <ron.arts@gmail.com> wrote:
> Now that I have come this far, can anybody give me some pointers? Why doesn't pgbench saturate
> either the CPU or the I/O? Why does using an SSD change the performance so little?

Ok, one point: Which IO scheduler are you using?  (on dom0 and on the VM).

Re: Having I/O problems in simple virtualized environment

From: Ron Arts

On 30-01-12 02:52, Jose Ildefonso Camargo Tolosa wrote:
> Ok, one point: Which IO scheduler are you using?  (on dom0 and on the VM).

Ok, first dom0:

For the SSD (sda):
# cat /sys/block/sda/queue/scheduler
[noop] anticipatory deadline cfq

For the SATA:
# cat /sys/block/sdb/queue/scheduler
noop anticipatory deadline [cfq]

Then in the VM:

# cat /sys/block/xvda/queue/scheduler
[noop] anticipatory deadline cfq

Ron

Re: Having I/O problems in simple virtualized environment

From: Jose Ildefonso Camargo Tolosa

On Mon, Jan 30, 2012 at 3:11 AM, Ron Arts <ron.arts@gmail.com> wrote:
> On 30-01-12 02:52, Jose Ildefonso Camargo Tolosa wrote:
>> Ok, one point: Which IO scheduler are you using?  (on dom0 and on the VM).
>
> Ok, first dom0:
>
> For the SSD (sda):
> # cat /sys/block/sda/queue/scheduler
> [noop] anticipatory deadline cfq

Use deadline.

>
> For the SATA:
> # cat /sys/block/sdb/queue/scheduler
> noop anticipatory deadline [cfq]

Use deadline too (this is especially true if sdb is a RAID array).

>
> Then in the VM:
>
> # cat /sys/block/xvda/queue/scheduler
> [noop] anticipatory deadline cfq

Should be ok for the VM.
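
Switching the dom0 devices to deadline can be done at runtime through the same sysfs file. The following is a minimal sketch, assuming the device names sda and sdb from the output above and a root shell in dom0; active_sched, set_deadline and the SYSFS_ROOT override are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Extract the active scheduler (the bracketed entry) from the contents of
# /sys/block/<dev>/queue/scheduler.
active_sched() {
    echo "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}

# Switch a block device to deadline unless it is already active.
# SYSFS_ROOT is a hypothetical override for trying this outside /sys.
set_deadline() {
    file="${SYSFS_ROOT:-/sys}/block/$1/queue/scheduler"
    [ -w "$file" ] || { echo "$1: cannot write $file" >&2; return 1; }
    if [ "$(active_sched "$(cat "$file")")" != "deadline" ]; then
        echo deadline > "$file"
    fi
}

active_sched "noop anticipatory deadline [cfq]"   # prints: cfq

# In dom0, as root:
# set_deadline sda
# set_deadline sdb
```

A change made this way does not survive a reboot. On kernels of this generation the default can usually be set with the elevator=deadline boot parameter, but verify that against the dom0 kernel in use.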