Re: Postgresql in a Virtual Machine - Mailing list pgsql-performance

From Xenofon Papadopoulos
Subject Re: Postgresql in a Virtual Machine
Msg-id CANL7jAQwPwKeH1zJAO24225-w+o_961=TzjrRNwx+WqP7VQqZA@mail.gmail.com
In response to Re: Postgresql in a Virtual Machine  (Merlin Moncure <mmoncure@gmail.com>)
Responses Re: Postgresql in a Virtual Machine
List pgsql-performance
We have been running several Postgres databases on VMs for the last 9 months. The largest one currently has a few hundred million rows (~1.5T of data, ~100G of frequently queried data) and performs at ~1000 tps. Most of our transactions are part of a 2PC, which effectively results in high I/O, as asynchronous commit is disabled.
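
To illustrate why 2PC with synchronous commit is I/O-heavy, here is a rough back-of-the-envelope sketch. It assumes that each two-phase transaction needs two synchronous WAL flushes (one at PREPARE TRANSACTION, one at COMMIT PREPARED), that an ordinary commit needs one, and that no flushes are shared through group commit, so it is an upper bound rather than a measurement. The function name and the `two_phase_fraction` parameter are hypothetical; the actual share of 2PC transactions is not stated above.

```python
def estimated_wal_flushes_per_second(tps, two_phase_fraction):
    """Upper-bound estimate of synchronous WAL flushes per second.

    Assumptions (not measurements): a two-phase transaction costs two
    flushes, a plain commit costs one, and group commit is ignored.
    """
    if not 0.0 <= two_phase_fraction <= 1.0:
        raise ValueError("two_phase_fraction must be between 0 and 1")
    two_phase = tps * two_phase_fraction   # transactions using 2PC
    single_phase = tps - two_phase         # ordinary commits
    return 2 * two_phase + single_phase


# Example: 1000 tps with a hypothetical 80% of transactions using 2PC.
print(estimated_wal_flushes_per_second(1000, 0.8))
```

In practice group commit can let concurrent transactions share a flush, so the real number may be noticeably lower.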

Main benefits so far:

- ESXi HA makes high availability completely transparent and reduces the number of failover servers (we're running N+1 clusters)

- Our projects' load often misses our expectations, and it changes over time. Scaling up/down has helped us cope.

- Live relocation of databases helps with hardware upgrades and spreading of load.

Main issues:

- We are not overprovisioning at all (using virtualization exclusively for the management benefits), so we don't know its impact on performance.

- I/O has often been a bottleneck. We are not certain whether this is due to the impact of virtualization or due to mistakes in our sizing and configuration. So far we have been coping by spreading the load across more spindles and by increasing the memory.





On Tue, Nov 26, 2013 at 1:26 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
On Mon, Nov 25, 2013 at 4:57 PM, David Lang <david@lang.hm> wrote:
> On Mon, 25 Nov 2013, Merlin Moncure wrote:
>
>> On Mon, Nov 25, 2013 at 2:01 PM, Lee Nguyen <leemobile@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> Having attended a few PGCons, I've always heard the remark from a few
>>> presenters and attendees that Postgres shouldn't be run inside a VM. That
>>> bare metal is the only way to go.
>>>
>>> Here at work we were entertaining the idea of running our Postgres
>>> database
>>> on our VM farm alongside our application vm's.  We are planning to run a
>>> few
>>> Postgres synchronous replication nodes.
>>>
>>> Why shouldn't we run Postgres in a VM?  What are the downsides? Does
>>> anyone
>>> have any metrics or benchmarks with the latest Postgres?
>>
>>
>> Unfortunately (and it really pains me to say this) we live in an
>> increasingly virtualized world and we just have to go ahead and deal
>> with it.  I work at a mid cap company and we have a zero tolerance
>> policy in terms of applications targeting hardware: in short, you
>> can't.  VMs have downsides: you get less performance per buck and have
>> another thing to fail but the administration advantages are compelling
>> especially for large environments.  Furthermore, for any size company
>> it makes less sense to run your own data center with each passing day;
>> the cloud providers are really bringing up their game. This is
>> economic specialization at work.
>
>
> being pedantic, you can get almost all the management benefits on bare
> metal, and you can rent bare metal from hosting providers; cloud VMs are not
> the only option. 'Cloud' makes sense if you have a very predictably spiky
> load and you can add/remove machines to meet that load, but if you end up
> needing to have the machines running a significant percentage of the time,
> dedicated boxes are cheaper (as well as faster)

Well, that depends on how you define 'most'.  The thing for me is
that for machines around the office (just like with people) about 10%
of them do 90% of the work.  Being able to slide them around based on
that (sometimes changing) need is a tremendous time and cost saver.
For application and infrastructure development dealing with hardware
is just a distraction.   I'd rather click on some interface and say,
'this application needs 25k iops guaranteed' and then make a cost
driven decision on software optimization.  It's hard to let go after
decades of hardware innovation (the SSD revolution was the final shoe
to drop) but for me the time has finally come.  As recently as a year
ago I was arguing databases needed to be run against metal.

merlin


