Re: large database - Mailing list pgsql-general

From Jan Kesten
Subject Re: large database
Msg-id 50C6EA6F.1050903@dafuer.de
In response to large database  ("Mihai Popa" <mihai@lattica.com>)
Responses Re: large database  (Johannes Lochmann <johannes.lochmann@gmail.com>)
List pgsql-general
Hi Mihai.

> We are now at the point where the csv files are all created and amount
> to some 300 GB of data.

> I would like to get some advice on the best deployment option.

First - and maybe best - advice: Do some testing on your own and plan
some time for this.

> First, the project has been started using MySQL. Is it worth switching
> to Postgres and if so, which version should I use?

If you switch to PostgreSQL, I would recommend using the latest stable
version. But your project is already running on MySQL - are there issues
you expect to solve by switching to another database system? If not:
why switch?

> Second, where should I deploy it? The cloud or a dedicated box?

Given 1 TB of storage, the x-large instance and 10000 provisioned IOPS,
a 100% utilized instance on Amazon would cost about 2000 USD a month.
That is not really ultra-cheap ;-) For the price of running it for two
months you can get a dedicated Dell server with eight drives, buy two
extra SSDs and have full control. But things get much cheaper if you do
not really need IOPS at such a high rate.
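
To make the comparison concrete, here is a rough break-even sketch. It
assumes the 2000 USD estimate is a monthly cost and that the dedicated
server costs about two months of that; the function name and the
optional hosting cost are hypothetical and only there for illustration.

```python
# Rough break-even sketch. All figures are assumptions, not quotes.

def break_even_months(server_price_usd, cloud_monthly_usd, server_monthly_usd=0.0):
    """Months after which buying a server is cheaper than renting.

    server_monthly_usd is a hypothetical running cost (hosting, power).
    Returns None if the cloud is never more expensive per month.
    """
    saving_per_month = cloud_monthly_usd - server_monthly_usd
    if saving_per_month <= 0:
        return None
    return server_price_usd / saving_per_month


cloud_monthly = 2000.0            # USD, estimate above, read as monthly
server_price = 2 * cloud_monthly  # "two months running" buys the server

print(break_even_months(server_price, cloud_monthly))  # 2.0
# With an assumed 500 USD/month for hosting the box yourself:
print(break_even_months(server_price, cloud_monthly, server_monthly_usd=500.0))
```

Plug in a real quote and your real IOPS needs before drawing any
conclusion from this.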

Also, if you use cloud infrastructure and need your data on a local
system, keep network latency in mind.
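
A back-of-the-envelope sketch of why latency matters: if an application
issues many small queries one after another, each one pays a full
network round trip. The query count and round-trip times below are
made-up illustration values, not measurements.

```python
# Time spent purely waiting on the network for sequential queries.
# The numbers used below are hypothetical, for illustration only.

def round_trip_overhead_seconds(n_queries, rtt_ms):
    """Total network wait for n_queries issued one after another."""
    return n_queries * rtt_ms / 1000.0


for label, rtt_ms in [("local network", 0.5), ("remote cloud", 50.0)]:
    overhead = round_trip_overhead_seconds(100_000, rtt_ms)
    print(label, overhead, "s")
```

Measure the real round-trip time between your application and the
database host before relying on numbers like these.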

We have several huge PostgreSQL databases running and have used
OpenIndiana with ZFS and SSDs for data storage for quite a while now,
and it works perfectly.

There are some slides from Sun/Oracle about ZFS, ZIL, SSD and PostgreSQL
performance (I can look for them if needed).

> Alternatively I looked at a Dell server with 32 GB of RAM and some
> really good hard drives. But such a box does not come cheap and I don't
> want to keep the pieces if it doesn't cut it

Just a hint: Do not simply look at Dell's list prices - phone them and
get a quote. I was surprised (but do not buy SSDs there).

Think about how your data is structured and how it will be queried once
it has been imported into the database, to see where your bottlenecks
are.

Cheers,
Jan

