Re: large database - Mailing list pgsql-general

From Jeff Janes
Subject Re: large database
Date
Msg-id CAMkU=1xn7wqdbu05m3Y_u6rsoWrXxCzYFYNek5soKAA592fkqg@mail.gmail.com
In response to large database  ("Mihai Popa" <mihai@lattica.com>)
Responses Re: large database
List pgsql-general
On Mon, Dec 10, 2012 at 12:26 PM, Mihai Popa <mihai@lattica.com> wrote:
> Hi,
>
> I've recently inherited a project that involves importing a large set of
> Access mdb files into a Postgres or MySQL database.
> The process is to export the mdb's to comma separated files, then import
> those into the final database.
> We are now at the point where the csv files are all created and amount
> to some 300 GB of data.

Compressed or uncompressed?

> I would like to get some advice on the best deployment option.
>
> First, the project has been started using MySQL. Is it worth switching
> to Postgres and if so, which version should I use?

Why did you originally choose MySQL?  What has changed that causes you
to rethink that decision?  Does your team have experience with MySQL
but not with PostgreSQL?

I like PostgreSQL, of course, but if I already had a functioning app
on MySQL I'd be reluctant to change it.

If I were going to do so, though, I'd use 9.2.  No reason to develop
against something other than the latest stable version.


> Second, where should I deploy it? The cloud or a dedicated box?
>
> Amazon seems like the sensible choice; you can scale it up and down as
> needed and backup is handled automatically.
> I was thinking of an x-large RDS instance with 10000 IOPS and 1 TB of
> storage. Would this do, or will I end up with a larger/more expensive
> instance?

My understanding is that RDS does not support Postgres, so if you go
that route the decision is already made for you.  Or am I wrong here?

1TB of storage sounds desperately small for loading 300GB of csv files.
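
To put a number on that, here is a rough sketch of the arithmetic. It
assumes the loaded data, its indexes, and working space come to some
multiple of the raw csv size; the `expansion_factor` parameter is a
placeholder you would have to measure on a sample load, not a known figure,
and both function names are made up for illustration.

```python
from pathlib import Path

GB = 1000 ** 3  # decimal gigabytes, as storage is usually sold


def total_csv_bytes(directory):
    """Sum the on-disk size of every *.csv file under `directory`."""
    return sum(p.stat().st_size for p in Path(directory).rglob("*.csv"))


def storage_headroom(csv_bytes, storage_bytes, expansion_factor):
    """Fraction of the volume left free after loading.

    `expansion_factor` is an assumed ratio of loaded size (tables,
    indexes, scratch space) to raw csv size. A negative result means
    the data would not fit at all.
    """
    needed = csv_bytes * expansion_factor
    return (storage_bytes - needed) / storage_bytes


if __name__ == "__main__":
    # Try a few hypothetical expansion factors for 300 GB of csv on 1 TB.
    for factor in (1.5, 2.0, 3.0):
        free = storage_headroom(300 * GB, 1000 * GB, factor)
        print(f"factor {factor}: {free:.0%} of the volume left free")
```

Loading a few representative files and comparing the resulting database
size against `total_csv_bytes` for those same files would give a measured
factor to plug in.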

IOPS would mostly depend on how you are using the system, not how large it is.

> Alternatively I looked at a Dell server with 32 GB of RAM and some
> really good hard drives. But such a box does not come cheap and I don't
> want to keep the pieces if it doesn't cut it.

An xlarge RDS instance with 1TB of storage and 10000 IOPS doesn't come cheap, either.

Cheers,

Jeff

