Re: large database - Mailing list pgsql-general

From Mihai Popa
Subject Re: large database
Msg-id 50C7510C.3030605@lattica.com
In response to Re: large database  (Bill Moran <wmoran@potentialtech.com>)
Responses Re: large database  (David Boreham <david_list@boreham.org>)
List pgsql-general
On 12/11/2012 07:27 AM, Bill Moran wrote:
> On Mon, 10 Dec 2012 15:26:02 -0500 (EST) "Mihai Popa" <mihai@lattica.com> wrote:
>
>> Hi,
>>
>> I've recently inherited a project that involves importing a large set of
>> Access mdb files into a Postgres or MySQL database.
>> The process is to export the mdb's to comma-separated files, then import
>> those into the final database.
>> We are now at the point where the csv files are all created and amount
>> to some 300 GB of data.
>>
>> I would like to get some advice on the best deployment option.
>>
>> First, the project has been started using MySQL. Is it worth switching
>> to Postgres and if so, which version should I use?
> I've been managing a few large databases this year, on both PostgreSQL and
> MySQL.
>
> Don't put your data in MySQL.  Ever.  If you feel like you need to use
> something like MySQL, just go straight to a system that was designed with
> no constraints right off the bat, like Mongo or something.

I've never worked with MySQL before. I have worked with Postgres a lot over
the last few years, but never with such large databases, so I cannot really
choose one over the other; hence my posting :)
> and the fact that if you use anything other than INT AUTO_INCREMENT for
> your primary key you're liable to hit on awful inefficiencies.

Unfortunately, I don't know much yet about the usage pattern. All I know is
that the data is mostly read-only: there will be a few updates every year,
but they will probably run as overnight batch jobs.
Meanwhile, it appears there is a lot more data than initially thought: 800 GB
rather than 300 GB.
There aren't many tables, so each will have a large number of rows.

I guess Chris was right: I have to understand the usage pattern better
and do some testing of my own.
I was just hoping my hunch about Amazon being the better alternative
would be confirmed, but this does not seem to be the case; most of you
recommend purchasing a box.

I want to thank everyone for the input; I really appreciate it!

regards,
mihai

