On 03/20/2012 06:50 PM, Jim Green wrote:
> On 20 March 2012 21:40, David Kerr <dmk@mr-paradox.net> wrote:
>> On 03/20/2012 04:27 PM, Jim Green wrote:
>>
>> Greetings list!
>> I am pretty new to PostgreSQL, coming from MySQL. I did a fairly extensive
>> search of the list and found a few good threads, but none that matched
>> the exact situation I have now, so I am venturing to ask here.
>>
>> I have daily minute-level stock price data from 2005 on, with columns
>> timestamp, open, high, low, close, volume and a few more. Each day's
>> data is about 1.2 million rows. I want to import all of it into
>> PostgreSQL and analyze it with R, with the help of RPostgreSQL.
>>
>> Right now I have about 7,000 tables, one per stock, and I use perl to do
>> the inserts; it's very slow. I would like to use COPY or another bulk
>> loading tool to load the daily raw gz data, but I would first need to
>> split the file into per-stock files before bulk loading, which I
>> consider a bit messy.
>>
>> I would like advice on the following idea: store everything in one big
>> table, partitioned by month (this gives a reasonable number of
>> partitions), and bulk load the daily file into it. My queries would
>> mostly be selects for a particular symbol on a particular day.
>>
>> Also, going forward I will be importing the new data into the db every day.
>>
>> My hardware is 16 GB RAM and 4x 5400 rpm disks in RAID 10, with enough space.
>>
>> Thanks!
>>
>> Jim.
>>
>>
>> Seems like you'd want to do this?
>> http://search.cpan.org/~turnstep/DBD-Pg-2.19.2/Pg.pm#pg_putcopydata
>> COPY support
>>
>> DBD::Pg allows for quick (bulk) reading and storing of data by using the
>> COPY command. The basic process is to use $dbh->do to issue a COPY command,
>> and then to either add rows using "pg_putcopydata", or to read them by using
>> "pg_getcopydata".
> Thanks! Would you comment on the table setup as well?
>
> Jim.
>
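On the table setup: one big table partitioned by month sounds reasonable.
Here's a rough sketch of what that could look like with inheritance-based
partitioning; the table and column names are just guesses from your
description, so adjust to taste:

#!/usr/bin/perl
# Rough sketch only -- table and column names are guesses from the
# description above (minute bars, one big table partitioned by month).
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:Pg:dbname=ticks', 'jim', '', { RaiseError => 1 });

# Empty parent table; all data lives in the monthly children.
$dbh->do(q{
    CREATE TABLE minute_bars (
        symbol  text        NOT NULL,
        ts      timestamptz NOT NULL,
        open    numeric,
        high    numeric,
        low     numeric,
        close   numeric,
        volume  bigint
    )
});

# One child per month; the CHECK constraint is what lets the planner
# skip the other months when a query names a particular day.
$dbh->do(q{
    CREATE TABLE minute_bars_2012_03 (
        CHECK (ts >= '2012-03-01' AND ts < '2012-04-01')
    ) INHERITS (minute_bars)
});

$dbh->disconnect;

For your "one symbol on one day" queries you'd then want an index on
(symbol, ts) on each child, built after the bulk load rather than before
(more on that below). With constraint_exclusion at its default of
'partition', the planner should only visit the child whose CHECK
constraint matches the date in the query.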
\copy on 1.2 million rows should only take a minute or two. You could also
make that table UNLOGGED to speed it up further. If you can truncate (or
drop and recreate) the table, load the data, and then build the indexes
each time, you'll get the best throughput.
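For the load itself, it could look roughly like this with DBD::Pg's COPY
support. Again just a sketch: the file name and column order are made up,
and the daily gz file may need massaging into something COPY accepts.

#!/usr/bin/perl
# Rough sketch only -- file name and column order are assumptions.
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:Pg:dbname=ticks', 'jim', '', { RaiseError => 1 });

# Start a COPY into the current month's child, then stream the gunzipped
# daily file into it line by line with pg_putcopydata.
$dbh->do(q{
    COPY minute_bars_2012_03 (symbol, ts, open, high, low, close, volume)
    FROM STDIN WITH CSV
});

open my $fh, '-|', 'zcat', 'minute_2012-03-20.csv.gz' or die "zcat: $!";
while (my $line = <$fh>) {
    $dbh->pg_putcopydata($line);
}
close $fh;
$dbh->pg_putcopyend;

# Build the (symbol, ts) index after the data is in, not before.
$dbh->do(q{
    CREATE INDEX minute_bars_2012_03_symbol_ts_idx
        ON minute_bars_2012_03 (symbol, ts)
});

$dbh->disconnect;

If you create the children UNLOGGED (CREATE UNLOGGED TABLE ... INHERITS ...)
the load skips WAL and is faster still, but an unlogged table is emptied
after a crash, so only do that if you can always reload from the raw gz
files. For the ongoing daily appends into a month that's already indexed,
you'd just do the COPY part and skip the index step.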
Dave