Consolidating my responses in one email.
1. The total data expected is roughly 1-1.5 TB a day. 75% of the
data arrives in a 10-hour window; the remaining 25% arrives over the
other 14 hours. There are of course ways to smooth the load pattern,
but the current scenario is as described.
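To put that load pattern in concrete terms, here is a back-of-the-envelope calculation of the average ingest rates (using the 1.5 TB upper bound and decimal units; the split between windows is as stated above):

```python
# Rough average ingest rates for 1.5 TB/day, 75% in 10 hours,
# 25% in the remaining 14 hours.

TB = 10**12  # decimal terabyte, in bytes

def rate_mb_per_s(total_bytes: float, hours: float) -> float:
    """Average rate in MB/s for a given volume spread over a window."""
    return total_bytes / (hours * 3600) / 10**6

daily = 1.5 * TB
peak = rate_mb_per_s(0.75 * daily, 10)   # busy 10-hour window
off = rate_mb_per_s(0.25 * daily, 14)    # quiet 14-hour window

print(f"peak:     {peak:.2f} MB/s")   # about 31 MB/s sustained
print(f"off-peak: {off:.2f} MB/s")    # about 7 MB/s sustained
```

So even at the peak the sustained write rate is on the order of a few tens of MB/s, well within what a decent disk array can absorb.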
2. I do expect the customer to roll in something like a NAS/SAN with
terabytes of disk space. The idea is to retain the data for a period
and then offload it to tape.
That leads to the question: can the data be compressed? Since the data
is very similar, compression should yield something like 6x-10x. Is
there a way to identify which partitions live in which data files and
keep them compressed until they are actually read?
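For what it's worth, here is a rough sketch of how a partition's heap file could be located on disk, assuming PostgreSQL's standard on-disk layout ($PGDATA/base/&lt;database oid&gt;/&lt;relfilenode&gt;). The oids would come from pg_database.oid and pg_class.relfilenode; the helper name and the example values are mine, for illustration only. Note that the server expects these files to be readable, so compressing them in place under a live server would not work without filesystem-level support:

```python
import os

def partition_data_file(pgdata: str, db_oid: int, relfilenode: int) -> str:
    """Build the path of a relation's main heap file from the database oid
    (pg_database.oid) and the relation's relfilenode (pg_class.relfilenode).
    Relations over 1 GB continue in segment files named <relfilenode>.1, .2, ...
    """
    return os.path.join(pgdata, "base", str(db_oid), str(relfilenode))

# Hypothetical oids, for illustration:
print(partition_data_file("/var/lib/pgsql/data", 16384, 24576))
# -> /var/lib/pgsql/data/base/16384/24576
```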
Regards
Dhaval
On 5/12/07, Lincoln Yeoh <lyeoh@pop.jaring.my> wrote:
> At 04:43 AM 5/12/2007, Dhaval Shah wrote:
>
> >1. Large amount of streamed rows. In the order of @50-100k rows per
> >second. I was thinking that the rows can be stored into a file and the
> >file then copied into a temp table using copy and then appending those
> >rows to the master table. And then dropping and recreating the index
> >very lazily [during the first query hit or something like that]
>
> Is it one process inserting or can it be many processes?
>
> Is it just a short (relatively) high burst or is that rate sustained
> for a long time? If it's sustained I don't see the point of doing so
> many copies.
>
> How many bytes per row? If the rate is sustained and the rows are big
> then you are going to need LOTs of disks (e.g. a large RAID10).
>
> When do you need to do the reads, and how up to date do they need to be?
>
> Regards,
> Link.
>
--
Dhaval Shah