Re: break table into portions for writing to separate files - Mailing list pgsql-general

From Seb
Subject Re: break table into portions for writing to separate files
Date
Msg-id 87d2fxpc41.fsf@net82.ceos.umanitoba.ca
Whole thread Raw
In response to break table into portions for writing to separate files  (Seb <spluque@gmail.com>)
Responses Re: break table into portions for writing to separate files  (Francisco Olarte <folarte@peoplecall.com>)
List pgsql-general
On Thu, 1 May 2014 20:20:23 +0200,
Francisco Olarte <folarte@peoplecall.com> wrote:

[...]

> As you mention looping and a shell, I suppose you are in something
> unix like, with pipes et al. You can pipe COPY ( either with the pipe
> options for copy, or piping a psql command, or whichever thing you
> like ) through a script which spits ecah data chunk into its
> corresponding file. If your data is somehow clustered into the table (
> by chance or by design ) you don't even need a to sort the data, just
> use an open file pool, I did that once with call files, chunked them
> into day sized files and it worked like a charm ( and if you need the
> files sorted, you can then use sort on each of them, which normally is
> quite fast ).

> For your description of data, with a pipe, you could read a line,
> extract a key for the record ( the timestamp rounded down to 20
> minutes would be a good one ), get and open output file for append (
> using a small caching layer ) , write it.

> Depending on how many files you expect, how many RAM you have and how
> many files your OS allows you to open, other solutions exists. And if
> you do not have enough ram / openfiles / clustering for any of them
> there are multitude of tricks ( if, say, you have 3 years worth, no
> correlation, and can only open/buffer about 1000 files you could split
> from the db into day sized chunks and then split each of them into 20
> minutes ones.

Thanks, I'm glad to hear you've used this approach successfully.  It
seems as though the best solution is to do a single SELECT to get the
data out of the server (it is a view with a very complex query plan
joining several other similar views), and then pipe the output through
say awk to break down into chunks for writing the files, as you
describe.

Cheers,

--
Seb

pgsql-general by date:

Previous
From: Stephan Fabel
Date:
Subject: Ubuntu Packages / Config Files
Next
From: Seb
Date:
Subject: Re: break table into portions for writing to separate files