On Tue, 26 Feb 2008, Simon Riggs wrote:
> Splitting up the dump is the enabler for splitting up the load.
While the pg_dump split train seems to be leaving the station, I feel
compelled to point out that focus does nothing to help people who are
bulk-loading data that came from somewhere else. If my data is already in
PostgreSQL, and I'm doing a dump/load, I can usually split the data easily
enough with existing tools to handle that right now via COPY (SELECT...)
TO. Some tools within pg_dump would be nice, but I don't need them that
much. It's gigantic files that came from some other DB I don't even have
access to that I struggle with loading efficiently.
The work Dimitri is doing is wandering in that direction and that may be
enough. I note that something that addresses loading big files regardless
of source could also work on PostgreSQL dumps, while a pg_dump focused
effort helps nothing but that specific workflow. I wonder if doing too
much work on the pg_dump path is the best use of someone's time when the
more general case will need to be addressed one day anyway.
--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD