Re: pg_dump additional options for performance - Mailing list pgsql-hackers

From Greg Smith
Subject Re: pg_dump additional options for performance
Msg-id Pine.GSO.4.64.0802261209430.204@westnet.com
In response to Re: pg_dump additional options for performance  (Simon Riggs <simon@2ndquadrant.com>)
Responses Re: pg_dump additional options for performance  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: pg_dump additional options for performance  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-hackers
On Tue, 26 Feb 2008, Simon Riggs wrote:

> Splitting up the dump is the enabler for splitting up the load.

While the pg_dump split train seems to be leaving the station, I feel 
compelled to point out that this focus does nothing to help people who are 
bulk-loading data that came from somewhere else.  If my data is already in 
PostgreSQL, and I'm doing a dump/load, I can usually split the data easily 
enough with existing tools to handle that right now via COPY (SELECT...) 
TO.  Some tools within pg_dump would be nice, but I don't need them that 
much.  It's gigantic files that came from some other DB I don't even have 
access to that I struggle with loading efficiently.
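For what it's worth, the sort of manual split I'm describing looks roughly 
like this (a sketch only; the table name, key column, and range boundaries 
are made-up examples, and server-side COPY TO requires superuser and a 
path writable by the server):

```sql
-- Split a large table's dump into key ranges with COPY (SELECT ...) TO,
-- so each chunk can be restored by a separate session.
COPY (SELECT * FROM orders WHERE id < 1000000)
    TO '/tmp/orders_part1.copy';
COPY (SELECT * FROM orders WHERE id >= 1000000 AND id < 2000000)
    TO '/tmp/orders_part2.copy';

-- Then, in parallel sessions on the target:
-- COPY orders FROM '/tmp/orders_part1.copy';
-- COPY orders FROM '/tmp/orders_part2.copy';
```

The point being that this already works today for data living in 
PostgreSQL; it's the foreign-sourced gigantic flat file that has no such 
escape hatch.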

The work Dimitri is doing is wandering in that direction and that may be 
enough.  I note that something that addresses loading big files regardless 
of source could also work on PostgreSQL dumps, while a pg_dump focused 
effort helps nothing but that specific workflow.  I wonder if doing too 
much work on the pg_dump path is the best use of someone's time when the 
more general case will need to be addressed one day anyway.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

