Re: customizing pg_dump together with copy.c's DoCopy function - Mailing list pgsql-general

From lynnsettle@yahoo.com
Subject Re: customizing pg_dump together with copy.c's DoCopy function
Date
Msg-id 1153358235.999737.107920@75g2000cwc.googlegroups.com
In response to Re: customizing pg_dump together with copy.c's DoCopy function  ("Brian Mathis" <brian.mathis@gmail.com>)
List pgsql-general
Brian,

Those are very interesting ideas. Thanks. I've been playing around with
pg_dump. Modifying it to selectively dump/restore tables and columns is
pretty easy. But as you say, rewriting the data itself to change column
values, convert column types, and add new columns looks tricky. I wonder
if anyone knows of example code somewhere...
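
In the meantime, here is a rough, untested sketch of the kind of filter I
have in mind for the "myfilter" step in your pipeline. Everything specific
in it (the table name "old_table", the column position, the "new_col"
addition, and the replacement values) is just a placeholder for
illustration, not our real schema:

#!/usr/bin/env python
# myfilter: rewrite the COPY data for one table in a plain-text pg_dump
# stream read from stdin, passing everything else through untouched.
import sys

TARGET_COPY_PREFIX = "COPY old_table "   # placeholder table name
STATUS_COLUMN = 2                        # 0-based position of a column to remap
NEW_COLUMN_DEFAULT = "0"                 # default value for a new trailing column

in_target_copy = False

for line in sys.stdin:
    row = line.rstrip("\n")

    if not in_target_copy:
        # Start of the COPY block for the table we care about.
        if row.startswith(TARGET_COPY_PREFIX):
            in_target_copy = True
            # Naively add the new column to the COPY column list as well.
            row = row.replace(") FROM stdin;", ", new_col) FROM stdin;")
        sys.stdout.write(row + "\n")
        continue

    # A line containing only "\." ends the COPY data section.
    if row == "\\.":
        in_target_copy = False
        sys.stdout.write(row + "\n")
        continue

    # COPY text format: tab-separated columns, \N for NULL, so a plain
    # split on tabs is safe (tabs inside values are escaped in the data).
    fields = row.split("\t")
    if fields[STATUS_COLUMN] == "old_value":
        fields[STATUS_COLUMN] = "new_value"
    fields.append(NEW_COLUMN_DEFAULT)
    sys.stdout.write("\t".join(fields) + "\n")

It would slot into your pipeline in place of "myfilter" (run as
"python myfilter.py"), so the dump never has to land on disk uncompressed.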
-Lynn

"Brian Mathis" wrote:
> pg_dump by default dumps to STDOUT, which you should use in a pipeline to
> perform any modifications.  To me this seems pretty tricky, but should be
> doable.  Modifying pg_dump really strikes me as the wrong way to go about
> it.  Pipelines operate in memory, and should be very fast, depending on how
> you write the filtering program.  You would need to dump the data without
> compression, then compress it coming out the other end (maybe split it up
> too).  Something like this:
>     pg_dump | myfilter | gzip | split --bytes=2000M - mydump.
>
> Also, you can't expect to have speed if you have no disk space.
> Reading/writing to the same disk will kill you.  If you could set up some
> temp space over NFS on the local network, that should gain you some speed.
>
>

