Brian,
Those are very interesting ideas. Thanks. I've been playing around with
pg_dump. Modifying it to selectively dump/restore tables and columns is
pretty easy. But, as you say, rewriting the data content itself (changing
column values, converting column types, and adding new columns) seems
tricky. I wonder if anyone knows of any example code for that kind of
filtering...
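
Along the lines of your pipeline idea, here's a rough sketch of the kind of
filter I'm picturing (only a sketch, assuming a plain-format dump where the
table data comes through as tab-separated COPY blocks rather than --inserts;
the table name, column name, and the transform itself are just placeholders):

#!/usr/bin/env python3
# Sketch of a pg_dump filter: read a plain-format dump on stdin, rewrite
# one column of one table inside its COPY block, and pass everything else
# through untouched.  Table/column names below are placeholders.
import re
import sys

TABLE = "public.mytable"   # table whose data we want to rewrite (placeholder)
COLUMN = "status"          # column to transform (placeholder)

copy_re = re.compile(r"^COPY (\S+) \(([^)]*)\) FROM stdin;")

def transform(value):
    # Illustrative transformation only; \N is COPY's NULL marker.
    if value == "\\N":
        return value
    return value.upper()

in_copy = False
col_idx = None

for line in sys.stdin:
    if not in_copy:
        m = copy_re.match(line)
        if m and m.group(1) == TABLE:
            cols = [c.strip().strip('"') for c in m.group(2).split(",")]
            if COLUMN in cols:
                col_idx = cols.index(COLUMN)
                in_copy = True
        sys.stdout.write(line)
        continue

    # A line containing only \. terminates the COPY data block.
    if line.rstrip("\n") == "\\.":
        in_copy = False
        sys.stdout.write(line)
        continue

    # Data rows are tab-separated columns in COPY text format.
    fields = line.rstrip("\n").split("\t")
    fields[col_idx] = transform(fields[col_idx])
    sys.stdout.write("\t".join(fields) + "\n")

Dropped into your pipeline, it would just be the myfilter stage, e.g.
pg_dump | ./myfilter.py | gzip | split --bytes=2000M - mydump.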
-Lynn
"Brian Mathis" wrote:
> pg_dump by default dumps to STDOUT, which you should use in a pipeline to
> perform any modifications. To me this seems pretty tricky, but should be
> doable. Modifying pg_dump really strikes me as the wrong way to go about
> it. Pipelines operate in memory, and should be very fast, depending on how
> you write the filtering program. You would need to dump the data without
> compression, then compress it coming out the other end (maybe split it up
> too). Something like this:
> pg_dump | myfilter | gzip | split --bytes=2000M - mydump.
>
> Also, you can't expect to have speed if you have no disk space.
> Reading/writing to the same disk will kill you. If you could set up some
> temp space over NFS on the local network, that should gain you some speed.