Re: pg_dump --split patch - Mailing list pgsql-hackers

From Joel Jacobson
Subject Re: pg_dump --split patch
Date
Msg-id AANLkTikRn63rn2LiVJGeFXKZ9myWMiPJj5jjZP7A2=Lg@mail.gmail.com
Whole thread Raw
In response to Re: pg_dump --split patch  (Andrew Dunstan <andrew@dunslane.net>)
Responses Re: pg_dump --split patch  (Aidan Van Dyk <aidan@highrise.ca>)
List pgsql-hackers
2010/12/29 Andrew Dunstan <andrew@dunslane.net>
try:

 diff -F '^CREATE' ...

cheers

andrew

Embarrasing, I'm sure I've done `man diff` before, must have missed that one, wish I'd known about that feature before, would have saved me many hours! :-) Thanks for the tip!

There are some other real-life use-cases where I think splitting would be nice and save a lot of time:

a) if you don't have a perfect 1:1 relationship between all the SPs in your database and your source code repository (handled by your favorite version control system), i.e. if you suspect some SPs in the database might differ compared to the source code files in your repo. In this scenario, it might be simpler to "start over" and continue developing on a repo built from a pg_dump --split export. You would lose all history, but it might still be worth it if the "compare everything in database against source code files in repo"-project would take a lot of man hours.

b) quick branching - perhaps you are a consultant at a company where they don't even have the SPs stored in separate files, they might have been magically installed by some consultant before you without any trace. :-) To get up to speed solving the problem you've been assigned, which in this example involves a lot of SP coding and modifications of existing functions, it would save a lot of time if you had all functions in separate files before you started coding, then you would use git or any other nice version control system to track your changes and figure out what you've done once you get everything to work.

c) automatically saving daily snapshots of your production database schema to your version control system. While the best version control system (git) does not track individual files, many of the ancient ones still very popular ones like svn do so. If every function in the production database schema would be saved automatically to the VCS, you would be guaranteed to have a tack of all deployed changes affecting each function, which is probably a lot fewer changes compared to the entire history for each function, assuming developers commit things while developing and not only when deploying.

d) while pg_dump offers some options to limit the output content, such as -s for "schema only" and -t/-T to limit which tables to dump, it lacks options to export "functions only" or "these functions only". It would require quite a lot of such options to provide the same flexibility as a split dump, highly reducing the need for such options as you could then compose your own restore script based on the dump.

Of course, not all of these scenarios are relevant for everybody.

--
Best regards,

Joel Jacobson
Glue Finance

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: and it's not a bunny rabbit, either
Next
From: Mark Kirkwood
Date:
Subject: Re: "writable CTEs"