Thread: Dump a database excluding one table DATA?
Hello.
Is there any way (or hack) to dump the whole database, but to exclude the DATA from a table within this dump? (DDL of the table should not be excluded: after restoring the data the excluded table should look "empty".)
I see -T switch of pg_dump, but seems -T excludes the data AND the DDL of a table (and possibly all objects which depend on this table?), so after restoration the database structure becomes broken sometimes.
On Thursday, August 18, 2011 12:49:45 pm Dmitry Koterov wrote: > Hello. > > Is there any way (or hack) to dump the whole database, but to exclude the > DATA from a table within this dump? (DDL of the table should not be > excluded: after restoring the data the excluded table should look "empty".) > > I see -T switch of pg_dump, but seems -T excludes the data AND the DDL of a > table (and possibly all objects which depend on this table?), so after > restoration the database structure becomes broken sometimes. One way I know you can do it, is exclude the data from restoring. This requires you use the pg_dump custom format. For full details see here: http://www.postgresql.org/docs/9.0/interactive/app-pgrestore.html Short version use pg_restore -l to generate a listing from the dump file. Comment out the line that copys the data into that table. Use pg_restore ... -L to have pg_restore those items not commented out. Another way is do it using the -T switch for the 'complete' db dump. Then do a separate dump using -s (schema only) and -t some_table and then restore it on its own. -- Adrian Klaver adrian.klaver@gmail.com
1. I need to shorten pg_dump results (for backup purposes), so pg_restore is too late for that...
2. If I use "pg_dump -s" separately, the data may not load (or load to slow) after that, because all indices/foreign keys are already there. Is there a way to split "pg_dump -s" into 2 parts: the first part dumps everything excluding indices, checks and foreign keys, and the second part - only them? Not sure it is possible at all, because I think pg_dump may dump data not between these two blocks of DDLs...
On Fri, Aug 19, 2011 at 12:04 AM, Adrian Klaver <adrian.klaver@gmail.com> wrote:
One way I know you can do it, is exclude the data from restoring. This requiresOn Thursday, August 18, 2011 12:49:45 pm Dmitry Koterov wrote:
> Hello.
>
> Is there any way (or hack) to dump the whole database, but to exclude the
> DATA from a table within this dump? (DDL of the table should not be
> excluded: after restoring the data the excluded table should look "empty".)
>
> I see -T switch of pg_dump, but seems -T excludes the data AND the DDL of a
> table (and possibly all objects which depend on this table?), so after
> restoration the database structure becomes broken sometimes.
you use the pg_dump custom format. For full details see here:
http://www.postgresql.org/docs/9.0/interactive/app-pgrestore.html
Short version use pg_restore -l to generate a listing from the dump file.
Comment out the line that copys the data into that table.
Use pg_restore ... -L to have pg_restore those items not commented out.
Another way is do it using the -T switch for the 'complete' db dump. Then do a
separate dump using -s (schema only) and -t some_table and then restore it on
its own.
--
Adrian Klaver
adrian.klaver@gmail.com
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Thursday, August 18, 2011 1:23:25 pm Dmitry Koterov wrote: > 1. I need to shorten pg_dump results (for backup purposes), so pg_restore > is too late for that.. > > 2. If I use "pg_dump -s" separately, the data may not load (or load to > slow) after that, because all indices/foreign keys are already there. Is > there a way to split "pg_dump -s" into 2 parts: the first part dumps > everything excluding indices, checks and foreign keys, and the second part > - only them? Not sure it is possible at all, because I think pg_dump may > dump data not between these two blocks of DDLs... > I am not sure I follow. Are you saying you eventually restore the data for that table as a separate step? If so, from the previous link, this might help: " --disable-triggers This option is only relevant when performing a data-only restore. It instructs pg_restore to execute commands to temporarily disable triggers on the target tables while the data is reloaded. Use this if you have referential integrity checks or other triggers on the tables that you do not want to invoke during data reload. Presently, the commands emitted for --disable-triggers must be done as superuser. So, you should also specify a superuser name with -S, or preferably run pg_restore as a PostgreSQL superuser. " -- Adrian Klaver adrian.klaver@gmail.com
Thanks, "pg_dump --data-only --disable-triggers" is the king.
(Unfortunately it is not supported by pg_dumpall, but it is entirely another story. :-)
On Fri, Aug 19, 2011 at 12:36 AM, Adrian Klaver <adrian.klaver@gmail.com> wrote:
On Thursday, August 18, 2011 1:23:25 pm Dmitry Koterov wrote:I am not sure I follow. Are you saying you eventually restore the data for that
> 1. I need to shorten pg_dump results (for backup purposes), so pg_restore
> is too late for that..
>
> 2. If I use "pg_dump -s" separately, the data may not load (or load to
> slow) after that, because all indices/foreign keys are already there. Is
> there a way to split "pg_dump -s" into 2 parts: the first part dumps
> everything excluding indices, checks and foreign keys, and the second part
> - only them? Not sure it is possible at all, because I think pg_dump may
> dump data not between these two blocks of DDLs...
>
table as a separate step? If so, from the previous link, this might help:
"
--disable-triggers
This option is only relevant when performing a data-only restore. It
instructs pg_restore to execute commands to temporarily disable triggers on the
target tables while the data is reloaded. Use this if you have referential
integrity checks or other triggers on the tables that you do not want to invoke
during data reload.
Presently, the commands emitted for --disable-triggers must be done as
superuser. So, you should also specify a superuser name with -S, or preferably
run pg_restore as a PostgreSQL superuser.
"
--Adrian Klaver
adrian.klaver@gmail.com
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Mmm, --disable-triggers is not surely enough - we also have RULEs and (much worse) INDEXes.
If we create all indices and then restore all data, it is MUCH SLOWER than restore the data first and then - create all indices.
So I think that there is no work-around really...
I propose to include an option to pg_dump to skip several tables data restoration. :-)
On Fri, Aug 19, 2011 at 12:44 AM, Dmitry Koterov <dmitry@koterov.ru> wrote:
Thanks, "pg_dump --data-only --disable-triggers" is the king.(Unfortunately it is not supported by pg_dumpall, but it is entirely another story. :-)On Fri, Aug 19, 2011 at 12:36 AM, Adrian Klaver <adrian.klaver@gmail.com> wrote:On Thursday, August 18, 2011 1:23:25 pm Dmitry Koterov wrote:I am not sure I follow. Are you saying you eventually restore the data for that
> 1. I need to shorten pg_dump results (for backup purposes), so pg_restore
> is too late for that..
>
> 2. If I use "pg_dump -s" separately, the data may not load (or load to
> slow) after that, because all indices/foreign keys are already there. Is
> there a way to split "pg_dump -s" into 2 parts: the first part dumps
> everything excluding indices, checks and foreign keys, and the second part
> - only them? Not sure it is possible at all, because I think pg_dump may
> dump data not between these two blocks of DDLs...
>
table as a separate step? If so, from the previous link, this might help:
"
--disable-triggers
This option is only relevant when performing a data-only restore. It
instructs pg_restore to execute commands to temporarily disable triggers on the
target tables while the data is reloaded. Use this if you have referential
integrity checks or other triggers on the tables that you do not want to invoke
during data reload.
Presently, the commands emitted for --disable-triggers must be done as
superuser. So, you should also specify a superuser name with -S, or preferably
run pg_restore as a PostgreSQL superuser.
"
--Adrian Klaver
adrian.klaver@gmail.com
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Thursday, August 18, 2011 3:25:59 pm Dmitry Koterov wrote: > Mmm, --disable-triggers is not surely enough - we also have RULEs and (much > worse) INDEXes. > > If we create all indices and then restore all data, it is MUCH SLOWER than > restore the data first and then - create all indices. > So I think that there is no work-around really... Yes there is. Do the schema only dump of the table in question, comment out the Indexes and rules you don't want. Load the data. Go back to the schema dump comment out the table and uncomment the Indexes/rules,etc . Or Write a script that does the same thing automatically. > > I propose to include an option to pg_dump to skip several tables data > restoration. :-) > I thought you where looking to exclude only one table. As you found out the dependency issues for one table can be a problem. Doing more than one makes the process even more brittle. -- Adrian Klaver adrian.klaver@gmail.com
Dmitry Koterov <dmitry@koterov.ru> writes: > Is there any way (or hack) to dump the whole database, but to exclude the > DATA from a table within this dump? (DDL of the table should not be > excluded: after restoring the data the excluded table should look "empty".) The pg_staging tool allows you to do that quite easily, once you've done the initial setup (involved, but nothing fancy). https://github.com/dimitri/pg_staging http://tapoueh.org/pgsql/pgstaging.html If you want to exclude the table from the dumps though (data loss issues are looking at you), what I would advice is create a dedicated schema and skipping the whole schema at dump time, using the following option: --exclude-schema=schema Do not dump any schemas matching the schema pattern. The pattern is interpreted according to the same rules as for -n. -N can be given more than once to exclude schemas matching any of several patterns. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support