Thread: Dump a database excluding one table DATA?

Dump a database excluding one table DATA?

From
Dmitry Koterov
Date:
Hello.

Is there any way (or hack) to dump the whole database, but to exclude the DATA from a table within this dump? (DDL of the table should not be excluded: after restoring the data the excluded table should look "empty".)

I see -T switch of pg_dump, but seems -T excludes the data AND the DDL of a table (and possibly all objects which depend on this table?), so after restoration the database structure becomes broken sometimes.

Re: Dump a database excluding one table DATA?

From
Adrian Klaver
Date:
On Thursday, August 18, 2011 12:49:45 pm Dmitry Koterov wrote:
> Hello.
>
> Is there any way (or hack) to dump the whole database, but to exclude the
> DATA from a table within this dump? (DDL of the table should not be
> excluded: after restoring the data the excluded table should look "empty".)
>
> I see -T switch of pg_dump, but seems -T excludes the data AND the DDL of a
> table (and possibly all objects which depend on this table?), so after
> restoration the database structure becomes broken sometimes.

One way I know you can do it, is exclude the data from restoring. This requires
you use the pg_dump custom format. For full details see here:

http://www.postgresql.org/docs/9.0/interactive/app-pgrestore.html

Short version use pg_restore -l to generate a listing from the dump file.
Comment out the line that copys the data into that table.
Use pg_restore ... -L to have pg_restore those items not commented out.

Another way is do it using the -T switch for the 'complete' db dump. Then do a
separate dump using -s (schema only) and -t some_table and then restore it on
its own.

--
Adrian Klaver
adrian.klaver@gmail.com

Re: Dump a database excluding one table DATA?

From
Dmitry Koterov
Date:
1. I need to shorten pg_dump results (for backup purposes), so pg_restore is too late for that...

2. If I use "pg_dump -s" separately, the data may not load (or load to slow) after that, because all indices/foreign keys are already there. Is there a way to split "pg_dump -s" into 2 parts: the first part dumps everything excluding indices, checks and foreign keys, and the second part - only them? Not sure it is possible at all, because I think pg_dump may dump data not between these two blocks of DDLs...



On Fri, Aug 19, 2011 at 12:04 AM, Adrian Klaver <adrian.klaver@gmail.com> wrote:
On Thursday, August 18, 2011 12:49:45 pm Dmitry Koterov wrote:
> Hello.
>
> Is there any way (or hack) to dump the whole database, but to exclude the
> DATA from a table within this dump? (DDL of the table should not be
> excluded: after restoring the data the excluded table should look "empty".)
>
> I see -T switch of pg_dump, but seems -T excludes the data AND the DDL of a
> table (and possibly all objects which depend on this table?), so after
> restoration the database structure becomes broken sometimes.

One way I know you can do it, is exclude the data from restoring. This requires
you use the pg_dump custom format. For full details see here:

http://www.postgresql.org/docs/9.0/interactive/app-pgrestore.html

Short version use pg_restore -l to generate a listing from the dump file.
Comment out the line that copys the data into that table.
Use pg_restore ... -L to have pg_restore those items not commented out.

Another way is do it using the -T switch for the 'complete' db dump. Then do a
separate dump using -s (schema only) and -t some_table and then restore it on
its own.

--
Adrian Klaver
adrian.klaver@gmail.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: Dump a database excluding one table DATA?

From
Adrian Klaver
Date:
On Thursday, August 18, 2011 1:23:25 pm Dmitry Koterov wrote:
> 1. I need to shorten pg_dump results (for backup purposes), so pg_restore
> is too late for that..

>
> 2. If I use "pg_dump -s" separately, the data may not load (or load to
> slow) after that, because all indices/foreign keys are already there. Is
> there a way to split "pg_dump -s" into 2 parts: the first part dumps
> everything excluding indices, checks and foreign keys, and the second part
> - only them? Not sure it is possible at all, because I think pg_dump may
> dump data not between these two blocks of DDLs...
>

I am not sure I follow. Are you saying you eventually restore the data for that
table as a separate step? If so, from the previous link, this might help:

"
--disable-triggers

    This option is only relevant when performing a data-only restore. It
instructs pg_restore to execute commands to temporarily disable triggers on the
target tables while the data is reloaded. Use this if you have referential
integrity checks or other triggers on the tables that you do not want to invoke
during data reload.

    Presently, the commands emitted for --disable-triggers must be done as
superuser. So, you should also specify a superuser name with -S, or preferably
run pg_restore as a PostgreSQL superuser.
"



--
Adrian Klaver
adrian.klaver@gmail.com

Re: Dump a database excluding one table DATA?

From
Dmitry Koterov
Date:
Thanks, "pg_dump --data-only --disable-triggers" is the king.

(Unfortunately it is not supported by pg_dumpall, but it is entirely another story. :-)


On Fri, Aug 19, 2011 at 12:36 AM, Adrian Klaver <adrian.klaver@gmail.com> wrote:
On Thursday, August 18, 2011 1:23:25 pm Dmitry Koterov wrote:
> 1. I need to shorten pg_dump results (for backup purposes), so pg_restore
> is too late for that..

>
> 2. If I use "pg_dump -s" separately, the data may not load (or load to
> slow) after that, because all indices/foreign keys are already there. Is
> there a way to split "pg_dump -s" into 2 parts: the first part dumps
> everything excluding indices, checks and foreign keys, and the second part
> - only them? Not sure it is possible at all, because I think pg_dump may
> dump data not between these two blocks of DDLs...
>

I am not sure I follow. Are you saying you eventually restore the data for that
table as a separate step? If so, from the previous link, this might help:

"
--disable-triggers

   This option is only relevant when performing a data-only restore. It
instructs pg_restore to execute commands to temporarily disable triggers on the
target tables while the data is reloaded. Use this if you have referential
integrity checks or other triggers on the tables that you do not want to invoke
during data reload.

   Presently, the commands emitted for --disable-triggers must be done as
superuser. So, you should also specify a superuser name with -S, or preferably
run pg_restore as a PostgreSQL superuser.
"



--
Adrian Klaver
adrian.klaver@gmail.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: Dump a database excluding one table DATA?

From
Dmitry Koterov
Date:
Mmm, --disable-triggers is not surely enough - we also have RULEs and (much worse) INDEXes.

If we create all indices and then restore all data, it is MUCH SLOWER than restore the data first and then - create all indices.
So I think that there is no work-around really...

I propose to include an option to pg_dump to skip several tables data restoration. :-)



On Fri, Aug 19, 2011 at 12:44 AM, Dmitry Koterov <dmitry@koterov.ru> wrote:
Thanks, "pg_dump --data-only --disable-triggers" is the king.

(Unfortunately it is not supported by pg_dumpall, but it is entirely another story. :-)


On Fri, Aug 19, 2011 at 12:36 AM, Adrian Klaver <adrian.klaver@gmail.com> wrote:
On Thursday, August 18, 2011 1:23:25 pm Dmitry Koterov wrote:
> 1. I need to shorten pg_dump results (for backup purposes), so pg_restore
> is too late for that..

>
> 2. If I use "pg_dump -s" separately, the data may not load (or load to
> slow) after that, because all indices/foreign keys are already there. Is
> there a way to split "pg_dump -s" into 2 parts: the first part dumps
> everything excluding indices, checks and foreign keys, and the second part
> - only them? Not sure it is possible at all, because I think pg_dump may
> dump data not between these two blocks of DDLs...
>

I am not sure I follow. Are you saying you eventually restore the data for that
table as a separate step? If so, from the previous link, this might help:

"
--disable-triggers

   This option is only relevant when performing a data-only restore. It
instructs pg_restore to execute commands to temporarily disable triggers on the
target tables while the data is reloaded. Use this if you have referential
integrity checks or other triggers on the tables that you do not want to invoke
during data reload.

   Presently, the commands emitted for --disable-triggers must be done as
superuser. So, you should also specify a superuser name with -S, or preferably
run pg_restore as a PostgreSQL superuser.
"



--
Adrian Klaver
adrian.klaver@gmail.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: Dump a database excluding one table DATA?

From
Adrian Klaver
Date:
On Thursday, August 18, 2011 3:25:59 pm Dmitry Koterov wrote:
> Mmm, --disable-triggers is not surely enough - we also have RULEs and (much
> worse) INDEXes.
>
> If we create all indices and then restore all data, it is MUCH SLOWER than
> restore the data first and then - create all indices.
> So I think that there is no work-around really...

Yes there is. Do the schema only dump of the table in question, comment out the
Indexes and rules you don't want. Load the data. Go back to the schema dump
comment out the table and uncomment the Indexes/rules,etc .
Or
Write a script that does the same thing automatically.


>
> I propose to include an option to pg_dump to skip several tables data
> restoration. :-)
>

I thought you where looking to exclude only one table. As you found out the
dependency issues for one table can be a problem. Doing more than one makes the
process even more brittle.

--
Adrian Klaver
adrian.klaver@gmail.com

Re: Dump a database excluding one table DATA?

From
Dimitri Fontaine
Date:
Dmitry Koterov <dmitry@koterov.ru> writes:
> Is there any way (or hack) to dump the whole database, but to exclude the
> DATA from a table within this dump? (DDL of the table should not be
> excluded: after restoring the data the excluded table should look "empty".)

The pg_staging tool allows you to do that quite easily, once you've done
the initial setup (involved, but nothing fancy).

  https://github.com/dimitri/pg_staging
  http://tapoueh.org/pgsql/pgstaging.html

If you want to exclude the table from the dumps though (data loss issues
are looking at you), what I would advice is create a dedicated schema
and skipping the whole schema at dump time, using the following option:

    --exclude-schema=schema
        Do   not  dump   any   schemas   matching  the   schema
        pattern. The  pattern is  interpreted according  to the
        same rules as  for -n.  -N can be given  more than once
        to exclude schemas matching any of several patterns.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support