Thread: pg_dump insert transactions

pg_dump insert transactions

From
Kris Jurka
Date:
This patch makes pg_dump wrap insert dumps (-d, -D) in a transaction to
speed the restore process.  It also adds an option to disable these
transactions for doing things like copying only newly inserted values from
one database to another (assuming old ones will fail on an unique
constraint).

Kris Jurka

Attachment

Re: pg_dump insert transactions

From
Tom Lane
Date:
Kris Jurka <books@ejurka.com> writes:
> This patch makes pg_dump wrap insert dumps (-d, -D) in a transaction to
> speed the restore process.

What's the point of this, compared to Simon's recent patch adding an
option to wrap the whole output in one big transaction?  An intermediate
level doesn't seem very interesting ... and lord knows pg_dump has an
unreasonable number of options already.

> It also adds an option to disable these
> transactions for doing things like copying only newly inserted values from
> one database to another (assuming old ones will fail on an unique
> constraint).

I would think that would be the main use-case for still using -d/-D at
all (instead of COPY), so this seems of dubious usefulness, and
certainly not what should be the default.

            regards, tom lane

Re: pg_dump insert transactions

From
Bruce Momjian
Date:
Tom Lane wrote:
> Kris Jurka <books@ejurka.com> writes:
> > This patch makes pg_dump wrap insert dumps (-d, -D) in a transaction to
> > speed the restore process.
>
> What's the point of this, compared to Simon's recent patch adding an
> option to wrap the whole output in one big transaction?  An intermediate
> level doesn't seem very interesting ... and lord knows pg_dump has an
> unreasonable number of options already.
>
> > It also adds an option to disable these
> > transactions for doing things like copying only newly inserted values from
> > one database to another (assuming old ones will fail on an unique
> > constraint).
>
> I would think that would be the main use-case for still using -d/-D at
> all (instead of COPY), so this seems of dubious usefulness, and
> certainly not what should be the default.

I have added a patch to document that pg_dump -d/-D throw an error on
invalid data, but allow other INSERTs to continue.

Also, we do have this TODO item:

        o %Have pg_dump use multi-statement transactions for INSERT dumps

It always bothered me that INSERT did not have the same behavior as
non-INSERT, but if we document the difference, and people find it
useful, we should just keep it.  TODO item removed.

--
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: doc/src/sgml/ref/pg_dump.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/ref/pg_dump.sgml,v
retrieving revision 1.82
diff -c -c -r1.82 pg_dump.sgml
*** doc/src/sgml/ref/pg_dump.sgml    17 Mar 2006 16:02:47 -0000    1.82
--- doc/src/sgml/ref/pg_dump.sgml    15 Apr 2006 17:59:33 -0000
***************
*** 173,178 ****
--- 173,181 ----
          non-<productname>PostgreSQL</productname> databases.  Note that
          the restore may fail altogether if you have rearranged column order.
          The <option>-D</option> option is safer, though even slower.
+         Also, while this option generates errors for invalid data,
+         it allows other <command>INSERT</command>s to continue loading
+         data into the table.
         </para>
        </listitem>
       </varlistentry>
***************
*** 190,195 ****
--- 193,201 ----
          ...</literal>).  This will make restoration very slow; it is mainly
          useful for making dumps that can be loaded into
          non-<productname>PostgreSQL</productname> databases.
+         Also, while this option generates errors for invalid data,
+         it allows other <command>INSERT</command>s to continue loading
+         data into the table.
         </para>
        </listitem>
       </varlistentry>