Re: [GENERAL] pg_dump behaves differently for different archive formats - Mailing list pgsql-hackers

From Albe Laurenz
Subject Re: [GENERAL] pg_dump behaves differently for different archive formats
Date
Msg-id A737B7A37273E048B164557ADEF4A58B17D206CF@ntex2010i.host.magwien.gv.at
Whole thread Raw
Responses Re: Re: [GENERAL] pg_dump behaves differently for different archive formats  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom Lane wrote on Dec 16, 2013:
> Albe Laurenz <laurenz.albe@wien.gv.at> writes:
>> Restoring a "plain format" dump and a "custom format" dump of
>> the same database can lead to different results:

>> pg_dump organizes the SQL statements it creates in "TOC entries".
>> If a custom format dump is restored with pg_restore, all
>> SQL statements in a TOC entry will be executed as a single command
>> and thus in a single transaction.

> Yeah, this is a bug I think.  pg_dump was designed around the idea
> that the output would be executed as a simple script, and in a
> number of places there's an expectation that one SQL statement
> can fail without affecting following ones.  So if pg_restore can't
> provide that behavior it's not good.
> 
> On the other hand, I'm not sure how much enthusiasm there'd be for
> complex or fragile changes to fix this.  A lot of people invariably
> run restores in single-transaction mode and don't really care about
> fault-tolerant restores.  Also, it's easy enough to dodge the problem
> if you must: just pipe the output into psql rather than
> direct-to-database.
> 
> So to me the question is can we fix this without doing something like
> duplicating psql's lexer?  If we have to parse out the statements
> contained in each text blob, it's probably going to be too painful.
> Some cautionary history about this sort of thing can be read at
> http://www.postgresql.org/message-id/flat/18006.1325700782@sss.pgh.pa.us

I thought that changing the dump format for this would be too
much trouble, so I came up with the attached.

It assumes that custom- or tar-format archives are written by pg_dump
and cannot contain arbitrary SQL statements, which allows me to get away
with very simple parsing.

If this is not shot down immediately on account of fragility, I'd
add it to the next commitfest page.

The problem has been a pain point for my co-workers in the past;
using single-transaction mode doesn't work for us, since we have custom objects
in our template database that cause expected errors when a dump is restored.

Yours,
Laurenz Albe

Attachment

pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: Audit of logout
Next
From: Tom Lane
Date:
Subject: Re: Re: [GENERAL] pg_dump behaves differently for different archive formats