Thread: BUG #7792: pg_dump does not treat -c flag correctly when using tar format

BUG #7792: pg_dump does not treat -c flag correctly when using tar format

From
emesika@redhat.com
Date:
The following bug has been logged on the website:

Bug reference:      7792
Logged by:          Eli Mesika
Email address:      emesika@redhat.com
PostgreSQL version: 9.1.7
Operating system:   Fedora 16
Description:        =


steps to reproduce (aasuming database name is : test)

1) pg_dump -F t -U postgres -f test.tar test
2) tar xvf test.tar  to any directory
3) vi restore.sql
* restore.sql includes DROP statements for each object even tough -c flag
was not given

repeat the above using plain-text format
1) pg_dump -F p -U postgres -f test.sql test
2) vi test.sql

This time test.sql does not include DROP staements for each object

* pg_dump should not produce DROP statements for each object if -c flag was
not given to the command
emesika@redhat.com writes:
> 1) pg_dump -F t -U postgres -f test.tar test
> 2) tar xvf test.tar  to any directory
> 3) vi restore.sql
> * restore.sql includes DROP statements for each object even tough -c flag
> was not given

I believe this is intentional - at least, pg_backup_tar.c goes out of
its way to make it happen.  (The forcible setting of ropt->dropSchema
in _CloseArchive is the cause, and it's hard to see why that would be
there unless the author intended this effect.)  Perhaps we should remove
that, but it would be an incompatible change.  Arguing for or against
it really requires a model of what people would be doing with the
restore.sql script.  I'm not entirely convinced that it should be
considered equivalent to what you'd get from a plain dump run.

            regards, tom lane

Re: BUG #7792: pg_dump does not treat -c flag correctly when using tar format

From
Magnus Hagander
Date:
On Mon, Jan 7, 2013 at 1:32 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> emesika@redhat.com writes:
>> 1) pg_dump -F t -U postgres -f test.tar test
>> 2) tar xvf test.tar  to any directory
>> 3) vi restore.sql
>> * restore.sql includes DROP statements for each object even tough -c flag
>> was not given
>
> I believe this is intentional - at least, pg_backup_tar.c goes out of
> its way to make it happen.  (The forcible setting of ropt->dropSchema
> in _CloseArchive is the cause, and it's hard to see why that would be
> there unless the author intended this effect.)  Perhaps we should remove
> that, but it would be an incompatible change.  Arguing for or against
> it really requires a model of what people would be doing with the
> restore.sql script.  I'm not entirely convinced that it should be
> considered equivalent to what you'd get from a plain dump run.

tar archives are more like custom archives ,arne't they? The idea
being that since you use pg_restore to restore them, you should use -c
on the pg_restore commandline, and not on the pg_dump one? And if we
didn't include it in the dump, that wouldn't even be possible.

I definitely think it's doing the right thing, but perhaps it's
something that needs to be clarified in the documentation if people
find it confusing.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
Magnus Hagander <magnus@hagander.net> writes:
> On Mon, Jan 7, 2013 at 1:32 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> emesika@redhat.com writes:
>>> * restore.sql includes DROP statements for each object even tough -c flag
>>> was not given

>> I believe this is intentional - at least, pg_backup_tar.c goes out of
>> its way to make it happen.  (The forcible setting of ropt->dropSchema
>> in _CloseArchive is the cause, and it's hard to see why that would be
>> there unless the author intended this effect.)  Perhaps we should remove
>> that, but it would be an incompatible change.  Arguing for or against
>> it really requires a model of what people would be doing with the
>> restore.sql script.  I'm not entirely convinced that it should be
>> considered equivalent to what you'd get from a plain dump run.

> tar archives are more like custom archives ,arne't they? The idea
> being that since you use pg_restore to restore them, you should use -c
> on the pg_restore commandline, and not on the pg_dump one? And if we
> didn't include it in the dump, that wouldn't even be possible.

No, pg_restore depends only on what is in the TOC data structure.
The restore.sql script is just an auxiliary file that's there if you
extract the contents of the tar file --- pg_restore doesn't use it.

On reviewing the commit history, I find that pg_backup_tar.c was created
in commit c3e18804ff14f690a6d8c31b452476d0f8fcec28, and in that version
the code looked like

        ropt = NewRestoreOptions();
        ropt->dropSchema = 1;
        ropt->compression = 0;

        RestoreArchive((Archive*)AH, ropt);

where the recursive call to RestoreArchive is what scans the TOC and
prints the contents of restore.sql.  I am not sure that in 2000 the
function even had the opportunity to pass down the outer restore options
to the recursive call, but it certainly didn't try.  Somewhere along the
line we added a link to RestoreOptions to the ArchiveHandle data
structure, and since last year (commit
4317e0246c645f60c39e6572644cff1cb03b4c65) the code looks like

         ropt = NewRestoreOptions();
+        memcpy(ropt, AH->ropt, sizeof(RestoreOptions));
         ropt->dropSchema = 1;
         ropt->compression = 0;
         ropt->superuser = NULL;
         ropt->suppressDumpWarnings = true;

+        savRopt = AH->ropt;
+        AH->ropt = ropt;
+
         savVerbose = AH->public.verbose;
         AH->public.verbose = 0;

-        RestoreArchive((Archive *) AH, ropt);
+        RestoreArchive((Archive *) AH);

+        AH->ropt = savRopt;
         AH->public.verbose = savVerbose;

So at this point passing down options is the norm and forcibly setting
dropSchema is the exception.  (The forced clearing of superuser seems a
bit debatable now too.)  Maybe we should change it, but it would
definitely be a user-visible behavioral change.

            regards, tom lane

Re: BUG #7792: pg_dump does not treat -c flag correctly when using tar format

From
Magnus Hagander
Date:
On Mon, Jan 7, 2013 at 4:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> On Mon, Jan 7, 2013 at 1:32 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> emesika@redhat.com writes:
>>>> * restore.sql includes DROP statements for each object even tough -c flag
>>>> was not given
>
>>> I believe this is intentional - at least, pg_backup_tar.c goes out of
>>> its way to make it happen.  (The forcible setting of ropt->dropSchema
>>> in _CloseArchive is the cause, and it's hard to see why that would be
>>> there unless the author intended this effect.)  Perhaps we should remove
>>> that, but it would be an incompatible change.  Arguing for or against
>>> it really requires a model of what people would be doing with the
>>> restore.sql script.  I'm not entirely convinced that it should be
>>> considered equivalent to what you'd get from a plain dump run.
>
>> tar archives are more like custom archives ,arne't they? The idea
>> being that since you use pg_restore to restore them, you should use -c
>> on the pg_restore commandline, and not on the pg_dump one? And if we
>> didn't include it in the dump, that wouldn't even be possible.
>
> No, pg_restore depends only on what is in the TOC data structure.
> The restore.sql script is just an auxiliary file that's there if you
> extract the contents of the tar file --- pg_restore doesn't use it.

Ah, so it's never used at all? Interesting.

Shows how often I use the tar format :D AFAIK it has no actual
advantage over custom format, but I guess including the restore.sql
file might be considered such an advantage.



--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
----- Original Message -----
> From: "Tom Lane" <tgl@sss.pgh.pa.us>
> To: emesika@redhat.com
> Cc: pgsql-bugs@postgresql.org
> Sent: Monday, January 7, 2013 2:32:06 AM
> Subject: Re: [BUGS] BUG #7792: pg_dump does not treat -c flag correctly when using tar format
>
> emesika@redhat.com writes:
> > 1) pg_dump -F t -U postgres -f test.tar test
> > 2) tar xvf test.tar  to any directory
> > 3) vi restore.sql
> > * restore.sql includes DROP statements for each object even tough
> > -c flag
> > was not given
>
> I believe this is intentional - at least, pg_backup_tar.c goes out of
> its way to make it happen.  (The forcible setting of ropt->dropSchema
> in _CloseArchive is the cause, and it's hard to see why that would be
> there unless the author intended this effect.)  Perhaps we should
> remove
> that, but it would be an incompatible change.  Arguing for or against
> it really requires a model of what people would be doing with the
> restore.sql script.  I'm not entirely convinced that it should be
> considered equivalent to what you'd get from a plain dump run.

Tom, thanks for the quick respond.
I find that very confusing:
Currently, in order to use the pg_restore (tar format), you will have to create first the database with the same schema
youare going to restore while the most convenient usage is to create a new empty database and run the restore.  
Also, that contradict with the pg_dump -C flag that adds database creation statement and it is clear that DROP
statementswill fail on a clean database. 

Regards
Eli



>
>             regards, tom lane
>