Thread: [PATCH] Fix pg_dump --no-tablespaces for the custom format

[PATCH] Fix pg_dump --no-tablespaces for the custom format

From
Christopher Baines
Date:
Hey,

So I'm new to poking around in the PostgreSQL code, so this is a bit of
a shot in the dark. I'm having some problems with pg_dump, and a
database with tablespaces. A couple of the tables are not in the default
tablespace, and I want to ignore this for the dump.

Looking at the pg_dump --help, there seems to be a perfect option for
this:

  --no-tablespaces             do not dump tablespace assignments

This seems to work fine when using the plain text format, but I'm using
the custom format, and that seems to break the effect of
--no-tablespaces.

Looking at the code, I think I've managed to determine a place where
this behaviour can be changed, and so I've attached a draft patch [1].

Is this an actual problem, and if so, am I anywhere near the right place
in the code in terms of addressing it?

Thanks,

Chris

1:
From 124c10e3eb9f530a42bf294c95ee14b723ca2878 Mon Sep 17 00:00:00 2001
From: Christopher Baines <mail@cbaines.net>
Date: Fri, 15 May 2020 20:57:26 +0100
Subject: [PATCH] Fix pg_dump --no-tablespaces for the custom format

---
 src/bin/pg_dump/pg_dump.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index f33c2463a7..ea35984a94 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -16412,7 +16412,8 @@ dumpTableSchema(Archive *fout, TableInfo *tbinfo)
                      ARCHIVE_OPTS(.tag = tbinfo->dobj.name,
                                   .namespace = tbinfo->dobj.namespace->dobj.name,
                                   .tablespace = (tbinfo->relkind == RELKIND_VIEW) ?
-                                  NULL : tbinfo->reltablespace,
+                                  NULL : (dopt->outputNoTablespaces ?
+                                          NULL : tbinfo->reltablespace),
                                   .tableam = tableam,
                                   .owner = tbinfo->rolname,
                                   .description = reltypename,
--
2.26.2


Attachment

Re: [PATCH] Fix pg_dump --no-tablespaces for the custom format

From
Tom Lane
Date:
Christopher Baines <mail@cbaines.net> writes:
> So I'm new to poking around in the PostgreSQL code, so this is a bit of
> a shot in the dark. I'm having some problems with pg_dump, and a
> database with tablespaces. A couple of the tables are not in the default
> tablespace, and I want to ignore this for the dump.

I think you've misunderstood how the pieces fit together.  A lot of
the detail-filtering switches, including --no-tablespaces, work on
the output side of the "archive" format.  While you can't really tell
the difference in pg_dump text mode, the implication for custom-format
output is that the info is always there in the archive file, and you
give the switch to pg_restore if you don't want to see the info.
This is more flexible since you aren't compelled to make the decision
up-front, and it doesn't really cost anything to include such info in
the archive.  (Obviously, table-filtering switches don't work that
way, since with those there can be a really large cost in file size
to include unwanted data.)

So from my perspective, things are working fine and this patch would
break it.

If you actually want to suppress this info from getting into the
archive file, you'd have to give a very compelling reason for
breaking this behavior for other people.

            regards, tom lane



Re: [PATCH] Fix pg_dump --no-tablespaces for the custom format

From
Christopher Baines
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Christopher Baines <mail@cbaines.net> writes:
>> So I'm new to poking around in the PostgreSQL code, so this is a bit of
>> a shot in the dark. I'm having some problems with pg_dump, and a
>> database with tablespaces. A couple of the tables are not in the default
>> tablespace, and I want to ignore this for the dump.
>
> I think you've misunderstood how the pieces fit together.  A lot of
> the detail-filtering switches, including --no-tablespaces, work on
> the output side of the "archive" format.  While you can't really tell
> the difference in pg_dump text mode, the implication for custom-format
> output is that the info is always there in the archive file, and you
> give the switch to pg_restore if you don't want to see the info.
> This is more flexible since you aren't compelled to make the decision
> up-front, and it doesn't really cost anything to include such info in
> the archive.  (Obviously, table-filtering switches don't work that
> way, since with those there can be a really large cost in file size
> to include unwanted data.)
>
> So from my perspective, things are working fine and this patch would
> break it.
>
> If you actually want to suppress this info from getting into the
> archive file, you'd have to give a very compelling reason for
> breaking this behavior for other people.

Thanks for getting back to me Tom :)

I don't really follow how having it do something more along the lines of
how it's documented would both be less flexible and break existing uses
of pg_dump. You're not prevented from including tablespace assignments,
just don't pass --no-tablespaces, and your now able to not include them
for the archive formats, just like the plain format, which in my view
increases the flexibility of the tool, since something new is possible.

I realise that for people who are passing --no-tablespaces, without
realising it does nothing combined with the archive formats, that
actually not including tablespace assignments will change the behaviour
for them, but as above, I'd see this as a positive change, as it makes
the tooling more powerful (and simpler to understand as well).

I see now that while the --help output doesn't capture the nuances:

  --no-tablespaces             do not dump tablespace assignments

The documentation does:

  --no-tablespaces

    Do not output commands to select tablespaces. With this option, all
    objects will be created in whichever tablespace is the default
    during restore.

    This option is only meaningful for the plain-text format. For the
    archive formats, you can specify the option when you call
    pg_restore.


Thanks again,

Chris

Attachment

Re: [PATCH] Fix pg_dump --no-tablespaces for the custom format

From
Euler Taveira
Date:
On Sat, 16 May 2020 at 04:20, Christopher Baines <mail@cbaines.net> wrote:

Tom Lane <tgl@sss.pgh.pa.us> writes:

> Christopher Baines <mail@cbaines.net> writes:
>> So I'm new to poking around in the PostgreSQL code, so this is a bit of
>> a shot in the dark. I'm having some problems with pg_dump, and a
>> database with tablespaces. A couple of the tables are not in the default
>> tablespace, and I want to ignore this for the dump.
>
> I think you've misunderstood how the pieces fit together.  A lot of
> the detail-filtering switches, including --no-tablespaces, work on
> the output side of the "archive" format.  While you can't really tell
> the difference in pg_dump text mode, the implication for custom-format
> output is that the info is always there in the archive file, and you
> give the switch to pg_restore if you don't want to see the info.
> This is more flexible since you aren't compelled to make the decision
> up-front, and it doesn't really cost anything to include such info in
> the archive.  (Obviously, table-filtering switches don't work that
> way, since with those there can be a really large cost in file size
> to include unwanted data.)
>
I've also had to explain a dozen times how the archive format works. Archive
format is kind of intermediary format because you can produce a plain format
using it.

[Testing some pg_dump --no-option ...]

The following objects are not included if a --no-option is used:

* grant / revoke
* comment
* publication
* subscription
* security labels

but some are included even if --no-option is used:

* owner
* tablespace

I'm wondering why there is such a distinction. We have some options:

(a) leave it as is and document that those 2 options has no effect in pg_dump
and possibly add a warning to report if someone uses it with an archive format;
(b) exclude owner and tablespace from archive (it breaks compatibility but do
exactly what users expect).

I do not even consider a possibility to include all objects even if a
--no-option is used because you will have a bunch of complaints / reports.


--
Euler Taveira                 http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [PATCH] Fix pg_dump --no-tablespaces for the custom format

From
Tom Lane
Date:
Euler Taveira <euler.taveira@2ndquadrant.com> writes:
> I'm wondering why there is such a distinction. We have some options:

> (a) leave it as is and document that those 2 options has no effect in
> pg_dump
> and possibly add a warning to report if someone uses it with an archive
> format;
> (b) exclude owner and tablespace from archive (it breaks compatibility but
> do
> exactly what users expect).

I think throwing a warning saying "this option does nothing" might be
a reasonable change.

I think (b) would be a break in the archive format, with unclear
consequences, so I'm not in favor of that.

            regards, tom lane



Re: [PATCH] Fix pg_dump --no-tablespaces for the custom format

From
Euler Taveira
Date:
On Sat, 16 May 2020 at 16:31, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I think (b) would be a break in the archive format, with unclear
consequences, so I'm not in favor of that.

I came to the same conclusion while inspecting the source code.


--
Euler Taveira                 http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services