Thread: [PATCH] Fix pg_dump --no-tablespaces for the custom format
Hey, So I'm new to poking around in the PostgreSQL code, so this is a bit of a shot in the dark. I'm having some problems with pg_dump, and a database with tablespaces. A couple of the tables are not in the default tablespace, and I want to ignore this for the dump. Looking at the pg_dump --help, there seems to be a perfect option for this: --no-tablespaces do not dump tablespace assignments This seems to work fine when using the plain text format, but I'm using the custom format, and that seems to break the effect of --no-tablespaces. Looking at the code, I think I've managed to determine a place where this behaviour can be changed, and so I've attached a draft patch [1]. Is this an actual problem, and if so, am I anywhere near the right place in the code in terms of addressing it? Thanks, Chris 1: From 124c10e3eb9f530a42bf294c95ee14b723ca2878 Mon Sep 17 00:00:00 2001 From: Christopher Baines <mail@cbaines.net> Date: Fri, 15 May 2020 20:57:26 +0100 Subject: [PATCH] Fix pg_dump --no-tablespaces for the custom format --- src/bin/pg_dump/pg_dump.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c index f33c2463a7..ea35984a94 100644 --- a/src/bin/pg_dump/pg_dump.c +++ b/src/bin/pg_dump/pg_dump.c @@ -16412,7 +16412,8 @@ dumpTableSchema(Archive *fout, TableInfo *tbinfo) ARCHIVE_OPTS(.tag = tbinfo->dobj.name, .namespace = tbinfo->dobj.namespace->dobj.name, .tablespace = (tbinfo->relkind == RELKIND_VIEW) ? - NULL : tbinfo->reltablespace, + NULL : (dopt->outputNoTablespaces ? + NULL : tbinfo->reltablespace), .tableam = tableam, .owner = tbinfo->rolname, .description = reltypename, -- 2.26.2
Attachment
Christopher Baines <mail@cbaines.net> writes: > So I'm new to poking around in the PostgreSQL code, so this is a bit of > a shot in the dark. I'm having some problems with pg_dump, and a > database with tablespaces. A couple of the tables are not in the default > tablespace, and I want to ignore this for the dump. I think you've misunderstood how the pieces fit together. A lot of the detail-filtering switches, including --no-tablespaces, work on the output side of the "archive" format. While you can't really tell the difference in pg_dump text mode, the implication for custom-format output is that the info is always there in the archive file, and you give the switch to pg_restore if you don't want to see the info. This is more flexible since you aren't compelled to make the decision up-front, and it doesn't really cost anything to include such info in the archive. (Obviously, table-filtering switches don't work that way, since with those there can be a really large cost in file size to include unwanted data.) So from my perspective, things are working fine and this patch would break it. If you actually want to suppress this info from getting into the archive file, you'd have to give a very compelling reason for breaking this behavior for other people. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > Christopher Baines <mail@cbaines.net> writes: >> So I'm new to poking around in the PostgreSQL code, so this is a bit of >> a shot in the dark. I'm having some problems with pg_dump, and a >> database with tablespaces. A couple of the tables are not in the default >> tablespace, and I want to ignore this for the dump. > > I think you've misunderstood how the pieces fit together. A lot of > the detail-filtering switches, including --no-tablespaces, work on > the output side of the "archive" format. While you can't really tell > the difference in pg_dump text mode, the implication for custom-format > output is that the info is always there in the archive file, and you > give the switch to pg_restore if you don't want to see the info. > This is more flexible since you aren't compelled to make the decision > up-front, and it doesn't really cost anything to include such info in > the archive. (Obviously, table-filtering switches don't work that > way, since with those there can be a really large cost in file size > to include unwanted data.) > > So from my perspective, things are working fine and this patch would > break it. > > If you actually want to suppress this info from getting into the > archive file, you'd have to give a very compelling reason for > breaking this behavior for other people. Thanks for getting back to me Tom :) I don't really follow how having it do something more along the lines of how it's documented would both be less flexible and break existing uses of pg_dump. You're not prevented from including tablespace assignments, just don't pass --no-tablespaces, and your now able to not include them for the archive formats, just like the plain format, which in my view increases the flexibility of the tool, since something new is possible. I realise that for people who are passing --no-tablespaces, without realising it does nothing combined with the archive formats, that actually not including tablespace assignments will change the behaviour for them, but as above, I'd see this as a positive change, as it makes the tooling more powerful (and simpler to understand as well). I see now that while the --help output doesn't capture the nuances: --no-tablespaces do not dump tablespace assignments The documentation does: --no-tablespaces Do not output commands to select tablespaces. With this option, all objects will be created in whichever tablespace is the default during restore. This option is only meaningful for the plain-text format. For the archive formats, you can specify the option when you call pg_restore. Thanks again, Chris
Attachment
On Sat, 16 May 2020 at 04:20, Christopher Baines <mail@cbaines.net> wrote:
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Christopher Baines <mail@cbaines.net> writes:
>> So I'm new to poking around in the PostgreSQL code, so this is a bit of
>> a shot in the dark. I'm having some problems with pg_dump, and a
>> database with tablespaces. A couple of the tables are not in the default
>> tablespace, and I want to ignore this for the dump.
>
> I think you've misunderstood how the pieces fit together. A lot of
> the detail-filtering switches, including --no-tablespaces, work on
> the output side of the "archive" format. While you can't really tell
> the difference in pg_dump text mode, the implication for custom-format
> output is that the info is always there in the archive file, and you
> give the switch to pg_restore if you don't want to see the info.
> This is more flexible since you aren't compelled to make the decision
> up-front, and it doesn't really cost anything to include such info in
> the archive. (Obviously, table-filtering switches don't work that
> way, since with those there can be a really large cost in file size
> to include unwanted data.)
>
I've also had to explain a dozen times how the archive format works. Archive
format is kind of intermediary format because you can produce a plain format
using it.
[Testing some pg_dump --no-option ...]
The following objects are not included if a --no-option is used:
* grant / revoke
* comment
* publication
* subscription
* security labels
but some are included even if --no-option is used:
* owner
* tablespace
I'm wondering why there is such a distinction. We have some options:
(a) leave it as is and document that those 2 options has no effect in pg_dump
and possibly add a warning to report if someone uses it with an archive format;
(b) exclude owner and tablespace from archive (it breaks compatibility but do
exactly what users expect).
I do not even consider a possibility to include all objects even if a
--no-option is used because you will have a bunch of complaints / reports.
format is kind of intermediary format because you can produce a plain format
using it.
[Testing some pg_dump --no-option ...]
The following objects are not included if a --no-option is used:
* grant / revoke
* comment
* publication
* subscription
* security labels
but some are included even if --no-option is used:
* owner
* tablespace
I'm wondering why there is such a distinction. We have some options:
(a) leave it as is and document that those 2 options has no effect in pg_dump
and possibly add a warning to report if someone uses it with an archive format;
(b) exclude owner and tablespace from archive (it breaks compatibility but do
exactly what users expect).
I do not even consider a possibility to include all objects even if a
--no-option is used because you will have a bunch of complaints / reports.
Euler Taveira http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Euler Taveira <euler.taveira@2ndquadrant.com> writes: > I'm wondering why there is such a distinction. We have some options: > (a) leave it as is and document that those 2 options has no effect in > pg_dump > and possibly add a warning to report if someone uses it with an archive > format; > (b) exclude owner and tablespace from archive (it breaks compatibility but > do > exactly what users expect). I think throwing a warning saying "this option does nothing" might be a reasonable change. I think (b) would be a break in the archive format, with unclear consequences, so I'm not in favor of that. regards, tom lane
On Sat, 16 May 2020 at 16:31, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I think (b) would be a break in the archive format, with unclear
consequences, so I'm not in favor of that.
I came to the same conclusion while inspecting the source code.
Euler Taveira http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services