Thread: pg_dump --exclude-table-data
Attached is an undocumented patch that allows a user to have pg_dump exclude data but not DDL for a table. One use case for this is a very large table that changes infrequently, and for which dumping data frequently would be wasteful and unnecessary. This is especially useful in conjunction with another patch (see next email) to do post-data items only or omit post-data items in pg_restore. For those who are (like my clients :-) ) anxious to get their hands on this immediately, a backport patch is also attached which applies to 9.0 sources, and applies with offsets to 8.4 sources. cheers andrew
Attachment
> For those who are (like my clients :-) ) anxious to get their hands on > this immediately, a backport patch is also attached which applies to 9.0 > sources, and applies with offsets to 8.4 sources. FWIW, I have immediate use for this in creating cut-down versions of databases for testing purposes. It'll eliminate a couple pages of shell scripts for me. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Wed, Aug 24, 2011 at 2:01 PM, Josh Berkus <josh@agliodbs.com> wrote: > > FWIW, I have immediate use for this in creating cut-down versions of > databases for testing purposes. It'll eliminate a couple pages of shell > scripts for me. Speaking of "cut-down versions", I have recently been using pg_sample, and been happy with the resulting subset database. I created a db <10 GB in size from a source db ~600Gb in a few minutes. https://github.com/mla/pg_sample
On 08/24/2011 05:01 PM, Josh Berkus wrote: >> For those who are (like my clients :-) ) anxious to get their hands on >> this immediately, a backport patch is also attached which applies to 9.0 >> sources, and applies with offsets to 8.4 sources. > FWIW, I have immediate use for this in creating cut-down versions of > databases for testing purposes. It'll eliminate a couple pages of shell > scripts for me. > OK, this seems to have some pluses and no negative comments, so it seems worth going forward. Do we want an equivalent pg_restore option? cheers andrew
> OK, this seems to have some pluses and no negative comments, so it seems > worth going forward. Do we want an equivalent pg_restore option? I'm not sure it's *as* important for pg_restore, since I can easily usea manifest to avoid restoring data for a single table. So I guess it's a question of "how hard is it to add it?" -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On 09/02/2011 03:15 PM, Josh Berkus wrote: >> OK, this seems to have some pluses and no negative comments, so it seems >> worth going forward. Do we want an equivalent pg_restore option? > I'm not sure it's *as* important for pg_restore, since I can easily use > a manifest to avoid restoring data for a single table. So I guess it's > a question of "how hard is it to add it?" > The short answer is "more work than I want to put in to this." pg_restore doesn't have any of pg_dump's infrastructure for handling table name patterns, nor for excluding tables. So I think all that would remain a TODO. (A good beginner project, maybe). A slightly updated patch is attached, the main change being that I removed use of a short option and only support the long name option. "-D" didn't seem sufficiently mnemonic to me. I'll add this to the November commitfest, but I'd like to get it committed ASAP as it will simplify setting up the -pre and -post data patches. cheers andrew
Attachment
On Wed, Nov 2, 2011 at 6:02 PM, Andrew Dunstan <andrew@dunslane.net> wrote: > On 09/02/2011 03:15 PM, Josh Berkus wrote: >>> >>> OK, this seems to have some pluses and no negative comments, so it seems >>> worth going forward. Do we want an equivalent pg_restore option? >> >> I'm not sure it's *as* important for pg_restore, since I can easily use >> a manifest to avoid restoring data for a single table. So I guess it's >> a question of "how hard is it to add it?" >> > > The short answer is "more work than I want to put in to this." pg_restore > doesn't have any of pg_dump's infrastructure for handling table name > patterns, nor for excluding tables. So I think all that would remain a TODO. > (A good beginner project, maybe). > > A slightly updated patch is attached, the main change being that I removed > use of a short option and only support the long name option. "-D" didn't > seem sufficiently mnemonic to me. I'll add this to the November commitfest, > but I'd like to get it committed ASAP as it will simplify setting up the > -pre and -post data patches. Instead of: do NOT dump data for the named table(s) How about: dump only schema for the named table(s) I'm also a bit concerned about the relationship between this and the existing -s option. It seems odd that you use --schema-only to get the behavior database-wide, and --exclude-table-data to get it for just one table. Is there some way we can make that a bit more consistent? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 11/04/2011 10:21 AM, Robert Haas wrote: > >> A slightly updated patch is attached, the main change being that I removed >> use of a short option and only support the long name option. "-D" didn't >> seem sufficiently mnemonic to me. I'll add this to the November commitfest, >> but I'd like to get it committed ASAP as it will simplify setting up the >> -pre and -post data patches. > Instead of: > > do NOT dump data for the named table(s) > > How about: > > dump only schema for the named table(s) I have no great objection to the wording change. > I'm also a bit concerned about the relationship between this and the > existing -s option. It seems odd that you use --schema-only to get > the behavior database-wide, and --exclude-table-data to get it for > just one table. Is there some way we can make that a bit more > consistent? > It's consistent, and was designed to be, with the --exclude-table option. I'm not sure what you want it to look like instead. But TBH I'm more interested in getting the functionality than in how it's spelled. cheers andrew
On Wed, Dec 7, 2011 at 10:19 PM, Andrew Dunstan <andrew@dunslane.net> wrote: >> I'm also a bit concerned about the relationship between this and the >> existing -s option. It seems odd that you use --schema-only to get >> the behavior database-wide, and --exclude-table-data to get it for >> just one table. Is there some way we can make that a bit more >> consistent? > > It's consistent, and was designed to be, with the --exclude-table option. > I'm not sure what you want it to look like instead. But TBH I'm more > interested in getting the functionality than in how it's spelled. Ah, hmm. Well, maybe it's fine the way that you have it. But I'd be tempted to at least add a sentence to the sgml documentation for each option referring to the other, e.g. "To dump only schema for all tables in the database, see --schema-only." and "To exclude table data for only a subset of tables in the database, see --exclude-table-data.", or something along those lines. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 12/08/2011 11:13 AM, Robert Haas wrote: > On Wed, Dec 7, 2011 at 10:19 PM, Andrew Dunstan<andrew@dunslane.net> wrote: >>> I'm also a bit concerned about the relationship between this and the >>> existing -s option. It seems odd that you use --schema-only to get >>> the behavior database-wide, and --exclude-table-data to get it for >>> just one table. Is there some way we can make that a bit more >>> consistent? >> It's consistent, and was designed to be, with the --exclude-table option. >> I'm not sure what you want it to look like instead. But TBH I'm more >> interested in getting the functionality than in how it's spelled. > Ah, hmm. Well, maybe it's fine the way that you have it. But I'd be > tempted to at least add a sentence to the sgml documentation for each > option referring to the other, e.g. "To dump only schema for all > tables in the database, see --schema-only." and "To exclude table data > for only a subset of tables in the database, see > --exclude-table-data.", or something along those lines. > Sure, no problem with that. cheers andrew
On 12/08/2011 11:36 AM, Andrew Dunstan wrote: > > > On 12/08/2011 11:13 AM, Robert Haas wrote: >> Ah, hmm. Well, maybe it's fine the way that you have it. But I'd be >> tempted to at least add a sentence to the sgml documentation for each >> option referring to the other, e.g. "To dump only schema for all >> tables in the database, see --schema-only." and "To exclude table data >> for only a subset of tables in the database, see >> --exclude-table-data.", or something along those lines. >> > > Sure, no problem with that. > > Revised patch attached. cheers andrew
Attachment
Not sure that I have a lot to add here, but I am officially listed as a reviewer, which is a responsibility that I don't want to shirk. In my opinion, this patch is obviously useful. I don't find the asymmetry that it will create with pg_restore to be troubling, so I'd favour committing it as-is. Extremely minor problem noted: There are two spaces at the start of one sentence in your SGML doc updates. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
On 12/14/2011 06:28 AM, Peter Geoghegan wrote: > Not sure that I have a lot to add here, but I am officially listed as > a reviewer, which is a responsibility that I don't want to shirk. > > In my opinion, this patch is obviously useful. I don't find the > asymmetry that it will create with pg_restore to be troubling, so I'd > favour committing it as-is. > > Extremely minor problem noted: There are two spaces at the start of > one sentence in your SGML doc updates. > Thanks. Committed with that changed, although we seem to be getting altogether too obsessive about white space, IMNSHO. cheers andrew
On 14 December 2011 14:31, Andrew Dunstan <andrew@dunslane.net> wrote: > Thanks. Committed with that changed, although we seem to be getting > altogether too obsessive about white space, IMNSHO. I agree, but I think it's important that we judge patches by a consistent standard. Right now, for better or worse, that standard includes being obsessed with white space. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
On Dec 14, 2011, at 6:31 AM, Andrew Dunstan wrote: > Thanks. Committed with that changed, although we seem to be getting altogether too obsessive about white space, IMNSHO. If that’s all there is to complain about, I think it’s a pretty good sign. ;-P David