Thread: pg_dump exclusion switches and functions/types
Testing out the new pg_dump exclusion switches I've found that excluding a table means that no functions or types will be dumped. Excluding one table shouldn't exclude these objects. My real use case for this functionality is that I have a database that has 99% of its space used by one big table and I'd like to be able to dump the rest of the database quickly. If I lose function/type information it's useless. Kris Jurka
Kris Jurka <books@ejurka.com> writes: > Testing out the new pg_dump exclusion switches I've found that excluding a > table means that no functions or types will be dumped. Excluding one > table shouldn't exclude these objects. I tend to agree ... will see if I can make it happen. (I never did get around to reviewing that patch, anyway ...) One issue is what to do with procedural languages and large objects, which don't have any associated schema. If we treat them as being outside all schemas, we'd have semantics like this: dump the PLs and blobs unless one or more --schema switches appeared. Is that OK? regards, tom lane
Hi, Tom, Tom Lane wrote: > One issue is what to do with procedural languages and large objects, > which don't have any associated schema. If we treat them as being > outside all schemas, we'd have semantics like this: dump the PLs and > blobs unless one or more --schema switches appeared. Is that OK? Sounds fine. Is there a possibility to dump only those objects? Maybe --large-objects and --languages? Thanks, Markus -- Markus Schaber | Logical Tracking&Tracing International AG Dipl. Inf. | Software Development GIS Fight against software patents in Europe! www.ffii.org www.nosoftwarepatents.org
Tom Lane wrote: > Kris Jurka <books@ejurka.com> writes: >> Testing out the new pg_dump exclusion switches I've found that excluding a >> table means that no functions or types will be dumped. Excluding one >> table shouldn't exclude these objects. > > I tend to agree ... will see if I can make it happen. (I never did get > around to reviewing that patch, anyway ...) > > One issue is what to do with procedural languages and large objects, > which don't have any associated schema. If we treat them as being > outside all schemas, we'd have semantics like this: dump the PLs and > blobs unless one or more --schema switches appeared. Is that OK? Is there a reason why pg_dump can't do the --list/--use-list flags like pg_restore, or is it just a matter of round tuits? -- Richard Huxton Archonet Ltd
Richard Huxton wrote: > > Is there a reason why pg_dump can't do the --list/--use-list flags > like pg_restore, or is it just a matter of round tuits? > The major reason for having those features as I understand it was to help overcome dependency difficulties in dumps, which are now largely a thing of the past. However, ISTM that a similar facility for fine grained control could fairly easily be built into pg_dump. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > However, ISTM that a similar facility for fine grained control could > fairly easily be built into pg_dump. Yeah ... later. The way I envision it is that the schema-related switches are fine for selecting things at the level of whole schemas, and the table-related switches are fine for selecting individual tables, and what we lack are inclusion/exclusion switches that operate on other kinds of individual objects. Somebody can design and implement those later, if the itch strikes. What we have to do today is make sure that the interaction of schema and table switches is such that an extension in that direction will fit in naturally. regards, tom lane
Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > >> However, ISTM that a similar facility for fine grained control could >> fairly easily be built into pg_dump. >> > > Yeah ... later. > > The way I envision it is that the schema-related switches are fine for > selecting things at the level of whole schemas, and the table-related > switches are fine for selecting individual tables, and what we lack are > inclusion/exclusion switches that operate on other kinds of individual > objects. Somebody can design and implement those later, if the itch > strikes. What we have to do today is make sure that the interaction of > schema and table switches is such that an extension in that direction > will fit in naturally. > > > totally agree. cheers andrew
Kris Jurka <books@ejurka.com> writes: > Testing out the new pg_dump exclusion switches I've found that excluding a > table means that no functions or types will be dumped. Excluding one > table shouldn't exclude these objects. I've been chewing on this a bit and find that the existing patch has several behaviors that seem surprising. Considering just one type of inclusion/exclusion switches at a time (we'll get to the interaction of schema and table switches below), I think we can all agree without too much argument on these statements: * With no inclusion/exclusion switches, all objects except system objects should be dumped. * With only exclusion switches given, all objects except system objects and those matching at least one pattern should bedumped. * With only inclusion switches given, only those objects matching at least one pattern should be dumped (whether they aresystem objects or not). That last proviso might be debatable but on balance I think it's OK (for instance, "pg_dump --schema=information_schema" could be useful for debugging or documentation purposes). Where things get interesting is when you have both inclusion and exclusion switches given. The existing patch's behavior is that "the rightmost switch wins", ie, if an object's name matches more than one pattern then it is included or excluded according to the rightmost switch it matches. This is, erm, poorly documented, but it seems like useful behavior so I don't have an objection myself. The real question is what should happen to objects that don't match any of the switch patterns? (This is relevant to Kris' complaint because non-table objects should be treated the same as tables that don't match any table name inclusion/exclusion switches.) What I find in the existing code is that if an inclusion switch appears first: pg_dump -n 's.*' -N 'ss.*' ... then only schemas matching an inclusion switch (and not matching any later exclusion switch) are dumped. While if an exclusion switch appears first: pg_dump -N 's.*' -n 'ss.*' ... then all schemas are dumped except system schemas and those matching an exclusion switch (and not matching any later inclusion switch). So the "default" behavior for unmatched objects flips depending on switch order. This doesn't seem to satisfy the principle of least surprise, and it's certainly not adequately documented. It might be the most useful behavior though. I thought about the alternative rule that "if any inclusion switches appear at all, the default is not to dump" --- that is, an object must match at least one inclusion switch (and not match any later exclusion switch) to be dumped. But with that rule, exclusion switches before the first inclusion switch are actually useless. Has anyone got a better idea? Returning to the point about schema versus table selection switches, what we've got is that they are independent filters: to be dumped, a table must be in a schema selected by the schema inclusion/exclusion switches (if any), and it must have a name selected by the table inclusion/exclusion switches (if any). I think this is OK but it leads to the property that the order of -n/-N switches is relevant, and the order of -t/-T switches is relevant, but their order relative to each other is not relevant. This could be surprising. If you're still with me, the payoff is here: what are the rules for dumping non-table objects, given that there are no inclusion/exclusion switches for them (but we might want to add such later)? If only schema inclusion/exclusion switches are present, then it's relatively easy to say "dump objects that are in selected schemas" --- but what about objects that don't have a schema, such as PLs? And what about the case where table inclusion/exclusion switches are present? I said above that non-table objects should be treated the same way as unmatched tables, which I think is a necessary rule if we want to extend the set of switch types later. But that leads to the conclusion that "a non-table object is dumped unless a -t switch appears before any -T switches". Which strikes me as a mighty surprising behavior. I'm not sure what to do differently though. Lastly, as long as we're questioning the premises of this patch, I wonder about the choice to use regex pattern matching rules. The problem with regex is that to be upward-compatible with the old exact-match switch definitions, a switch value that doesn't contain any regex special characters is treated as an equality condition not a pattern, which makes for a discontinuity. For instance, "-t x" is treated like -t '^x$' while -t 'x.*y' doesn't get the anchors added. That's going to burn people. An alternative we could consider is to use LIKE patterns instead, but since underscore is a wildcard in LIKE, it's easy to imagine people getting burnt by that too. Or we could import the rather ad-hoc shell-wildcard-like rules used by psql's \d stuff. None of these are especially attractive :-( Comments? regards, tom lane
[Snip explanations] > Comments? Would it be reasonable to include one more switch: 'include dependencies' ? That would work like this: * first consider all to be included objects (possibly limited by the include switches); * if dependencies are included, add all dependent objects, plus non-schema objects (which arguably can be considered as dependencies for the whole data base); * remove all objects targeted by exclude switches; This way you won't have any dependency on the ordering, and you could consider all non-schema objects as dependencies, so they will only be included if dependencies are to be included. Excluding dependencies would be the default if any switches are specified, including otherwise (not sure how much is this of the principle of least surprise, but it would be backwards compatible). The scenario I most care about is to be able to make a complete data base dump (including non-schema objects) while excluding only a few tables. If I understood your explanations correctly, this would not be easily possible with the current implementation. Note that I have a patch (kludge ?) on the 8.1 pg_dump which does exactly this, it would be a pity if I would need to patch the 8.2 one again to do that... Cheers, Csaba.
Tom Lane wrote: > Lastly, as long as we're questioning the premises of this patch, > I wonder about the choice to use regex pattern matching rules. > The problem with regex is that to be upward-compatible with the old > exact-match switch definitions, a switch value that doesn't contain > any regex special characters is treated as an equality condition not > a pattern, which makes for a discontinuity. For instance, "-t x" is > treated like -t '^x$' while -t 'x.*y' doesn't get the anchors added. > That's going to burn people. An alternative we could consider is to > use LIKE patterns instead, but since underscore is a wildcard in LIKE, > it's easy to imagine people getting burnt by that too. Or we could > import the rather ad-hoc shell-wildcard-like rules used by psql's \d > stuff. None of these are especially attractive :-( > > Comments? > > 1. regexes, please. 2. I'd rather remove backwards compatibility than have the discontinuity. After all, users can anchor the expression pretty easily. If not, then let's use an alternate switch for the regexes, (I know we are running out of option space). cheers andrew
Csaba Nagy <nagy@ecircle-ag.com> writes: > Would it be reasonable to include one more switch: 'include > dependencies' ? We are two months past feature freeze ... adding entirely new features to pg_dump is *not* on the table for 8.2. What we need to do at the moment is make sure that the features we've got work sanely and won't create headaches for likely future extensions; but not actually implement those extensions. > The scenario I most care about is to be able to make a complete data > base dump (including non-schema objects) while excluding only a few > tables. Isn't this the same as Kris' complaint? Why do you need additional dependency analysis to do the above? regards, tom lane
Andrew Dunstan <andrew@dunslane.net> writes: > Tom Lane wrote: >> ... Or we could >> import the rather ad-hoc shell-wildcard-like rules used by psql's \d >> stuff. None of these are especially attractive :-( > 1. regexes, please. One argument that occurs to me for importing the psql code is that it's solved the problem of including a schema name in the pattern. It would be a lot nicer to say "-t schema.table" than to have to say "-t table -n schema". In particular this allows one to dump s1.foo and s2.bar without also getting s1.bar and s2.foo; a problem that is insoluble if we have only independent schema and table filters. I think that ideally one would only use the schema switches if one actually wanted a schema-by-schema dump, not as a wart on the side of the specific-object-selection switches. The psql code does allow you to get at most of the functionality of regexes... regards, tom lane
On Fri, Oct 06, 2006 at 11:54:51 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > The problem with regex is that to be upward-compatible with the old > exact-match switch definitions, a switch value that doesn't contain > any regex special characters is treated as an equality condition not > a pattern, which makes for a discontinuity. For instance, "-t x" is > treated like -t '^x$' while -t 'x.*y' doesn't get the anchors added. > That's going to burn people. An alternative we could consider is to > use LIKE patterns instead, but since underscore is a wildcard in LIKE, > it's easy to imagine people getting burnt by that too. Or we could > import the rather ad-hoc shell-wildcard-like rules used by psql's \d > stuff. None of these are especially attractive :-( > > Comments? How about making the regex's anchored by default? People who want unanchored ones can add .* at the beginning and/or end. Since only whether or not the pattern matches is important (not the string it matched), this keeps all of the same power, but matches the old behavior in simple cases.
Tom Lane <tgl@sss.pgh.pa.us> writes: > The existing patch's behavior is that "the rightmost switch wins", ie, if an > object's name matches more than one pattern then it is included or excluded > according to the rightmost switch it matches. This is, erm, poorly > documented, but it seems like useful behavior so I don't have an objection > myself. I don't know, it sounds like it's the source of the confusion you identify later. My first thought is that the rule should be to apply all the inclusion switches (implicitly including everything if there are none), then apply all the exclusion switches. That leads to including non-schema objects only if there are no schema inclusion switches. Which seems pretty logical since if you're explicitly including objects then you'll only expect objects explicitly included to be dumped and you'll quickly realize there's no switch to bring in those non-schema objects. Maybe there should be a switch to include them just for completeness. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Gregory Stark <stark@enterprisedb.com> writes: > Tom Lane <tgl@sss.pgh.pa.us> writes: >> The existing patch's behavior is that "the rightmost switch wins", ie, if an >> object's name matches more than one pattern then it is included or excluded >> according to the rightmost switch it matches. > My first thought is that the rule should be to apply all the inclusion > switches (implicitly including everything if there are none), then apply all > the exclusion switches. I kinda like that, because it makes the behavior completely independent of switch ordering, which seems like a good property to preserve. Anyone else have an opinion pro or con? > That leads to including non-schema objects only if there are no schema > inclusion switches. Which seems pretty logical since if you're explicitly > including objects then you'll only expect objects explicitly included to be > dumped and you'll quickly realize there's no switch to bring in those > non-schema objects. Maybe there should be a switch to include them just for > completeness. Well, pg_dump already has a --blobs switch, which has been a no-op (because now the default) since 8.1, but it's still in the switch parser. It wouldn't take much to revive it for the purpose of causing blobs to be dumped even when there's an inclusion switch. As for PLs, I'm not really too worried about dumping them per se (since it's usually easy enough to create the ones you're using). The functionality we're really lacking there is the "--include-dependencies" switch that was discussed upthread ... which I think is a fine idea but should wait for 8.3. regards, tom lane
Tom, > I kinda like that, because it makes the behavior completely independent > of switch ordering, which seems like a good property to preserve. > Anyone else have an opinion pro or con? The only "con" argument I can think of is that "tar" and "rsync", whose syntax is familiar to a lot of sysadmins, apply switches left-to-right. However, I don't feel that that is a compelling argument. The include/exclude switch order processing is something I've always *hated* about tar and has messed me up more times than I can count. Also, Windows users could care less if we behave like tar. So +1 to go with orderless switching. -- Josh Berkus PostgreSQL @ Sun San Francisco
I wrote: > One argument that occurs to me for importing the psql code is that it's > solved the problem of including a schema name in the pattern. It would > be a lot nicer to say "-t schema.table" than to have to say "-t table -n > schema". The more I think about this, the more I think the above is a killer argument. We really should have had the ability to say "-t schema.table" ever since schemas were added in 7.3, but no one got around to making it happen. If we go over to interpreting the arguments as standard regexes then we'll never be able to do that, because we'll have foreclosed the meaning of dot. The psql pattern code was specifically designed as a compromise notation adapted to SQL needs, and IMHO it's served pretty well --- so I think we should adopt that into pg_dump rather than pure regex notation. > The psql code does allow you to get at most of the functionality of > regexes... Actually, it lets you get at all of it, though perhaps a bit awkwardly. The transformations it makes are . => schema vs name separator* => .*? => . So the only regex patterns you can't write directly are dot, R* and R? for which you can use these locutions: . => ?R* => (R+|)R? => (R|) (Perhaps this should be documented somewhere...) So I propose that we should revise the patch to use psql's \d code to determine which objects match a pattern. I think that together with Greg's idea of processing all inclusions before all exclusions should answer the concerns I've got about the patch. regards, tom lane
On Fri, Oct 06, 2006 at 10:28:21PM -0400, Gregory Stark wrote: > Tom Lane <tgl@sss.pgh.pa.us> writes: > > > The existing patch's behavior is that "the rightmost switch wins", > > ie, if an object's name matches more than one pattern then it is > > included or excluded according to the rightmost switch it matches. > > This is, erm, poorly documented, but it seems like useful behavior > > so I don't have an objection myself. > > I don't know, it sounds like it's the source of the confusion you > identify later. > > My first thought is that the rule should be to apply all the > inclusion switches (implicitly including everything if there are > none), then apply all the exclusion switches. +1 :) Order-dependent switches are a giant foot gun. Cheers, D -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote!
> We are two months past feature freeze ... adding entirely new features > to pg_dump is *not* on the table for 8.2. Ok, clear. > > The scenario I most care about is to be able to make a complete data > > base dump (including non-schema objects) while excluding only a few > > tables. > > Isn't this the same as Kris' complaint? Why do you need additional > dependency analysis to do the above? Well, I obviously didn't understand well the complete feature as it is implemented. Now, is what I want (see above) possible with the new feature, or if I exclude some tables I implicitly loose some other things too from the dump which normally would be there ? This is my only concern... Cheers, Csaba.
Csaba Nagy <nagy@ecircle-ag.com> writes: >> Isn't this the same as Kris' complaint? Why do you need additional >> dependency analysis to do the above? > Well, I obviously didn't understand well the complete feature as it is > implemented. Now, is what I want (see above) possible with the new > feature, or if I exclude some tables I implicitly loose some other > things too from the dump which normally would be there ? This is my only > concern... I think we've agreed that if you use some exclusion switches, but not any inclusion switches, then only the specific objects matching your switches are excluded. CVS HEAD gets this wrong, but I'm going to work on it today. regards, tom lane
On Mon, 2006-10-09 at 16:24, Tom Lane wrote: > I think we've agreed that if you use some exclusion switches, but not > any inclusion switches, then only the specific objects matching your > switches are excluded. CVS HEAD gets this wrong, but I'm going to work > on it today. Cool, that makes it cover my use case and some more. Thanks, Csaba.
On Sat, Oct 07, 2006 at 05:29:03PM -0400, Tom Lane wrote: > So the only regex patterns you can't write directly are dot, R* and R? > for which you can use these locutions: > > . => ? > R* => (R+|) > R? => (R|) > > (Perhaps this should be documented somewhere...) Yes, please. -- Jim Nasby jim@nasby.net EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
On Sat, Oct 07, 2006 at 06:22:19PM -0700, David Fetter wrote: > On Fri, Oct 06, 2006 at 10:28:21PM -0400, Gregory Stark wrote: > > Tom Lane <tgl@sss.pgh.pa.us> writes: > > > > > The existing patch's behavior is that "the rightmost switch wins", > > > ie, if an object's name matches more than one pattern then it is > > > included or excluded according to the rightmost switch it matches. > > > This is, erm, poorly documented, but it seems like useful behavior > > > so I don't have an objection myself. > > > > I don't know, it sounds like it's the source of the confusion you > > identify later. > > > > My first thought is that the rule should be to apply all the > > inclusion switches (implicitly including everything if there are > > none), then apply all the exclusion switches. > > +1 :) > > Order-dependent switches are a giant foot gun. They're also very powerful, as anyone who's ever used them in a non-trivial rsync (or rdiff-backup) scenareo can tell you. What if you want to exclude all of a schema except for a few objects (granted, right now we're limited to just tables...)? Here's a real example, from my rdiff-backup exclude files list: + /opt/local/var/db/dports + /opt/local/var/log + /opt/local/etc /opt Note that rdiffbackup applies the first option that matches a file/directory, not the last. So this says to exclude all of /opt, except for /opt/local/var/db/dports, etc. If this was done as 'last operator takes priority', you'd just reverse the order of the list: pg_dump --exclude-schema /opt --include-table opt.local_var_db_dports --include-table opt.local_var_log --include-table opt.local_etc If we processed all includes first, you lose this ability. For simple cases, it might not matter, but think about wanting to back up an entire database, except for schema opt, but you do want a few objects from within opt. Without following some kind of ordering on includes and excludes, that's next to impossible. -- Jim Nasby jim@nasby.net EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
On Mon, Oct 09, 2006 at 12:07:29PM -0500, Jim C. Nasby wrote: > On Sat, Oct 07, 2006 at 06:22:19PM -0700, David Fetter wrote: > > On Fri, Oct 06, 2006 at 10:28:21PM -0400, Gregory Stark wrote: > > > Tom Lane <tgl@sss.pgh.pa.us> writes: > > > > > > > The existing patch's behavior is that "the rightmost switch wins", > > > > ie, if an object's name matches more than one pattern then it is > > > > included or excluded according to the rightmost switch it matches. > > > > This is, erm, poorly documented, but it seems like useful behavior > > > > so I don't have an objection myself. > > > > > > I don't know, it sounds like it's the source of the confusion you > > > identify later. > > > > > > My first thought is that the rule should be to apply all the > > > inclusion switches (implicitly including everything if there are > > > none), then apply all the exclusion switches. > > > > +1 :) > > > > Order-dependent switches are a giant foot gun. > > They're also very powerful, as anyone who's ever used them in a > non-trivial rsync (or rdiff-backup) scenareo can tell you. What if > you want to exclude all of a schema except for a few objects > (granted, right now we're limited to just tables...)? You make an important distinction here, and thanks for doing that. :) IMHO, order-dependent switches are appropriate for a configuration file and inappropriate for the command line. The pg_hba.conf file is a great example of a place where order dependence is a good idea. Cheers, D -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote!
"Jim C. Nasby" <jim@nasby.net> writes: > On Sat, Oct 07, 2006 at 06:22:19PM -0700, David Fetter wrote: >> On Fri, Oct 06, 2006 at 10:28:21PM -0400, Gregory Stark wrote: >>> My first thought is that the rule should be to apply all the >>> inclusion switches (implicitly including everything if there are >>> none), then apply all the exclusion switches. >> >> +1 :) >> Order-dependent switches are a giant foot gun. > They're also very powerful, as anyone who's ever used them in a > non-trivial rsync (or rdiff-backup) scenareo can tell you. Sure, but the question is whether that incremental gain in capability is worth the extra logical complexity. I'm inclined to think that many more users would get burned by the complexity than would have use for it. Considering that we've gotten along this long with only the most primitive selection capabilities in pg_dump, it doesn't seem like there's an enormous demand for highly refined capabilities. (And I agree with David's comment that it might be better to reserve such behavior for a configuration file than to put it on the command line.) regards, tom lane
On Mon, Oct 09, 2006 at 01:59:18PM -0400, Tom Lane wrote: > "Jim C. Nasby" <jim@nasby.net> writes: > > On Sat, Oct 07, 2006 at 06:22:19PM -0700, David Fetter wrote: > >> On Fri, Oct 06, 2006 at 10:28:21PM -0400, Gregory Stark wrote: > >>> My first thought is that the rule should be to apply all the > >>> inclusion switches (implicitly including everything if there are > >>> none), then apply all the exclusion switches. > >> > >> +1 :) > >> Order-dependent switches are a giant foot gun. > > > They're also very powerful, as anyone who's ever used them in a > > non-trivial rsync (or rdiff-backup) scenareo can tell you. > > Sure, but the question is whether that incremental gain in capability > is worth the extra logical complexity. I'm inclined to think that many > more users would get burned by the complexity than would have use for it. > Considering that we've gotten along this long with only the most > primitive selection capabilities in pg_dump, it doesn't seem like > there's an enormous demand for highly refined capabilities. > > (And I agree with David's comment that it might be better to reserve > such behavior for a configuration file than to put it on the command > line.) I can certainly see the logic in putting the more advanced capability in a config file of some kind (though, I think a simple include/exclude file is best for this...) The question becomes: do we want incompatible behavior between the config file and the command line? And which over-rides what? -- Jim Nasby jim@nasby.net EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
On Mon, Oct 09, 2006 at 02:34:09PM -0500, Jim C. Nasby wrote: > On Mon, Oct 09, 2006 at 01:59:18PM -0400, Tom Lane wrote: > > "Jim C. Nasby" <jim@nasby.net> writes: > > > On Sat, Oct 07, 2006 at 06:22:19PM -0700, David Fetter wrote: > > >> On Fri, Oct 06, 2006 at 10:28:21PM -0400, Gregory Stark wrote: > > >>> My first thought is that the rule should be to apply all the > > >>> inclusion switches (implicitly including everything if there > > >>> are none), then apply all the exclusion switches. > > >> > > >> +1 :) Order-dependent switches are a giant foot gun. > > > > > They're also very powerful, as anyone who's ever used them in a > > > non-trivial rsync (or rdiff-backup) scenareo can tell you. > > > > Sure, but the question is whether that incremental gain in > > capability is worth the extra logical complexity. I'm inclined to > > think that many more users would get burned by the complexity than > > would have use for it. Considering that we've gotten along this > > long with only the most primitive selection capabilities in > > pg_dump, it doesn't seem like there's an enormous demand for > > highly refined capabilities. > > > > (And I agree with David's comment that it might be better to > > reserve such behavior for a configuration file than to put it on > > the command line.) > > I can certainly see the logic in putting the more advanced > capability in a config file of some kind (though, I think a simple > include/exclude file is best for this...) > > The question becomes: do we want incompatible behavior between the > config file and the command line? And which over-rides what? The way I've cut this Gordian knot in the past is simply to make command-line and file-based options for a given thing (e.g. exclusion/inclusion) mutually exclusive and throw an error if somebody attempts to mix them. Cheers, D -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote!
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 > Sure, but the question is whether that incremental gain in capability > is worth the extra logical complexity. I'm inclined to think that many > more users would get burned by the complexity than would have use for it. > Considering that we've gotten along this long with only the most > primitive selection capabilities in pg_dump, it doesn't seem like > there's an enormous demand for highly refined capabilities. I disagree - we lose a lot of flexibility by taking out the ordering, and, as was pointed out to me when I first started this patch a while ago, we might as well front-load all the complexity and changes now rather than adding them in release by release. I'm also not sure why the regex should be changed to something even more non-standard than the current POSIX ones. Finally, I'm surprised at the apparent willingness at this point to shatter backwards-compatibility with previous -t scripts, as this was an option I raised early on but met strong resistance, thus the current compromise of allowing existing scripts to run unaltered, while adding in the ability to do some regular expressions. The regex stuff was discussed in January, and the patch submitted in July, so it seems a little rushed to be changing the underlying behavior so quickly right now (that behavior being the ability to control which tables and schemas to dump). I think the original post about the request to exclude a single table and still dump other objects is a fair one, but I think we've morphed far beyond solving that problem. - -- Greg Sabino Mullane greg@turnstep.com PGP Key: 0x14964AC8 200610092003 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -----BEGIN PGP SIGNATURE----- iD8DBQFFKuPFvJuQZxSWSsgRAjAxAJ9oY5HCM4KxmpLEU56eCMJauHBhFgCfcyDt R5yf5SKKBeBHJ2gdRlE1Pqs= =rIxZ -----END PGP SIGNATURE-----
"Greg Sabino Mullane" <greg@turnstep.com> writes: >> Sure, but the question is whether that incremental gain in capability >> is worth the extra logical complexity. I'm inclined to think that many >> more users would get burned by the complexity than would have use for >> it. > I disagree - we lose a lot of flexibility by taking out the ordering, We lose some flexibility, but it's not clear to me that it's so essential as all that. Even the restricted patch is tremendously more flexible than pg_dump has ever been, and I just don't see the argument that there's a market demand for doing more at the cost of clarity. > I'm also not sure why the regex > should be changed to something even more non-standard than the current > POSIX ones. Finally, I'm surprised at the apparent willingness at this > point to shatter backwards-compatibility with previous -t scripts, as > this was an option I raised early on but met strong resistance, thus > the current compromise of allowing existing scripts to run unaltered, > while adding in the ability to do some regular expressions. That's a fair point, but the way that the patch was preserving exact backward compatibility was by making it a discontinuous corner case, which is a decision I think we'd regret in the long run. Andrew was already suggesting upthread that we drop the anchoring (and lose compatibility to a much greater extent than what this does) in order to make the behavior more self-consistent. Also, insisting on straight regexps amounts to failing to learn from experience: before 7.3 the psql \d commands used patterns that *were* straight regexps, and that just did not work all that conveniently. > The regex stuff was discussed in January, and the patch submitted in > July, so it seems a little rushed to be changing the underlying behavior > so quickly right now Well, the problem is that once we ship 8.2 we'll be stuck with whatever behavior we've defined --- it's unlikely that it'd be worth the pain of another round of incompatibility in order to make small adjustments. So we'd better get it right the first time. I do apologize for not having reviewed this patch more closely earlier, but I've been a tad busy... regards, tom lane