Re: proposal: possibility to read dumped table's name from file - Mailing list pgsql-hackers
From | Daniel Gustafsson |
---|---|
Subject | Re: proposal: possibility to read dumped table's name from file |
Date | |
Msg-id | B854A710-E1DE-4C6B-A411-E246A95DCDF9@yesql.se Whole thread Raw |
In response to | Re: proposal: possibility to read dumped table's name from file (Pavel Stehule <pavel.stehule@gmail.com>) |
Responses |
Re: proposal: possibility to read dumped table's name from file
|
List | pgsql-hackers |
> On 28 Jul 2021, at 09:28, Pavel Stehule <pavel.stehule@gmail.com> wrote: > út 13. 7. 2021 v 1:16 odesílatel Tom Lane <tgl@sss.pgh.pa.us <mailto:tgl@sss.pgh.pa.us>> napsal: > Hence I suggest > > include table PATTERN > exclude table PATTERN > > which ends up being the above but with words not [+-]. One issue with this syntax is that the include keyword can be quite misleading as it's semantic interpretion of "include table t" can be different from "--table=t". The former is less clear about the fact that it means "exclude all other tables than " then the latter. It can be solved with documentation, but I think that needs be to be made clearer. > Here is an updated implementation of filter's file, that implements syntax proposed by you. While it's not the format I would prefer, it does allow for most (all?) use cases expressed in this thread with ample armtwisting applied so let's go ahead from this point and see if we can agree on it (or a version of it). A few notes on the patch after a first pass over it: +(include|exclude)[table|schema|foreign_data|data] <replaceable class="parameter">objectname</replaceable> Lacks whitespace between keyword and object type. Also, since these are mandatory parameters, shouldn't they be within '{..}' ? + /* skip initial white spaces */ + while (isblank(*ptr)) + ptr += 1; We don't trust isblank() as of 3fd5faed5 due to portability concerns, this should probably use a version of the pg_isblank() we already have (and possibly move that to src/common/string.c as there now are more consumers). +static bool +isblank_line(const char *line) This could be replaced with a single call to strspn() as we already do for parsing the TOC file. + /* when first char is hash, ignore whole line */ + if (*str == '#') + continue; I think we should strip leading whitespace before this to allow commentlines to start with whitespace, it's easy enough and will make life easier for users. + pg_log_error("invalid format of filter file \"%s\": %s", + filename, + message); + + fprintf(stderr, "%d: %s\n", lineno, line); Can't we just include the lineno in the error logging and skip dumping the offending line? Fast-forwarding the pointer to print the offending part is less useful than a context marker, and in some cases suboptimal. With this coding, if a pattern is omitted for example the below error message is given: pg_dump: error: invalid format of filter file "filter.txt": missing object name 1: The errormessage and the linenumber in the file should be enough for the user to figure out what to fix. + if (keyword && is_keyword(keyword, size, "table")) + objecttype = 't'; Should this use an enum, or at least a struct translation the literal keyword to the internal representation? Magic constants without explicit connection to their token counterparts can easily be cause of bugs. If I create a table called "a\nb" and try to dump it I get an error in parsing the file. Isn't this supposed to work? $ cat filter.txt include table "a b" $ ./bin/pg_dump --filter=filter.txt pg_dump: error: invalid format of filter file "filter.txt": unexpected chars after object name 2: Did you consider implementing this in Bison to abstract some of the messier parsing logic? -- Daniel Gustafsson https://vmware.com/
pgsql-hackers by date: