Re: proposal: possibility to read dumped table's name from file - Mailing list pgsql-hackers
From | Julien Rouhaud |
---|---|
Subject | Re: proposal: possibility to read dumped table's name from file |
Date | |
Msg-id | 20221111081125.bxtndzqlt5tpc5qw@jrouhaud Whole thread Raw |
In response to | Re: proposal: possibility to read dumped table's name from file (Pavel Stehule <pavel.stehule@gmail.com>) |
Responses |
Re: proposal: possibility to read dumped table's name from file
|
List | pgsql-hackers |
Hi, On Sat, Nov 05, 2022 at 08:54:57PM +0100, Pavel Stehule wrote: > > pá 4. 11. 2022 v 14:28 odesílatel Julien Rouhaud <rjuju123@gmail.com> > napsal: > > > > But one thing I noticed is that "optarg" looks wrong here: > > > > > > simple_string_list_append(&opts->triggerNames, optarg); > > > > Ah indeed, good catch! Maybe there should be an explicit test for every > > (include|exclude) / objtype combination? It would be a bit verbose (and > > possibly hard to maintain). > > > > yes - pg_restore is not well covered by tests, fixed > > I found another issue. The pg_restore requires a full signature of the > function and it is pretty sensitive on white spaces (pg_restore). Argh, indeed. It's a good thing to have expanded the regression tests :) > I made a > mistake when I partially parsed patterns like SQL identifiers. It can work > for simple cases, but when I parse the function's signature it stops > working. So I rewrote the parsing pattern part. Now, I just read an input > string and I try to reduce spaces. Still multiline identifiers are > supported. Against the previous method of pattern parsing, I needed to > change just one regress test - now I am not able to detect garbage after > pattern :-/. I'm not sure it's really problematic. It looks POLA-violation compatible with regular pg_dump options, for instance: $ echo "include table t1()" | pg_dump --filter - | ag CREATE CREATE TABLE public.t1 ( $ pg_dump -t "t1()" | ag CREATE CREATE TABLE public.t1 ( $ echo "include table t1()blabla" | pg_dump --filter - | ag CREATE pg_dump: error: no matching tables were found $ pg_dump -t "t1()blabla" | ag CREATE pg_dump: error: no matching tables were found I don't think the file parsing code should try to be smart about checking the found patterns. > * function's filtering doesn't support schema - when the name of function > is specified with schema, then the function is not found Ah I didn't know that. Indeed it only expect a non-qualified identifier, and would restore any function that matches the name (and arguments), possibly multiple ones if there are variants in different schema. That's unrelated to this patch though. > * the function has to be specified with an argument type list - the > separator has to be exactly ", " string. Without space or with one space > more, the filtering doesn't work (new implementation of pattern parsing > reduces white spaces sensitivity). This is not a bug, but it is not well > documented. Agreed. > attached updated patch It looks overall good to me! I just have a few minor nitpicking complaints: - you removed the pg_strip_clrf() calls and declared everything as "const char *", so there's no need to explicitly cast the filter_get_keyword() arguments anymore Note also that the code now relies on the fact that there are some non-zero bytes after a pattern to know that no errors happened. It's not a problem as you should find an EOF marker anyway if CLRF were stripped. + * Following routines are called from pg_dump, pg_dumpall and pg_restore. + * Unfortunately, implementation of exit_nicely in pg_dump and pg_restore + * is different from implementation of this routine in pg_dumpall. So instead + * of directly calling exit_nicely we have to return some error flag (in this + * case NULL), and exit_nicelly will be executed from caller's routine. Slight improvement: [...] Unfortunately, the implementation of exit_nicely in pg_dump and pg_restore is different from the one in pg_dumpall, so instead of... + * read_pattern - reads an pattern from input. The pattern can be mix of + * single line or multi line subpatterns. Single line subpattern starts first + * non white space char, and ending last non space char on line or by char + * '#'. The white spaces inside are removed (around char ".()"), or reformated + * around char ',' or reduced (the multiple spaces are replaced by one). + * Multiline subpattern starts by double quote and ending by this char too. + * The escape rules are same like for SQL quoted literal. + * + * Routine signalizes error by returning NULL. Otherwise returns pointer + * to next char after last processed char in input string. typo: reads "a" pattern from input... I don't fully understand the part about subpatterns, but is that necessary to describe it? Simply saying that any valid and possibly-quoted identifier can be parsed should make it clear that identifiers containing \n characters should work too. Maybe also just mention that whitespaces are removed and special care is taken to output routines in exactly the same way calling code will expect it (that is comma-and-single-space type delimiter).
pgsql-hackers by date: