Re: proposal: possibility to read dumped table's name from file - Mailing list pgsql-hackers

From Daniel Gustafsson
Subject Re: proposal: possibility to read dumped table's name from file
Date
Msg-id 5D324214-98D9-4AC5-BE08-56B8346BB8A8@yesql.se
Whole thread Raw
In response to Re: proposal: possibility to read dumped table's name from file  (Pavel Stehule <pavel.stehule@gmail.com>)
Responses Re: proposal: possibility to read dumped table's name from file
Re: proposal: possibility to read dumped table's name from file
List pgsql-hackers
> On 15 Sep 2021, at 19:31, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> po 13. 9. 2021 v 15:01 odesílatel Daniel Gustafsson <daniel@yesql.se <mailto:daniel@yesql.se>> napsal:

> One issue with this syntax is that the include keyword can be quite misleading
> as it's semantic interpretion of "include table t" can be different from
> "--table=t".  The former is less clear about the fact that it means "exclude
> all other tables than " then the latter.  It can be solved with documentation,
> but I think that needs be to be made clearer.
>
> I invite any documentation enhancing and fixing

Sure, that can be collabored on.  This gist is though that IMO the keywords in
the filter file aren't as clear on the sideeffects as the command line params,
even though they are equal in functionality.

> +       pg_log_error("invalid format of filter file \"%s\": %s",
> +                                filename,
> +                                message);
> +
> +       fprintf(stderr, "%d: %s\n", lineno, line);
> Can't we just include the lineno in the error logging and skip dumping the
> offending line?  Fast-forwarding the pointer to print the offending part is
> less useful than a context marker, and in some cases suboptimal.  With this
> coding, if a pattern is omitted for example the below error message is given:
>
>     pg_dump: error: invalid format of filter file "filter.txt": missing object name
>     1:
>
> The errormessage and the linenumber in the file should be enough for the user
> to figure out what to fix.
>
> I did it like you proposed, but still, I think the content can be useful.

Not when there is no content in the error message, printing an empty string for
a line number which isn't a blank line doesn't seem terribly helpful.  If we
know the error context is empty, printing a tailored error hint seems more
useful for the user.

> More times you read dynamically generated files, or you read data from stdin, and in complex environments it can be
hardregenerate new content for debugging. 

That seems odd given that the arguments for this format has been that it's
likely to be handwritten.

> If I create a table called "a\nb" and try to dump it I get an error in parsing
> the file.  Isn't this supposed to work?
>     $ cat filter.txt
>     include table "a
>     b"
>     $ ./bin/pg_dump --filter=filter.txt
>     pg_dump: error: invalid format of filter file "filter.txt": unexpected chars after object name
>     2:
>
> probably there was some issue, because it should work. I tested a new version and this is tested in new regress
tests.Please, check 

That seems to work, but I am unable to write a filter statement which can
handle this relname:

CREATE TABLE "a""
""b" (a integer);

Are you able to craft one for that?

> Did you consider implementing this in Bison to abstract some of the messier
> parsing logic?
>
> Initially not, but now, when I am thinking about it, I don't think so Bison helps. The syntax of the filter file is
nicelylinear. Now, the code of the parser is a little bit larger than minimalistic, but it is due to nicer error's
messages.The raw implementation in Bison raised just "syntax error" and positions. I did code refactoring, and now the
scanning,parsing and processing are divided into separated routines. Parsing related code has 90 lines. In this case, I
don'tthink using a parser grammar file can carry any benefit. grammar is more readable, sure, but we need to include
bison,we need to handle errors, and if we want to raise more helpful errors than just "syntax error", then the code
willbe longer.  

I'm not so concerned by code size, but rather parsing of quotations etc and
being able to reason about it's correctness.  IMHO that's easier done by
reading a defined grammar than parsing a handwritten parser.

Will do a closer review on the patch shortly.

--
Daniel Gustafsson        https://vmware.com/




pgsql-hackers by date:

Previous
From: Daniel Gustafsson
Date:
Subject: Re: proposal: possibility to read dumped table's name from file
Next
From: Pavel Stehule
Date:
Subject: Re: proposal: possibility to read dumped table's name from file