Re: proposal: possibility to read dumped table's name from file - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: proposal: possibility to read dumped table's name from file
Date
Msg-id CAFj8pRBuEOCTGR8VwhQi3_dNg6v=g5zBB4kmACRsnAFiWJKWVA@mail.gmail.com
In response to Re: proposal: possibility to read dumped table's name from file  (Justin Pryzby <pryzby@telsasoft.com>)
Responses Re: proposal: possibility to read dumped table's name from file
List pgsql-hackers


On Tue, Nov 17, 2020 at 22:53, Justin Pryzby <pryzby@telsasoft.com> wrote:
On Wed, Nov 11, 2020 at 06:49:43AM +0100, Pavel Stehule wrote:
> Perhaps this feature could co-exist with a full blown configuration for
> >> pg_dump, but even then there's certainly issues with what's proposed-
> >> how would you handle explicitly asking for a table which is named
> >> "  mytable" to be included or excluded?  Or a table which has a newline
> >> in it?  Using a standardized format which supports the full range of
> >> what we do in a table name, explicitly and clearly, would address these
> >> issues and also give us the flexibility to extend the options which
> >> could be used through the configuration file beyond just the filters in
> >> the future.

I think it's a reasonable question - why would a new configuration file option
include support for only a handful of existing arguments but not the rest.

I don't see a strong technical problem; enhancing the parser is not hard work. But I am missing a use case for this. The "--filter" option tries to solve the problem of limited command line size. This is a clean use case, and the supported options are exactly the options that can be repeated on the command line. Nothing less, nothing more. The format is designed just for this purpose.

If we were to implement a configuration mechanism as an alternative to the command line and environment variables, then the use case should be defined first. Once the use case is defined, we can talk about the implementation and about a good format. There are a lot of interesting formats, but I am missing a reason why such an alternative configuration would be helpful for pg_dump. Using external libraries for richer formats means a new dependency, the need to solve portability issues, and maybe other issues, and for this there should be a good use case. Passing a list of tables for dumping doesn't need a rich format.

I cannot imagine using a config file with generated object names and some other options together. Maybe if the configuration is not too long (and is then written by hand), a configuration file could be usable. But when I think about using pg_dump from bash scripts, it is much more practical to use the usual command line options and pass the list of objects through a pipe. I really miss the use case for a special pg_dump config file, and if there is one, it is very different from the use case for the "--filter" option.
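
To illustrate the pipe-based usage, here is a minimal sketch of a script helper that turns a list of table names into include lines in the format shown in this thread (`+t "bad Name"`). The function name `filter_lines` is hypothetical, and quoting every name unconditionally is an assumption of the sketch, not something the patch requires.

```shell
#!/bin/sh
# Hypothetical helper: read table names (one per line) on stdin and
# write one "+t" include line per name.
# Assumption: every name is double-quoted, and an embedded double
# quote is doubled, following SQL identifier quoting rules.
filter_lines() {
    while IFS= read -r name; do
        escaped=$(printf '%s' "$name" | sed 's/"/""/g')
        printf '+t "%s"\n' "$escaped"
    done
}

# Demonstration only: print the generated lines.
printf '%s\n' 'bad Name' 'orders' | filter_lines
```

In a script the output would be piped on, for example `... | filter_lines | pg_dump --filter=/dev/stdin`, assuming a pg_dump built with the patch under discussion. Because the helper reads its input line by line, it cannot pass through names that contain newlines; generating the lines with quote_ident() in SQL avoids that limit.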


> > This is a correct argument. I will check the possibility of using strange
> > names, but the possibility and functionality are the same as what we allow
> > on the command line. So you can use double-quoted names. I'll check it.
>
> I checked
> echo "+t \"bad Name\"" | /usr/local/pgsql/master/bin/pg_dump --filter=/dev/stdin
> It is working without any problem

I think it couldn't possibly work with newlines, since you call pg_get_line().
I realize that entering a newline into the shell would also be a PITA, but that
could be one *more* reason to support a config file - to allow terrible table
names to be in a file and avoid writing dash tee quote something enter else
quote in a pg_dump command, or shell script.

The new patch works with names that contain newlines:

[pavel@localhost postgresql.master]$ psql -At -X -c "select '+t ' || quote_ident(table_name) from information_schema.tables where table_name like 'foo%'"|  /usr/local/pgsql/master/bin/pg_dump --filter=/dev/stdin
--
-- PostgreSQL database dump
--

-- Dumped from database version 14devel
-- Dumped by pg_dump version 14devel

--
-- Name: foo boo; Type: TABLE; Schema: public; Owner: pavel
--

CREATE TABLE public."foo
boo" (
    a integer
);


ALTER TABLE public."foo
boo" OWNER TO pavel;

--
-- Data for Name: foo boo; Type: TABLE DATA; Schema: public; Owner: pavel
--

COPY public."foo
boo" (a) FROM stdin;
\.


--
-- PostgreSQL database dump complete
--


I fooled with argument parsing to handle reading from a file in the quickest
way.  As written, this fails to handle multiple config files, and special table
names, which need to support arbitrary, logical lines, with quotes surrounding
newlines or other special chars.  As written, the --config file is parsed
*after* all other arguments, so it could override previous args (like
--no-blobs --no-blogs, --file, --format, --compress, --lock-wait), which I
guess is bad, so the config file should be processed *during* argument parsing.
Unfortunately, I think that suggests duplicating parsing of all/most the
argument parsing for config file support - I'd be happy if someone suggested a
better way.
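
As an illustration of what "logical lines" could mean here, the following is a rough sketch that joins physical lines until the double quotes balance, so that a quoted name may span a newline. It is not taken from the patch; the function name `logical_lines` and the `<NL>` marker used to display the embedded newline are assumptions of the sketch.

```shell
#!/bin/sh
# Hypothetical sketch: assemble logical lines from physical lines.
# A logical line is complete once it holds an even number of double
# quotes; a doubled quote ("") inside a name keeps the parity intact.
logical_lines() {
    awk '
    {
        buf = pending ? (buf "\n" $0) : $0
        tmp = buf
        n = gsub(/"/, "", tmp)          # gsub returns the quote count
        if (n % 2 == 0) {
            out = buf
            gsub(/\n/, "<NL>", out)     # show embedded newlines visibly
            print out
            pending = 0
        } else {
            pending = 1                 # quote still open, keep reading
        }
    }
    END {
        if (pending) {
            print "unterminated quoted name" > "/dev/stderr"
            exit 1
        }
    }'
}

# Demonstration: the first two physical lines form one logical line.
printf '%s\n' '+t "foo' 'boo"' '+t bar' | logical_lines
```

Counting quotes is the simplest rule that handles names with newlines; a real parser would also need to decide how to treat comments and whitespace outside quotes.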

BTW, in your most recent patch:
s/empty rows/empty lines/
unbalanced parens: "invalid option type (use [+-]"

This should be fixed now, thank you for checking.

Regards

Pavel



@cfbot: I renamed the patch so please ignore it.

--
Justin