Re: proposal: possibility to read dumped table's name from file - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: proposal: possibility to read dumped table's name from file
Date
Msg-id CAFj8pRBCO8TX0U7M-oouUUHp1FtrRmmcdR6+WxB-G0be=+jVeg@mail.gmail.com
In response to Re: proposal: possibility to read dumped table's name from file  (Pavel Stehule <pavel.stehule@gmail.com>)
List pgsql-hackers


On Wed, 11 Nov 2020 at 6:32, Pavel Stehule <pavel.stehule@gmail.com> wrote:
Hi

On Tue, 10 Nov 2020 at 21:09, Stephen Frost <sfrost@snowman.net> wrote:
Greetings,

* Pavel Stehule (pavel.stehule@gmail.com) wrote:
> rebase + minor change - using pg_get_line_buf instead of pg_get_line_append

I started looking at this and went back through the thread and while I
tend to agree that JSON may not be a good choice for this, it's not the
only possible alternative.  There is no doubt that pg_dump is already a
sophisticated data export tool, and likely to continue to gain new
features, such that having a configuration file for it would be very
handy, but this clearly isn't really going in a direction that would
allow for that.

Perhaps this feature could co-exist with a full blown configuration for
pg_dump, but even then there's certainly issues with what's proposed-
how would you handle explicitly asking for a table which is named
"  mytable" to be included or excluded?  Or a table which has a newline
in it?  Using a standardized format which supports the full range of
what we do in a table name, explicitly and clearly, would address these
issues and also give us the flexibility to extend the options which
could be used through the configuration file beyond just the filters in
the future.

That is a fair argument. I will check whether unusual names can be used, but the filter file offers the same possibilities and functionality as the command line, so you can use double-quoted names. I'll check it.


Unlike for the pg_basebackup manifest, which we generate and read
entirely programmatically, a config file for pg_dump would almost
certainly be updated manually (or, at least, parts of it would be and
perhaps other parts generated), which means it'd really be ideal to have
a proper way to support comments in it (something that the proposed
format also doesn't really get right- # must be the *first* character,
and you can only have whole-line comments..?), avoid extra unneeded
punctuation (or, at times, allow it- such as trailing commas in lists),
cleanly handle multi-line strings (consider the oft discussed idea
around having pg_dump support a WHERE clause for exporting data from
tables...), etc.

I think the proposed feature is very far from being a config file for pg_dump (it implements an option "--filter"). That is not the target, and it is not designed for it. It is just an alternative to options like -t, -T, ... and I am sure that nobody will create this file by hand. The main goal of this patch is to eliminate problems with the maximum length of the command line. So it really is not designed to be a config file for pg_dump.
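
To make the command-line length problem concrete, here is a rough sketch (not part of the patch) that estimates how many bytes a long list of -t options would occupy and compares it with the system limit. The helper names build_args and argv_bytes and the generated table names are made up for illustration, and the estimate ignores the environment and per-argument pointer overhead, which many systems also count against the limit.

```python
import os


def build_args(table_names):
    """Build a pg_dump argument list with one -t option per table."""
    args = ["pg_dump"]
    for name in table_names:
        args += ["-t", name]
    return args


def argv_bytes(args):
    """Approximate size of the argument strings: each one plus a NUL byte."""
    return sum(len(a.encode("utf-8")) + 1 for a in args)


if __name__ == "__main__":
    # Hypothetical table names, only to get a realistic order of magnitude.
    names = [f"public.table_{i:06d}" for i in range(100_000)]
    needed = argv_bytes(build_args(names))
    try:
        limit = os.sysconf("SC_ARG_MAX")  # not available on every platform
    except (AttributeError, ValueError, OSError):
        limit = None
    print("argument bytes needed:", needed)
    print("SC_ARG_MAX:", limit)
```

When the needed size exceeds the limit, the command cannot be started at all, which is the situation a filter read from a file or from stdin avoids.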
 

Overall, -1 from me on this approach.  Maybe it could be fixed up to
handle all the different names of objects that we support today
(something which, imv, is really a clear requirement for this feature to
be committed), but I suspect you'd end up half-way to yet another
configuration format when we could be working to support something like
TOML or maybe YAML... but if you want my 2c, TOML seems closer to what
we do for postgresql.conf and getting that over to something that's
standardized, while a crazy long shot, is a general nice idea, imv.

I have nothing against TOML, but I don't see the point of using it in this patch. This patch doesn't implement a config file for pg_dump, and I don't see any sense or benefit in it. TOML is designed for a different purpose. TOML is good for files written by hand, but that is not the case here. Typical usage of this patch is something like the following, and TOML (or JSON) syntax is not a good fit for it:

psql -c "select '+t' || quote_ident(relname) from pg_class where relname ..." | pg_dump --filter=/dev/stdin
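
For readers who generate the list outside the database, the same lines can be sketched on the client side. This is only an illustration: quote_ident_always and filter_lines are hypothetical helpers, the "+t" prefix is copied from the one-liner above, and unlike the server's quote_ident() the sketch quotes every name rather than only those that need it. Whether the filter parser accepts a quoted name containing a newline is not settled in this thread, so the example stays with leading spaces and embedded double quotes.

```python
def quote_ident_always(name: str) -> str:
    """Double-quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def filter_lines(table_names, prefix="+t"):
    """Return one filter line per table: the prefix followed by the quoted name.

    The "+t" prefix is taken from the example in the mail; its exact
    semantics in the patch are an assumption here.
    """
    return [prefix + quote_ident_always(name) for name in table_names]


if __name__ == "__main__":
    # A plain name, a name with leading spaces, and a name with a double quote.
    for line in filter_lines(["orders", "  mytable", 'odd"name']):
        print(line)
```

Because the leading spaces end up inside the double quotes, a name such as "  mytable" stays distinguishable from "mytable" in the generated lines.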

I can imagine some benefits of saved configuration files for postgres applications, but that should be designed and implemented generally. You would probably use one for pg_dump, psql, pg_restore, .... But it is a different feature with a different usage. This patch doesn't implement an option "--config"; it implements an option "--filter".

A generic configuration for postgres binary applications is an interesting idea, and the TOML language could work well for that purpose. We can already parametrize applications through the command line and environment variables. But filtering objects is a really different case. Although there is a small intersection, it will be used very differently, and I don't think one language can be practical for both cases. Object filtering is an independent feature, and both features can coexist.

Regards

Pavel


Regards

Pavel



Thanks,

Stephen
