Re: proposal: possibility to read dumped table's name from file - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: proposal: possibility to read dumped table's name from file
Msg-id e33b7c84-40f2-8a3b-3197-2323b66dca83@enterprisedb.com
In response to Re: proposal: possibility to read dumped table's name from file  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers
On 7/14/21 2:18 AM, Stephen Frost wrote:
> Greetings,
> 
> * Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
>> On 2021-Jul-13, Stephen Frost wrote:
>>> The simplest possible format isn't going to work with all the different
>>> pg_dump options and it still isn't going to be 'simple' since it needs
>>> to work with the flexibility that we have in what we support for object
>>> names,
>>
>> That's fine.  If people want a mechanism that allows changing the other
>> pg_dump options that are not related to object filtering, they can
>> implement a configuration file for that.
> 
> It's been said multiple times that people *do* want that and that they
> want it to all be part of this one file, and specifically that they
> don't want to end up with a file structure that actively works against
> allowing other options to be added to it.
> 

I have no problem believing some people want to be able to specify 
pg_dump parameters in a file, similarly to IMPDP/EXPDP parameter files 
etc. That seems useful, but I doubt they considered the case with many 
filter rules ... which is what "my people" want.

Not sure how keeping the filter rules in a separate file (which I assume 
is what you mean by "file structure"), with a format tailored for filter 
rules, works *actively* against adding options to the "main" config.

I'm not buying the argument that keeping some of the stuff in a separate 
file is an issue - plenty of established tools do that, the concept of 
"including" a config is not a radical new thing, and I don't expect we'd 
have many options supported by a file.

In any case, I think user input is important, but ultimately it's up to 
us to reconcile the conflicting requirements coming from various users 
and come up with a reasonable compromise design.

>>> I don't know that the options that I suggested previously would
>>> definitely work or not but they at least would allow other projects like
>>> pgAdmin to leverage existing code for parsing and generating these
>>> config files.
>>
>> Keep in mind that this patch is not intended to help pgAdmin
>> specifically.  It would be great if pgAdmin uses the functionality
>> implemented here, but if they decide not to, that's not terrible.  They
>> have survived decades without a pg_dump configuration file; they still
>> can.
> 
> The adding of a config file for pg_dump should specifically be looking
> at pgAdmin as the exact use-case for having such a capability.
> 
>> There are several votes in this thread for pg_dump to gain functionality
>> to filter objects based on a simple specification -- particularly one
>> that can be written using shell pipelines.  This patch gives it.
> 
> And several votes for having a config file that supports, or at least
> can support in the future, the various options which pg_dump supports-
> and active voices against having a new file format that doesn't allow
> for that.
> 

IMHO the whole "problem" here stems from the question of whether there
should be a single universal pg_dump config file, containing everything 
including the filter rules. I'm of the opinion it's better to keep the 
filter rules separate, mainly because:

1) simplicity - Options (key/value) and filter rules (with more internal 
structure) seem quite different, and mixing them in the same file will 
just make the format more complex.

2) flexibility - Keeping the filter rules in a separate file makes it 
easier to reuse the same set of rules with different pg_dump configs, 
specified in (much smaller) config files.

So in principle, the "main" config could use e.g. TOML or whatever we 
find most suitable for this type of key/value config file (or we could 
just use the same format as for postgresql.conf et al). And the filter 
rules could use something as simple as CSV (yes, I know it's not great,
but there are plenty of parsers, it handles multi-line strings, etc.).
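
To make the CSV idea a bit more concrete, here is a rough sketch of what
reading such a filter file could look like. It is written in Python purely
for illustration. The three-column layout (action, object type, pattern),
the sample rules and the parse_filter_rules() helper are all assumptions
made up for this example; none of it is the format the patch implements.
The point is only that a stock CSV parser already copes with quoted
identifiers that contain commas, double quotes or newlines.

```python
import csv
import io

# Hypothetical rule layout (an assumption, not the patch's format):
#   action,object_type,pattern
# The second rule has a comma and double quotes inside the name, the
# third one has a newline inside the quoted field.
SAMPLE_RULES = '''include,table,public.orders
exclude,table,"public.""Weird,Name"""
include,table,"public.""multi
line"""
'''

VALID_ACTIONS = {"include", "exclude"}


def parse_filter_rules(text):
    """Parse CSV filter rules into (action, object_type, pattern) tuples."""
    rules = []
    # newline="" leaves line endings untouched, so the csv module can
    # handle newlines embedded in quoted fields itself.
    for row in csv.reader(io.StringIO(text, newline="")):
        if not row:
            continue  # skip blank lines
        if len(row) != 3:
            raise ValueError(f"expected 3 fields, got {len(row)}: {row!r}")
        action, object_type, pattern = row
        if action not in VALID_ACTIONS:
            raise ValueError(f"unknown action: {action!r}")
        rules.append((action, object_type, pattern))
    return rules


if __name__ == "__main__":
    for rule in parse_filter_rules(SAMPLE_RULES):
        print(rule)
```

Since the rules live in their own file, the same file could be fed to
several pg_dump invocations that differ only in their (much smaller)
key/value configs.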


>>> I'm not completely against inventing something new, but I'd really
>>> prefer that we at least try to make something existing work first
>>> before inventing something new that everyone is going to have to deal
>>> with.
>>
>> That was discussed upthread and led nowhere.
> 
> You're right- no one followed up on that.  Instead, one group continues
> to push for 'simple' and to just accept what's been proposed, while
> another group counters that we should be looking at the broader design
> question and work towards a solution which will work for us down the
> road, and not just right now.
> 

I have a fairly thick skin, but I have to admit I rather dislike how this
paints the people arguing for simplicity.

IMO simplicity is a perfectly legitimate (and desirable) design feature, 
and simpler solutions often fare better in the long run. Yes, we need to 
look at the broader design, no doubt about that.

> One thing remains clear- there's no consensus here.
> 

True.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


