Re: proposal: possibility to read dumped table's name from file - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: proposal: possibility to read dumped table's name from file
Date
Msg-id CAFj8pRAfLCnPhfGBWmk6=MyDDXd1rYbPu_37w3W+=spErYxo9g@mail.gmail.com
In response to Re: proposal: possibility to read dumped table's name from file  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers


On Fri, 27 Nov 2020 at 19:45, Stephen Frost <sfrost@snowman.net> wrote:
Greetings,

* Pavel Stehule (pavel.stehule@gmail.com) wrote:
> > I agree that being able to configure pg_dump via a config file would
> > be very useful, but the syntax proposed here feels much more like a
> > hacked-up syntax designed to meet this one use case, rather than a
> > good general-purpose design that can be easily extended.
>
> Nobody sent a real use case for introducing the config file. There was a
> discussion about formats, and you introduce other dimensions and
> variability.

I'm a bit baffled by this because it seems abundantly clear to me that
being able to have a config file for pg_dump would be extremely helpful.
There's no shortage of times that I've had to hack up a shell script and
figure out quoting and set up the right set of options for pg_dump,
resulting in things like:

pg_dump \
  --host=myserver.com \
  --username=postgres \
  --schema=public \
  --schema=myschema \
  --no-comments \
  --no-tablespaces \
  --file=somedir \
  --format=d \
  --jobs=5

which really is pretty grotty.  Being able to have a config file that
has proper comments would be much better and we could start to extend to
things like "please export schema A to directory A, schema B to
directory B" and other ways of selecting source and destination, and
imagine if we could validate it too, eg:

pg_dump --config=whatever --dry-run

or --check-config maybe.

This isn't a new concept either: export and import tools for other
databases have similar support, eg: Oracle's imp/exp tool, mysqldump
(see: https://dev.mysql.com/doc/refman/8.0/en/option-files.html which
has a TOML-looking format too), pgloader of course has a config file,
etc.  We certainly aren't in novel territory here.

Still, I am not a fan of this. pg_dump is a simple tool for simple purposes. It is not pgloader or any other ETL tool. That could change in the future, but why should it? And there will always be the question of whether pg_dump is a good foundation for massive enhancement in the ETL direction. Development in C is expensive and pg_dump is too Postgres-specific, so I cannot imagine that pg_dump will be used directly for complex tasks that would require special configuration. Since we have pgloader, we don't need to move pg_dump in the pgloader direction.

Anyway, the new patch allows storing any options (one per line) with optional comments (anywhere in a line), and arguments can span multiple lines. It places no additional demands on memory or CPU.
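
To make the idea concrete, here is a minimal sketch of how a file in that style could be read. It is not the patch's code: the message does not show the actual syntax, so the file contents, the shell-like quoting rules, the `#` comment character, and the `read_options` and `build_command` helpers are all assumptions made for illustration. The sketch relies on Python's standard `shlex` module.

```python
import shlex

# Hypothetical options file: one option per line, '#' comments anywhere in a
# line, and a quoted argument that continues onto the next line.
# The options themselves are ordinary pg_dump command-line options.
EXAMPLE = '''\
# dump two schemas in directory format
--host=myserver.com
--schema=public      # trailing comment
--schema=myschema
--table="my
table"
--format=d
--jobs=5
'''


def read_options(text):
    """Split options-file text into an argument list.

    shlex.split(..., comments=True) drops everything from an unquoted '#'
    to the end of the line and lets a quoted argument span several lines.
    """
    return shlex.split(text, comments=True)


def build_command(text, program="pg_dump"):
    """Prepend the program name so the list could be handed to subprocess.run."""
    return [program, *read_options(text)]


if __name__ == "__main__":
    for arg in build_command(EXAMPLE):
        print(repr(arg))
```

With `shlex`, a `#` inside a quoted argument stays literal; whether the patch behaves the same way is not stated in this message.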

Regards

Pavel
 

Thanks,

Stephen
