Re: CSV mode option for pg_dump - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: CSV mode option for pg_dump
Date
Msg-id 200606141247.k5ECltn03581@candle.pha.pa.us
Whole thread Raw
In response to Re: CSV mode option for pg_dump  ("Andrew Dunstan" <andrew@dunslane.net>)
Responses Re: CSV mode option for pg_dump  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
I agree with Andrew.  If there are bugs in CSV, then we should fix them,
not avoid give users this usability improvement.  I know I have analyzed
the code and fixed a few problems with it in the past.

As far as pg_dump, I am thinking the most flexible solution would be to
allow an arbitrary WITH clause to be added to COPY, like
pg_dump --copy-with "CSV"

This would allow not only CSV control, but specification of non-CSV
options if desired.

---------------------------------------------------------------------------

Andrew Dunstan wrote:
> Tom Lane said:
> > Andrew Dunstan <andrew@dunslane.net> writes:
> >> This is just nonsense. There is not the slightest reason that CSV data
> >>  cannot be embedded in a text dump nor exist as the data members of a
> >> custom or tar dump with the corresponding COPY statements modified
> >> accordingly.
> >
> > Well, the really *core* question here is whether we trust the stability
> > of the CSV format definition (and code) enough to want to rely on it
> > for data dump/restore purposes.  I'm still a few years away from that,
> > myself.  AFAICT the raison d'etre of the CSV code is "emit whatever it
> > takes to satisfy this, that, and the other broken Microsoft
> > application". That's fine as an export tool, but as a dump/reload tool,
> > nyet.  If you put it in pg_dump you're just handing neophytes another
> > foot-gun.
> >
> 
> 
> Well, I'm staggered. Really.
> 
> The CSV definition and its lack of formality is a complete red herring in
> this, as are references to Microsoft.
> 
> The real issue in what you say here is your suggestion that we might not be
> able to reproduce the input in some cases via a COPY CSV round trip. If that
> is so it's a bug of our (probably my) making, and must be fixed. I assert
> that it is not so. In fact all the tests I did during development and since
> were premised on recovering the input exactly. The only CSV option that
> breaks things in that respect is FORCE NOT NULL, and it is designed for data
> coming in from a non Postgres source, so it's not relevant here. Even FORCE
> QUOTE won't break things because it never quotes a null value, and the only
> semantic significance to us of quoting is the null-ness of the value.
> 
> If the code is broken then it should be discoverable by test or code
> analysis. There is no need to refer to any other application or standard. So
> if you or anyone think there is a case that will not reproduce the data
> exactly when the same CSV options are used for output and input, I challenge
> you or them to provide a single example.
> 
> You say you're a few years away from trusting the code. Well, it's not so
> huge that it's beyond analysis, and I'll be happy to explain anything that
> puzzles you. Perhaps more importantly, it has been in use now for 18 months.
> We discovered one problem with embedded line feeds very early in the 8.0
> release cycle, and fixed it. After that I have not heard of a single
> problem. And I assure you this code is widely used.
> 
> It probably isn't used much as a round trip mechanism, probably in part
> because we haven't provided it as a pg_dump option. So maybe we have a
> chicken/egg scenario here. We do have some round trip regression tests in
> the copy test, and those can be beefed up if necessary to increase your
> confidence level.
> 
> I'm happy to debate details, but general assertions of "we can't trust this
> code" don't seem worth much to me.
> 
> cheers
> 
> andrew
> 
> 
> 
> 
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
> 

--  Bruce Momjian   http://candle.pha.pa.us EnterpriseDB    http://www.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


pgsql-hackers by date:

Previous
From: "Andrew Dunstan"
Date:
Subject: Re: CSV mode option for pg_dump
Next
From: Bruce Momjian
Date:
Subject: Re: [PATCHES] PL/pgSQL: SELECT INTO EXACT