Re: Plan for CSV handling of quotes, NULL - Mailing list pgsql-patches

From Bruce Momjian
Subject Re: Plan for CSV handling of quotes, NULL
Msg-id 200404150518.i3F5ISF23435@candle.pha.pa.us
In response to Re: Plan for CSV handling of quotes, NULL  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-patches
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Particularly, how do we identify a numeric and dates?
>
> We don't, and I'm not convinced that we should.  The entire concept is
> suspect in a type-agnostic system.  In particular, I've really got a
> problem with the fact that TypeCategory uses a fixed, nonexpansible
> set of categories.
>
> Even if you try to push ahead with using TypeCategory, I'd lay a side
> bet that it doesn't work.  Will those spreadsheets that make this
> nonstandard assumption about quotes having semantic significance be able
> to cope with all the possible output formats from timestamptz, for
> instance?  How about timetz, which is in the same category?
>
> > The only other thing we could do would be to add something to pg_type
> > that says whether it needs quotes.  Seems like overkill.
>
> Seems like wrong.  The real problem here is that "whether it needs
> quotes" depends on the program you are intending to export to, the
> semantic behavior you want, and likely also the phase of the moon.
> I don't think we can hope to invent a COPY behavior rule that
> automagically gets this right.  What's needed is a tool that can output
> a user-customizable data format, and that seems to me to be outside
> COPY's charter.

Well, certainly one option is to quote everything.  That would import
into everything just fine.
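
For illustration only, here is a minimal sketch of the quote-everything
option.  It uses Python's standard csv module rather than COPY itself, and
the function name to_csv_quote_all is made up for this example:

```python
import csv
import io


def to_csv_quote_all(rows):
    """Render rows as CSV text with every field wrapped in double quotes."""
    buf = io.StringIO()
    # QUOTE_ALL quotes every field, regardless of its type or content.
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


# A number, a date string and a text value all come out quoted.
sample = to_csv_quote_all([[1, "2004-04-15", "hello, world"]])
```

Every consumer can parse this, but a spreadsheet may then load the number
and the date as plain text.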

However, numbers really want to be numbers, and dates want to be dates.
Most spreadsheets treat an unquoted value as a number or date, though you
are right that some apps might get confused.  One idea Andrew had was to
allow the user to specify which fields get quotes and which don't.  I
figured we could just skip quotes for our builtin numeric types, and maybe
dates/times, and quote everything else.  I just checked OpenOffice and it
understands those values.  The other date fields have spaces and stuff
and are just treated as text anyway.
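
A rough sketch of that rule, written in Python instead of the COPY code so
it can be run standalone.  Everything here is an assumption for
illustration, not what the patch does: the names UNQUOTED_TYPES,
format_field and format_row, the choice of which types go bare, and the
rendering of NULL as an empty unquoted field:

```python
import datetime
import decimal

# Hypothetical stand-in for "builtin numeric types, and maybe dates".
UNQUOTED_TYPES = (int, float, decimal.Decimal, datetime.date)


def format_field(value):
    """Format one value: bare for numbers and plain dates, quoted otherwise."""
    if value is None:
        # Assumption: NULL becomes an empty, unquoted field.
        return ""
    # bool is a subclass of int and datetime is a subclass of date;
    # both are excluded here so they fall through to the quoted branch.
    is_bare = (
        isinstance(value, UNQUOTED_TYPES)
        and not isinstance(value, (bool, datetime.datetime))
    )
    if is_bare:
        return str(value)
    # Quote everything else, doubling any embedded double quotes.
    return '"' + str(value).replace('"', '""') + '"'


def format_row(values):
    """Join the formatted fields of one row with commas."""
    return ",".join(format_field(v) for v in values)


row = format_row([
    42,
    decimal.Decimal("3.50"),
    datetime.date(2004, 4, 15),
    datetime.datetime(2004, 4, 15, 5, 18),
    'say "hi"',
    None,
])
```

The timestamp value contains a space, so under this sketch it is quoted
and a spreadsheet would load it as text, as described above.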

Sure, this is a hard problem, but with the patch I am working on, it
gets pretty close with very little code change.  It isn't perfect, but
it is closer than most folks are going to get with some external utility
that will never get written or maintained properly.

We can't let the perfect be the enemy of the good on this one.  Too many
folks ask for it.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
