Re: Patch: FORCE_NULL option for copy COPY in CSV mode - Mailing list pgsql-hackers
From | David Fetter |
---|---|
Subject | Re: Patch: FORCE_NULL option for copy COPY in CSV mode |
Date | |
Msg-id | 20131009172556.GB13993@fetter.org Whole thread Raw |
In response to | Re: Patch: FORCE_NULL option for copy COPY in CSV mode (Andrew Dunstan <andrew@dunslane.net>) |
Responses |
Re: Patch: FORCE_NULL option for copy COPY in CSV mode
|
List | pgsql-hackers |
On Wed, Oct 09, 2013 at 09:52:29AM -0400, Andrew Dunstan wrote: > > On 10/09/2013 09:22 AM, Amit Kapila wrote: > >On Wed, Oct 9, 2013 at 9:15 AM, Amit Kapila <amit.kapila16@gmail.com> wrote: > >>On Tue, Oct 8, 2013 at 6:03 PM, Andrew Dunstan <andrew@dunslane.net> wrote: > >>>On 10/07/2013 11:34 PM, Amit Kapila wrote: > >>>>On Tue, Oct 8, 2013 at 12:55 AM, Andrew Dunstan <andrew@dunslane.net> > >>>>wrote: > >>>>>On 10/07/2013 03:06 PM, Robert Haas wrote: > >>>>>> > >>>>>>>Also if your use case is to treat empty strings as NULL (as per above > >>>>>>>documentation), can't it be handled with "WITH NULL AS" option. > >>>>>>>For example, something like: > >>>>>>> > >>>>>>>postgres=# COPY testnull FROM stdin with CSV NULL AS E''; > >>>>>>>Enter data to be copied followed by a newline. > >>>>>>>End with a backslash and a period on a line by itself. > >>>>>>>>>50, > >>>>>>>>>\. > >>>>>>>postgres=# select * from testnull; > >>>>>>> a | b > >>>>>>>----+------ > >>>>>>> 50 | NULL > >>>>>>>(1 row) > >>>>>>Good point. If this patch is just implementing something that can > >>>>>>already be done with another syntax, we don't need it. > >>>>>> > >>>>>Isn't the point of this option to allow a *quoted* empty string to be > >>>>>forced > >>>>>to NULL? If so, this is not testing the same case - in fact the COPY > >>>>>command > >>>>>above just makes explicit the default CSV NULL setting anyway. > >>>>I am really not sure if all the purpose of patch can be achieved by > >>>>existing syntax, neither it is explained clearly. > >>>>However the proposal hasn't discussed why it's not good idea to extend > >>>>some similar syntax "COPY .. NULL" which is used to replace string > >>>>with NULL's? > >>>>Description of NULL says: "Specifies the string that represents a null > >>>>value." > >>>>Now why can't this syntax be extended to support quoted empty string > >>>>if it's not supported currently? > >>>>I have not checked completely, If it's difficult or not possible to > >>>>support in existing syntax, then even it add's more value to introduce > >>>>new syntax. > >>>> > >>>>By asking above question, I doesn't mean that we should not go for the > >>>>new proposed syntax, rather it's to know and understand the benefit of > >>>>new syntax, also it helps during CF review for reviewer's if the > >>>>proposal involves new syntax and that's discussed previously. > >>>> > >>>Quite apart from any other consideration, this suggestion is inferior to > >>>what's proposed in that it's an all or nothing deal, while the patch allows > >>>you to specify the behaviour very explicitly on a per column basis. I can > >>>well imagine wanting to be able to force a quoted empty string to null for > >>>numeric fields but not for text. > > Okay, but can't it be done by extending current syntax such as > > NULL FOR <col1, col2> AS ""- which would mean it will replace > >corresponding column's values as NULL if they contain empty string. > > > >>>The basic principal of our CSV processing is that we don't ever turn a NULL > >>>into something quoted and we don't ever turn something quoted into NULL. > >>>That's what lets us round-trip test just about every combination of options. > >>>I'm only going to be happy violating that, as this patch does, in a very > >>>explicit and controlled way. > >>Will this option allow only quoted empty string to be NULL or will > >>handle without quoted empty string as well? > > I had checked patch and it seems for both quoted and unquoted empty > >string, it will replace it with NULL. > > Now here it's bit different from FORCE_NOT_NULL, because already > >for un-quoted empty strings we have a way to replace them with NULL. > > > > I think having 2 different syntax for replacing empty strings (one > >for quoted and another for un-quoted) as NULL's might not be best way > >to accomplish > > this feature. > > > > > > > I really don't know what you're saying here. > > Here is the situation we have today (assuming the default null > marker of empty-string): > > default: empty-string -> null, quoted-empty-string -> > emptystring > with force_not_null: empty-string -> emptystring, > quoted-empty-string -> emptystring > > and the proposal would add to that: > > with force-null: empty-string -> null, quoted-empty-string -> null > > So it appears to be quite on all fours with the way force_not_null > works now, it just does the reverse. > > I don't see at all that your suggested alternative has any > advantages over what's been written. If you can say "NULL FOR (foo) > as '""' how will you specify the null for some other column(s)? Idea: NULL FOR (foo,bar,baz,blurf) AS '""', NULL FOR (quux,fleeg) AS ..., > Are we going to have multiple such clauses? It looks like a real > mess. Is that not part of what parsers ordinarily do? Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
pgsql-hackers by date: