Re: Patch: FORCE_NULL option for copy COPY in CSV mode - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Patch: FORCE_NULL option for copy COPY in CSV mode
Date
Msg-id CAA4eK1KbkGvnwFv-E7uw7zQveAaRweW3m5k1W8nmtOYhkBFbgg@mail.gmail.com
Whole thread Raw
In response to Re: Patch: FORCE_NULL option for copy COPY in CSV mode  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Patch: FORCE_NULL option for copy COPY in CSV mode
List pgsql-hackers
On Wed, Oct 9, 2013 at 9:15 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Tue, Oct 8, 2013 at 6:03 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>>
>> On 10/07/2013 11:34 PM, Amit Kapila wrote:
>>>
>>> On Tue, Oct 8, 2013 at 12:55 AM, Andrew Dunstan <andrew@dunslane.net>
>>> wrote:
>>>>
>>>> On 10/07/2013 03:06 PM, Robert Haas wrote:
>>>>>
>>>>>
>>>>>> Also if your use case is to treat empty strings as NULL (as per above
>>>>>> documentation), can't it be handled with "WITH NULL AS" option.
>>>>>> For example, something like:
>>>>>>
>>>>>> postgres=# COPY testnull FROM stdin with CSV NULL AS E'';
>>>>>> Enter data to be copied followed by a newline.
>>>>>> End with a backslash and a period on a line by itself.
>>>>>>>>
>>>>>>>> 50,
>>>>>>>> \.
>>>>>>
>>>>>> postgres=# select * from testnull;
>>>>>>    a  |  b
>>>>>> ----+------
>>>>>>    50 | NULL
>>>>>> (1 row)
>>>>>
>>>>> Good point.  If this patch is just implementing something that can
>>>>> already be done with another syntax, we don't need it.
>>>>>
>>>>
>>>> Isn't the point of this option to allow a *quoted* empty string to be
>>>> forced
>>>> to NULL? If so, this is not testing the same case - in fact the COPY
>>>> command
>>>> above just makes explicit the default CSV NULL setting anyway.
>>>
>>> I am really not sure if all the purpose of patch can be achieved by
>>> existing syntax, neither it is explained clearly.
>>> However the proposal hasn't discussed why it's not good idea to extend
>>> some similar syntax "COPY .. NULL" which is used to replace string
>>> with NULL's?
>>> Description of NULL says: "Specifies the string that represents a null
>>> value."
>>> Now why can't this syntax be extended to support quoted empty string
>>> if it's not supported currently?
>>> I have not checked completely, If it's difficult or not possible to
>>> support in existing syntax, then even it add's more value to introduce
>>> new syntax.
>>>
>>> By asking above question, I doesn't mean that we should not go for the
>>> new proposed syntax, rather it's to know and understand the benefit of
>>> new syntax, also it helps during CF review for reviewer's if the
>>> proposal involves new syntax and that's discussed previously.
>>>
>>
>> Quite apart from any other consideration, this suggestion is inferior to
>> what's proposed in that it's an all or nothing deal, while the patch allows
>> you to specify the behaviour very explicitly on a per column basis. I can
>> well imagine wanting to be able to force a quoted empty string to null for
>> numeric fields but not for text.
   Okay, but can't it be done by extending current syntax such as   NULL FOR <col1, col2> AS ""- which would mean it
willreplace
 
corresponding column's values as NULL if they contain empty string.

>> The basic principal of our CSV processing is that we don't ever turn a NULL
>> into something quoted and we don't ever turn something quoted into NULL.
>> That's what lets us round-trip test just about every combination of options.
>> I'm only going to be happy violating that, as this patch does, in a very
>> explicit and controlled way.
>
> Will this option allow only quoted empty string to be NULL or will
> handle without quoted empty string as well?
  I had checked patch and it seems for both quoted and unquoted empty
string, it will replace it with NULL.  Now here it's bit different from FORCE_NOT_NULL, because already
for un-quoted empty strings we have a way to replace them with NULL.
  I think having 2 different syntax for replacing empty strings (one
for quoted and another for un-quoted) as NULL's might not be best way
to accomplish  this feature.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: docs: clarify references to md5 hash and md5 crypt in pgcrypto docs
Next
From: Andrew Dunstan
Date:
Subject: Re: Patch: FORCE_NULL option for copy COPY in CSV mode