Re: [PATCHES] allow CSV quote in NULL - Mailing list pgsql-hackers

From Andrew Dunstan
Subject Re: [PATCHES] allow CSV quote in NULL
Date
Msg-id 46AA1747.4030908@dunslane.net
Responses Re: [PATCHES] allow CSV quote in NULL  (Stephen Frost <sfrost@snowman.net>)
Re: [PATCHES] allow CSV quote in NULL  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
[redirecting to -hackers]

Stephen Frost wrote:
> * Gregory Stark (stark@enterprisedb.com) wrote:
>   
>> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>>
>>     
>>> Stephen Frost <sfrost@snowman.net> writes:
>>>       
>>>>   Please find attached a minor patch to remove the constraints that a
>>>>   user can't include the delimiter or quote characters in a 'NULL AS'
>>>>   string when importing CSV files.
>>>>         
>>> This can't really be sane can it?

Not very, no :-)

>> The alternative would be interpreting NULL strings after dequoting but that
>> would leave no way to include the NULL string literally. This solution means
>> there's no way to include it (if it needs quoting) but only when you specify
>> it this way.
>>     
>
> Yeah, interpreting NULLs after dequoting means you've lost the
> information about if it's quoted or not, or you have to add some funky
> syntax to say "if it's quoted, do it differently...", which is no good,
> imv.
>
> What the patch does basically is say "give us the exact string that
> shows up between the unquoted delimiters that you want to be treated
> as a NULL."  This removes the complexity of the question about quoting,
> unquoting, whatever, and makes it a very clear-cut, straight-forward
> solution with no impact on existing users, imv.
>
>     
>   
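
To make the quoted description concrete, here is a minimal Python sketch of
"match the exact raw text between unquoted delimiters against the NULL
string". It is an illustration only, not the patch and not PostgreSQL's COPY
code; the helper names (split_raw_fields, dequote, parse_line) are
hypothetical, and quoting is handled only for fields that are quoted as a
whole.

```python
def split_raw_fields(line, delimiter=",", quote='"'):
    """Split a CSV line on delimiters that are outside quotes.

    The quote characters are kept, so each field is returned exactly as
    it appears in the input (hypothetical helper, simplified rules).
    """
    fields, current, in_quotes = [], [], False
    for ch in line:
        if ch == quote:
            # A doubled quote toggles twice, leaving the state unchanged.
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == delimiter and not in_quotes:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields


def dequote(raw, quote='"'):
    """Strip surrounding quotes and collapse doubled quotes."""
    if len(raw) >= 2 and raw[0] == quote and raw[-1] == quote:
        return raw[1:-1].replace(quote * 2, quote)
    return raw


def parse_line(line, null_string, delimiter=",", quote='"'):
    """Compare the raw field text to null_string before any dequoting."""
    return [
        None if raw == null_string else dequote(raw, quote)
        for raw in split_raw_fields(line, delimiter, quote)
    ]
```

With a NULL string of '""', parse_line('1,"",abc', '""') treats the second
field as null. With the usual empty NULL string, the same field is the empty
string and only a truly empty field is null.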

This looks too clever by half, to me. Someone facing the problem you are 
facing would have to dig quite deep to find the solution you're promoting.

A much better way, IMNSHO, would be to add an extra FORCE switch. On
input, FORCE NOT NULL says to treat an unquoted null string as the
literal value rather than as a null field, for the columns named. The
reverse would be to tell it to treat a quoted null string as null rather
than as the literal value, for the named columns. Perhaps that should
just be "FORCE NULL columnlist". It would be more explicit, and at the
same time it would apply only to the named columns, rather than totally
discarding the ability to distinguish between null and not-null values.
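
As a rough sketch of the per-column rule described above, the decision for a
single input field could look like the following Python. This only
illustrates the proposed semantics and is not COPY's implementation; the
function name interpret_field and its parameters are hypothetical, and it
assumes the parser has already dequoted the field and remembered whether it
was quoted.

```python
def interpret_field(text, quoted, column, null_string="",
                    force_not_null=frozenset(), force_null=frozenset()):
    """Return None for a null field, otherwise the field's text.

    text           -- field contents after dequoting
    quoted         -- True if the field was quoted in the input
    column         -- name of the column being loaded
    force_not_null -- columns where an unquoted null string stays literal
    force_null     -- columns where a quoted null string becomes null
                      (the proposed "FORCE NULL columnlist")
    """
    if text != null_string:
        return text  # not the null string: always a literal value
    if quoted:
        # Default: a quoted null string is the literal value.
        return None if column in force_null else text
    # Default: an unquoted null string is a null field.
    return text if column in force_not_null else None
```

For example, with the default empty null string, interpret_field("", True,
"a", force_null={"a"}) gives a null for column a, while the same quoted field
in a column that is not listed stays the empty string.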

This should probably be discussed on -hackers, anyway.



cheers

andrew

