Thread: COPY CSV keywords

COPY CSV keywords

From
Bruce Momjian
Date:
[ discussion moved to hackers.]

Peter Eisentraut wrote:
> Bruce Momjian wrote:
> > I have applied the attached patch that complete TODO item:
> >
> >         o -Allow dump/load of CSV format
> >
> > This adds new keywords to COPY and \copy:
> >
> >         CSV - enable CSV mode
> >         QUOTE - specify quote character
> >         ESCAPE - specify escape character
> >         FORCE - force quoting of specified columns
> 
> FORCE QUOTE
> 
> >         LITERAL - prevent NULL checks for specific columns
> 
> NO NULL CHECK

I considered going this direction, but it broke the WITH clause style of
COPY.  Previously it was "WITH keyword value".  Now it is also "WITH
keyword value, value" too.  This would add "WITH keyword keyword value,
value".

It would change:COPY pg_language TO '/tmp/x' WITH CSV FORCE lanname     LITERAL lanacl
to:COPY pg_language TO '/tmp/x' WITH CSV FORCE QUOTE lanname NO    NULL CHECK lanacl

If folks like it, I can make the change.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: COPY CSV keywords

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Peter Eisentraut wrote:
> >> Bruce Momjian wrote:
> >>> LITERAL - prevent NULL checks for specific columns
> >> 
> >> NO NULL CHECK
> 
> > I considered going this direction, but it broke the WITH clause style of
> > COPY.
> 
> Huh?  Many of the other options are already potentially two words,
> eg QUOTE [AS], so I don't see how you can object to that aspect.

AS was a noise word, while the new keywords would not be.

> I agree with Peter that "LITERAL" is a completely unhelpful name for the
> functionality.

Yep.

> "NO NULL CHECK" seems a bit confusing as well --- at first glance it
> looks like it might mean overriding the column's NOT NULL attribute
> (which I trust is not what it means ;-)).  Perhaps "NO NULLS" ?

Right.  It doesn't affect the constraint.

> Or just "NOT NULL", to avoid introducing another keyword.

Yes, I liked that, but to me it looked like we were saying the column
was NOT NULL, which isn't the meaning.  You could almost call it QUOTE
NULL because it is actually quoting the NULL strings in those columns.

> I liked the "FORCE QUOTE" suggestion, too.

OK.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: COPY CSV keywords

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Peter Eisentraut wrote:
>> Bruce Momjian wrote:
>>> LITERAL - prevent NULL checks for specific columns
>> 
>> NO NULL CHECK

> I considered going this direction, but it broke the WITH clause style of
> COPY.

Huh?  Many of the other options are already potentially two words,
eg QUOTE [AS], so I don't see how you can object to that aspect.

I agree with Peter that "LITERAL" is a completely unhelpful name for the
functionality.

"NO NULL CHECK" seems a bit confusing as well --- at first glance it
looks like it might mean overriding the column's NOT NULL attribute
(which I trust is not what it means ;-)).  Perhaps "NO NULLS" ?
Or just "NOT NULL", to avoid introducing another keyword.

I liked the "FORCE QUOTE" suggestion, too.
        regards, tom lane


Re: COPY CSV keywords

From
Andrew Dunstan
Date:
I wrote:

>
> The problem is that using QUOTE or NULL in these phrases might look 
> confusing, e.g.
>  COPY mytable TO 'mytable.csv' CSV QUOTE '"' FORCE QUOTE field1,field2;
>  COPY mytable FROM 'mytable.csv' NULL '' CSV NO NULL CHECK field1,field2;
>
> I also don't think NO NULL CHECK actually matches the use case for 
> this very well (and I'm dubious about LITERAL too).  They both 
> describe the actual behaviour, but not what you are trying to achieve. 
> Essentially, this would be used when you have a field with a NOT NULL 
> constraint, but the input CSV data stream has what would otherwise be 
> considered nulls. (COPY itself will never produce such a CSV, as 
> non-null values that resemble null are always quoted, but third party 
> programs well might.) So an alternative might be FORCE NOT NULL, but 
> for the previous consideration. Perhaps use of an optional preposition 
> might make things slightly clearer, e.g.:
>
>  COPY mytable TO 'mytable.csv' CSV QUOTE '"' FORCE QUOTE  IN 
> field1,field2;
>  COPY mytable FROM 'mytable.csv' NULL '' CSV FORCE NOT NULL IN 
> field1,field2;
>
> But it does start to look a little too much like COBOL .
>
> So I'm interested to see if there are any other inspirations people have.


The other alternative for the NOT NULL side would be to abandon it and 
tell users they would need to use a trigger. That requires a little more 
work from them, but would work in the general case, whereas this one is 
likely to fail on everything but a text-and-friends column. Example:

andrew=# create table b(a int not null default 0);
CREATE TABLE
andrew=# create function bnull() returns trigger language plpgsql as $$
andrew$# begin
andrew$# if NEW.a is null then NEW.a := 0; end if;
andrew$# return NEW;
andrew$# end;
andrew$# $$;
CREATE FUNCTION
andrew=# create trigger btrigger before insert on b FOR EACH ROW EXECUTE 
PROCEDURE bnull();
CREATE TRIGGER
andrew=# copy b from stdin csv;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.>>>> \.
andrew=# select * from b
andrew-# ;a
---0
(1 row)


But there isn't a reasonable alternative to making the user make choices 
on the output side (see previous discussion regarding zip codes).

cheers

andrew


Re: COPY CSV keywords

From
Andrew Dunstan
Date:
Bruce Momjian wrote:

>[ discussion moved to hackers.]
>
>Peter Eisentraut wrote:
>  
>
>>Bruce Momjian wrote:
>>    
>>
>>>I have applied the attached patch that complete TODO item:
>>>
>>>        o -Allow dump/load of CSV format
>>>
>>>This adds new keywords to COPY and \copy:
>>>
>>>        CSV - enable CSV mode
>>>        QUOTE - specify quote character
>>>        ESCAPE - specify escape character
>>>        FORCE - force quoting of specified columns
>>>      
>>>
>>FORCE QUOTE
>>
>>    
>>
>>>        LITERAL - prevent NULL checks for specific columns
>>>      
>>>
>>NO NULL CHECK
>>    
>>
>
>I considered going this direction, but it broke the WITH clause style of
>COPY.  Previously it was "WITH keyword value".  Now it is also "WITH
>keyword value, value" too.  This would add "WITH keyword keyword value,
>value".
>
>It would change:
>    
>    COPY pg_language TO '/tmp/x' WITH CSV FORCE lanname 
>        LITERAL lanacl
>to:
>    COPY pg_language TO '/tmp/x' WITH CSV FORCE QUOTE lanname NO
>        NULL CHECK lanacl
>
>If folks like it, I can make the change.
>
>  
>
[reposting my comments from patches]

The problem is that using QUOTE or NULL in these phrases might look 
confusing, e.g. COPY mytable TO 'mytable.csv' CSV QUOTE '"' FORCE QUOTE field1,field2; COPY mytable FROM 'mytable.csv'
NULL'' CSV NO NULL CHECK field1,field2;
 

I also don't think NO NULL CHECK actually matches the use case for this 
very well (and I'm dubious about LITERAL too).  They both describe the 
actual behaviour, but not what you are trying to achieve. Essentially, 
this would be used when you have a field with a NOT NULL constraint, but 
the input CSV data stream has what would otherwise be considered nulls. 
(COPY itself will never produce such a CSV, as non-null values that 
resemble null are always quoted, but third party programs well might.) 
So an alternative might be FORCE NOT NULL, but for the previous 
consideration. Perhaps use of an optional preposition might make things 
slightly clearer, e.g.:
 COPY mytable TO 'mytable.csv' CSV QUOTE '"' FORCE QUOTE  IN 
field1,field2; COPY mytable FROM 'mytable.csv' NULL '' CSV FORCE NOT NULL IN 
field1,field2;

But it does start to look a little too much like COBOL .

So I'm interested to see if there are any other inspirations people have.

cheers

andrew