Thread: COPY CSV keywords
[ discussion moved to hackers.] Peter Eisentraut wrote: > Bruce Momjian wrote: > > I have applied the attached patch that complete TODO item: > > > > o -Allow dump/load of CSV format > > > > This adds new keywords to COPY and \copy: > > > > CSV - enable CSV mode > > QUOTE - specify quote character > > ESCAPE - specify escape character > > FORCE - force quoting of specified columns > > FORCE QUOTE > > > LITERAL - prevent NULL checks for specific columns > > NO NULL CHECK I considered going this direction, but it broke the WITH clause style of COPY. Previously it was "WITH keyword value". Now it is also "WITH keyword value, value" too. This would add "WITH keyword keyword value, value". It would change:COPY pg_language TO '/tmp/x' WITH CSV FORCE lanname LITERAL lanacl to:COPY pg_language TO '/tmp/x' WITH CSV FORCE QUOTE lanname NO NULL CHECK lanacl If folks like it, I can make the change. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > Peter Eisentraut wrote: > >> Bruce Momjian wrote: > >>> LITERAL - prevent NULL checks for specific columns > >> > >> NO NULL CHECK > > > I considered going this direction, but it broke the WITH clause style of > > COPY. > > Huh? Many of the other options are already potentially two words, > eg QUOTE [AS], so I don't see how you can object to that aspect. AS was a noise word, while the new keywords would not be. > I agree with Peter that "LITERAL" is a completely unhelpful name for the > functionality. Yep. > "NO NULL CHECK" seems a bit confusing as well --- at first glance it > looks like it might mean overriding the column's NOT NULL attribute > (which I trust is not what it means ;-)). Perhaps "NO NULLS" ? Right. It doesn't affect the constraint. > Or just "NOT NULL", to avoid introducing another keyword. Yes, I liked that, but to me it looked like we were saying the column was NOT NULL, which isn't the meaning. You could almost call it QUOTE NULL because it is actually quoting the NULL strings in those columns. > I liked the "FORCE QUOTE" suggestion, too. OK. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Peter Eisentraut wrote: >> Bruce Momjian wrote: >>> LITERAL - prevent NULL checks for specific columns >> >> NO NULL CHECK > I considered going this direction, but it broke the WITH clause style of > COPY. Huh? Many of the other options are already potentially two words, eg QUOTE [AS], so I don't see how you can object to that aspect. I agree with Peter that "LITERAL" is a completely unhelpful name for the functionality. "NO NULL CHECK" seems a bit confusing as well --- at first glance it looks like it might mean overriding the column's NOT NULL attribute (which I trust is not what it means ;-)). Perhaps "NO NULLS" ? Or just "NOT NULL", to avoid introducing another keyword. I liked the "FORCE QUOTE" suggestion, too. regards, tom lane
I wrote: > > The problem is that using QUOTE or NULL in these phrases might look > confusing, e.g. > COPY mytable TO 'mytable.csv' CSV QUOTE '"' FORCE QUOTE field1,field2; > COPY mytable FROM 'mytable.csv' NULL '' CSV NO NULL CHECK field1,field2; > > I also don't think NO NULL CHECK actually matches the use case for > this very well (and I'm dubious about LITERAL too). They both > describe the actual behaviour, but not what you are trying to achieve. > Essentially, this would be used when you have a field with a NOT NULL > constraint, but the input CSV data stream has what would otherwise be > considered nulls. (COPY itself will never produce such a CSV, as > non-null values that resemble null are always quoted, but third party > programs well might.) So an alternative might be FORCE NOT NULL, but > for the previous consideration. Perhaps use of an optional preposition > might make things slightly clearer, e.g.: > > COPY mytable TO 'mytable.csv' CSV QUOTE '"' FORCE QUOTE IN > field1,field2; > COPY mytable FROM 'mytable.csv' NULL '' CSV FORCE NOT NULL IN > field1,field2; > > But it does start to look a little too much like COBOL . > > So I'm interested to see if there are any other inspirations people have. The other alternative for the NOT NULL side would be to abandon it and tell users they would need to use a trigger. That requires a little more work from them, but would work in the general case, whereas this one is likely to fail on everything but a text-and-friends column. Example: andrew=# create table b(a int not null default 0); CREATE TABLE andrew=# create function bnull() returns trigger language plpgsql as $$ andrew$# begin andrew$# if NEW.a is null then NEW.a := 0; end if; andrew$# return NEW; andrew$# end; andrew$# $$; CREATE FUNCTION andrew=# create trigger btrigger before insert on b FOR EACH ROW EXECUTE PROCEDURE bnull(); CREATE TRIGGER andrew=# copy b from stdin csv; Enter data to be copied followed by a newline. End with a backslash and a period on a line by itself.>>>> \. andrew=# select * from b andrew-# ;a ---0 (1 row) But there isn't a reasonable alternative to making the user make choices on the output side (see previous discussion regarding zip codes). cheers andrew
Bruce Momjian wrote: >[ discussion moved to hackers.] > >Peter Eisentraut wrote: > > >>Bruce Momjian wrote: >> >> >>>I have applied the attached patch that complete TODO item: >>> >>> o -Allow dump/load of CSV format >>> >>>This adds new keywords to COPY and \copy: >>> >>> CSV - enable CSV mode >>> QUOTE - specify quote character >>> ESCAPE - specify escape character >>> FORCE - force quoting of specified columns >>> >>> >>FORCE QUOTE >> >> >> >>> LITERAL - prevent NULL checks for specific columns >>> >>> >>NO NULL CHECK >> >> > >I considered going this direction, but it broke the WITH clause style of >COPY. Previously it was "WITH keyword value". Now it is also "WITH >keyword value, value" too. This would add "WITH keyword keyword value, >value". > >It would change: > > COPY pg_language TO '/tmp/x' WITH CSV FORCE lanname > LITERAL lanacl >to: > COPY pg_language TO '/tmp/x' WITH CSV FORCE QUOTE lanname NO > NULL CHECK lanacl > >If folks like it, I can make the change. > > > [reposting my comments from patches] The problem is that using QUOTE or NULL in these phrases might look confusing, e.g. COPY mytable TO 'mytable.csv' CSV QUOTE '"' FORCE QUOTE field1,field2; COPY mytable FROM 'mytable.csv' NULL'' CSV NO NULL CHECK field1,field2; I also don't think NO NULL CHECK actually matches the use case for this very well (and I'm dubious about LITERAL too). They both describe the actual behaviour, but not what you are trying to achieve. Essentially, this would be used when you have a field with a NOT NULL constraint, but the input CSV data stream has what would otherwise be considered nulls. (COPY itself will never produce such a CSV, as non-null values that resemble null are always quoted, but third party programs well might.) So an alternative might be FORCE NOT NULL, but for the previous consideration. Perhaps use of an optional preposition might make things slightly clearer, e.g.: COPY mytable TO 'mytable.csv' CSV QUOTE '"' FORCE QUOTE IN field1,field2; COPY mytable FROM 'mytable.csv' NULL '' CSV FORCE NOT NULL IN field1,field2; But it does start to look a little too much like COBOL . So I'm interested to see if there are any other inspirations people have. cheers andrew