Re: Updated COPY CSV patch - Mailing list pgsql-patches
From | Bruce Momjian |
---|---|
Subject | Re: Updated COPY CSV patch |
Date | |
Msg-id | 200404141913.i3EJDNi02660@candle.pha.pa.us Whole thread Raw |
In response to | Re: Updated COPY CSV patch (Andrew Dunstan <andrew@dunslane.net>) |
Responses |
Plan for CSV handling of quotes, NULL
|
List | pgsql-patches |
Andrew Dunstan wrote: > Bruce Momjian wrote: > > > > >Do we need control for each column? What if we go with preferring NULL > >for comma-comma, and then print warnings for NOT NULL columns and try > >the promote. If you want comma-comma to be zero-length string, you can > >create the column with NOT NULL, load the file, then ALTER TABLE to > >allow NULL's again. Basically, the NOT NULL specification on the column > >is the COPY CSV control method, rather than having it be in COPY. > > > > > > > > > If we can't do type specific stuff then we need to be able to have > column-specific controls on export, at least. > > Consider a text column containing US 5-digit ZIP codes. If they are not > quoted, a spreadsheet will almost certainly not preserve the leading > zero some of them have, producing very undesirable results. However, a > genuine numeric-type field must not be quoted, or the same spreadsheet > won't see that value as a number. Unless we do stuff based on type, we > have no way of knowing from the text representation of the data what we > really need. Thus my proposal from this morning for column-specific user > control over this aspect. And if we are going to have per column user > control on export, why not on import too, to handle the NOT NULL > problem? It might make life easier for us code-wise than chasing down > nullability (e.g. in domains). Wow, that is certainly an excellent point. When we import, we know the resulting data type, but spreadsheets don't, and rely on the quoting to know what to do with the value. The zipcode is an excellent example. You can't even test for leading zeros because then some spreadsheet values in the column are text and some numeric. We do have a column list capability with COPY already: COPY tablename [ ( column [, ...] ) ] FROM { 'filename' | STDIN } Maybe we should extend that to control quoting on export and NULL handling on import. Does that solve our problems? FYI, do you have IM? I am: AIM bmomjian ICQ 151255111 Yahoo bmomjian MSN root@candle.pha.pa.us IRC bmomjian via FreeNode or EFNet -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
pgsql-patches by date: