Thread: The COPY command and csv files
**I apologize in advance if this is a familiar question, but I couldn't find an answer** I am attempting to copy a csv file into a table, but no matter what I can think of I get one of two errors. I am using Postgresql 8.1.2. 1.) Here is the first COPY command I gave: COPY pwt61_oecd FROM stdin WITH DELIMITER AS ',' CSV QUOTE AS ' " ' NULL AS 'na' ; And here is the error: ERROR: missing data for column "isocode" CONTEXT: COPY pwt61_oecd, line 1 '/home/.../data/pwt61_oecd.csv" The file I am attempting to copy is very large, but there are no empty spaces that I can find, and the column "isocode" certainly has no empty spaces or NULL values. Here are the first two lines of the pwt61_oecd.csv file: "country","isocode","year","POP","XRAT","PPP","cgdp","cc","ci","cg","p","pc","pg","pi","openc","cgnp","csave","y","rgdpl","rgdpch","rgdpeqa","rgdpwok","rgdptt","openk","kc","kg","ki","grgdpch" "Australia","AUS","1950","8434.0650803","0.8931999704","0.515977685","1836.3303946","60.365553134","27.690222887","7.9941116957","57.767319935","53.853430186","73.898773273","61.642604011","47.850011457","na","31.64033517","95.021059091","9113.8072046","9173.8190347","10576.40502","22078.709149","9988.2503259","19.331539711","65.924922543","13.104642709","26.200404435","na" (I attempted the same copy command after changing all "na" to "NULL" and I also tried the same command after removing column headers (1st line) from the csv file, but I received the same error message each time.) 2.) Curious if I would get the same error, I attemtped to use the \copy command in psql, but this time I removed the column headers from the csv file: \copy pwt61_oecd FROM '/home/.../data/pwt61_oecd2.csv' WITH DELIMITER AS ',' CSV QUOTE AS ' " ' NULL AS 'na' And the error report: ERROR: invalid input syntax for type numeric: "na" CONTEXT: COPY pwt61_oecd, line 1, column cgnp: "na" Obviously I haven't used the COPY or \copy functions properly. Or I am using the wrong data type. Could someone please point out my mistake(s)? And on a unrelated note, in a seperate table I would like to import there are exponential numbers (e.g. 3.77E-13). Do I need to convert these to regular numbers if I wish to use the numeric data type? Thank you. -Eric
E. Matthew Finnin wrote: > **I apologize in advance if this is a familiar question, but I couldn't > find an answer** > > I am attempting to copy a csv file into a table, but no matter what I > can think of I get one of two errors. I am using Postgresql 8.1.2. > > 1.) Here is the first COPY command I gave: > COPY pwt61_oecd FROM stdin > WITH DELIMITER AS ',' > CSV QUOTE AS ' " ' Should be '"', not ' " '. In fact, DELIMITER and QUOTE are the defaults, so there is no need to specify them. > NULL AS 'na' > ; > 2.) Curious if I would get the same error, I attemtped to use the > \copy command in psql, but this time I removed the column headers from > the csv file: > > \copy pwt61_oecd FROM '/home/.../data/pwt61_oecd2.csv' WITH DELIMITER > AS ',' CSV QUOTE AS ' " ' NULL AS 'na' > > And the error report: > ERROR: invalid input syntax for type numeric: "na" > CONTEXT: COPY pwt61_oecd, line 1, column cgnp: "na" > Yep, same ' " ' problem. We disallows multi-character delimiters in current CVS, but you might not have the 8.1.X backpatch. If you have continued problems, please send us the table schema and a line that is failing and we can help you get it working. -- Bruce Momjian http://candle.pha.pa.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Currently I have removed all double quotations from my null values ("na" has been changed to na). I have also removed the header line of the csv file I am trying to copy into the table. I have tried various forms of null values and everything else suggested to me, with no luck. Here are the table definitions. I was trying to make it as simple as possible: emf=> \d pwt61_oecd Table "public.pwt61_oecd" Column | Type | Modifiers ---------+---------+----------- country | text | isocode | text | year | numeric | pop | numeric | xrat | numeric | ppp | numeric | cgdp | numeric | cc | numeric | ci | numeric | cg | numeric | p | numeric | pc | numeric | pg | numeric | pi | numeric | openc | numeric | cgnp | numeric | csave | numeric | y | numeric | rgdpl | numeric | rgdpch | numeric | rgdpeqa | numeric | rgdpwok | numeric | rgdptt | numeric | openk | numeric | kc | numeric | kg | numeric | ki | numeric | grgdpch | numeric | This is the error I get after using the copy command: ERROR: missing data for column "isocode" CONTEXT: COPY pwt61_oecd, line 1: This is the first line of the csv file, where the error is occuring: "Australia","AUS","1950","8434.0650803","0.8931999704","0.515977685","1836.3303946","60.365553134","27.690222887","7.9941116957","57.767319935","53.853430186","73.898773273","61.642604011","47.850011457",na,"31.64033517","95.021059091","9113.8072046","9173.8190347","10576.40502","22078.709149","9988.2503259","19.331539711","65.924922543","13.104642709","26.200404435",na If I drop the column 'isocode' from the table and from the csv file, I receive the following error: ERROR: missing data for column "year" CONTEXT: COPY pwt61_test, line 1: Here the first line of the csv file is: "Australia","AUS","1950","8434.0650803","0.8931999704","0.515977685","1836.3303946","60.365553134","27.690222887","7.9941116957","57.767319935","53.853430186","73.898773273","61.642604011","47.850011457",na,"31.64033517","95.021059091","9113.8072046","9173.8190347","10576.40502","22078.709149","9988.2503259","19.331539711","65.924922543","13.104642709","26.200404435",na Notice the error has remained in the second column and it doesn't matter what value it holds or what the data type is. I am at a complete loss. Thank you again for your help. -Eric On 4/14/06, Bruce Momjian <pgman@candle.pha.pa.us> wrote: > E. Matthew Finnin wrote: > > Thank you both for replying. When I wrote ' " ' I meant '"', I was > > trying to make it easier to read for an email, but all I did was > > confuse things. Anyway, I tried dropping the quote and delimiter > > comments from the command, but I still received a missing data error > > on line 1 of the csv file. The data I'm using is all OECD countries > > selected from the Penn World Tables 6.1 website > > (http://pwt.econ.upenn.edu/php_site/pwt61_form.php) and cutting and > > pasting the output into a csv file. > > > > Bruce you mentioned there is a cvs backpatch for version 8.1.x. Maybe > > this is my problem. How do I check if this is installed and, assuming > > it isn't, how do I go about installing it? I've come across scripts > > in the mail lists related to something like this, but I wasn't sure if > > they were final or even for anyone other than developers. If its of > > any interest to anyone, I installed Postgresql 8.1.x as an Kubuntu > > package. > > OK, got it. If you remove the first line, does it work? Also, those > "n/a" are not going to work because we don't allow the NULL indicator to > be in double quotes. Try change "n/a" to n/a and see if that helps. > Also try removing the first line to see if that helps. Also, show use > the table definition you are trying to load into, and the error message, > and a line generating an error. > > I don't think you need anything backpatched. > > -- > Bruce Momjian http://candle.pha.pa.us > EnterpriseDB http://www.enterprisedb.com > > + If your life is a hard drive, Christ can be your backup. + >
On 4/18/06 3:06 AM, "E. Matthew Finnin" <emf.storage@gmail.com> wrote: > Currently I have removed all double quotations from my null values > ("na" has been changed to na). I have also removed the header line of > the csv file I am trying to copy into the table. I have tried various > forms of null values and everything else suggested to me, with no > luck. > > Here are the table definitions. I was trying to make it as simple as > possible: > emf=> \d pwt61_oecd > Table "public.pwt61_oecd" > Column | Type | Modifiers > ---------+---------+----------- > country | text | > isocode | text | > year | numeric | > pop | numeric | > xrat | numeric | > ppp | numeric | > cgdp | numeric | > cc | numeric | > ci | numeric | > cg | numeric | > p | numeric | > pc | numeric | > pg | numeric | > pi | numeric | > openc | numeric | > cgnp | numeric | > csave | numeric | > y | numeric | > rgdpl | numeric | > rgdpch | numeric | > rgdpeqa | numeric | > rgdpwok | numeric | > rgdptt | numeric | > openk | numeric | > kc | numeric | > kg | numeric | > ki | numeric | > grgdpch | numeric | > > This is the error I get after using the copy command: > ERROR: missing data for column "isocode" > CONTEXT: COPY pwt61_oecd, line 1: > > This is the first line of the csv file, where the error is occuring: > "Australia","AUS","1950","8434.0650803","0.8931999704","0.515977685","1836.330 > 3946","60.365553134","27.690222887","7.9941116957","57.767319935","53.85343018 > 6","73.898773273","61.642604011","47.850011457",na,"31.64033517","95.021059091 > ","9113.8072046","9173.8190347","10576.40502","22078.709149","9988.2503259","1 > 9.331539711","65.924922543","13.104642709","26.200404435",na > > If I drop the column 'isocode' from the table and from the csv file, I > receive the following error: > ERROR: missing data for column "year" > CONTEXT: COPY pwt61_test, line 1: > > Here the first line of the csv file is: > "Australia","AUS","1950","8434.0650803","0.8931999704","0.515977685","1836.330 > 3946","60.365553134","27.690222887","7.9941116957","57.767319935","53.85343018 > 6","73.898773273","61.642604011","47.850011457",na,"31.64033517","95.021059091 > ","9113.8072046","9173.8190347","10576.40502","22078.709149","9988.2503259","1 > 9.331539711","65.924922543","13.104642709","26.200404435",na > > Notice the error has remained in the second column and it doesn't > matter what value it holds or what the data type is. > > I am at a complete loss. Thank you again for your help. What is the exact command that you are using to import the data? Sean
"E. Matthew Finnin" <emf.storage@gmail.com> writes: > This is the error I get after using the copy command: > ERROR: missing data for column "isocode" > CONTEXT: COPY pwt61_oecd, line 1: > This is the first line of the csv file, where the error is occuring: > "Australia","AUS","1950","8434.0650803","0.8931999704","0.515977685","1836.3303946","60.365553134","27.690222887","7.9941116957","57.767319935","53.853430186","73.898773273","61.642604011","47.850011457",na,"31.64033517","95.021059091","9113.8072046","9173.8190347","10576.40502","22078.709149","9988.2503259","19.331539711","65.924922543","13.104642709","26.200404435",na Is that really exactly what the error message looks like? Because what it should be echoing back after "line 1:" is the first line of the file. Here's a trivial example using data entered from the keyboard: regression=# create table t1 (f1 text,f2 text); CREATE TABLE regression=# copy t1 from stdin csv; Enter data to be copied followed by a newline. End with a backslash and a period on a line by itself. >> "A","B" >> \. regression=# copy t1 from stdin csv; Enter data to be copied followed by a newline. End with a backslash and a period on a line by itself. >> "A" >> \. ERROR: missing data for column "f2" CONTEXT: COPY t1, line 1: ""A"" regression=# copy t1 from stdin csv; Enter data to be copied followed by a newline. End with a backslash and a period on a line by itself. >> >> \. ERROR: missing data for column "f2" CONTEXT: COPY t1, line 1: "" regression=# I'm wondering if maybe you're making some simple mistake that's got nothing to do with the file format, but rather you're pointing COPY at the wrong file or some such. The behavior seems consistent with the idea that COPY is seeing an empty line as the first line of data. regards, tom lane
Hello Matthew, I've noticed that the numeric fields are quoted with ". Try it after you removed the " from the numeric fields. Is it now working? With kind regards Florian Reiser -- http://www.ra-bc.de RA Unternehmensberatung F�hren durch pr�zise Daten ""E. Matthew Finnin"" <emf.storage@gmail.com> schrieb im Newsbeitrag news:c939097c0604180006r5bead672t199985c7b8858f71@mail.gmail.com... > Currently I have removed all double quotations from my null values > ("na" has been changed to na). I have also removed the header line of > the csv file I am trying to copy into the table. I have tried various > forms of null values and everything else suggested to me, with no > luck. > > Here are the table definitions. I was trying to make it as simple as > possible: > emf=> \d pwt61_oecd > Table "public.pwt61_oecd" > Column | Type | Modifiers > ---------+---------+----------- > country | text | > isocode | text | > year | numeric | > pop | numeric | > xrat | numeric | > ppp | numeric | > cgdp | numeric | > cc | numeric | > ci | numeric | > cg | numeric | > p | numeric | > pc | numeric | > pg | numeric | > pi | numeric | > openc | numeric | > cgnp | numeric | > csave | numeric | > y | numeric | > rgdpl | numeric | > rgdpch | numeric | > rgdpeqa | numeric | > rgdpwok | numeric | > rgdptt | numeric | > openk | numeric | > kc | numeric | > kg | numeric | > ki | numeric | > grgdpch | numeric | > > This is the error I get after using the copy command: > ERROR: missing data for column "isocode" > CONTEXT: COPY pwt61_oecd, line 1: > > This is the first line of the csv file, where the error is occuring: > "Australia","AUS","1950","8434.0650803","0.8931999704","0.515977685","1836.3303946","60.365553134","27.690222887","7.9941116957","57.767319935","53.853430186","73.898773273","61.642604011","47.850011457",na,"31.64033517","95.021059091","9113.8072046","9173.8190347","10576.40502","22078.709149","9988.2503259","19.331539711","65.924922543","13.104642709","26.200404435",na > > If I drop the column 'isocode' from the table and from the csv file, I > receive the following error: > ERROR: missing data for column "year" > CONTEXT: COPY pwt61_test, line 1: > > Here the first line of the csv file is: > "Australia","AUS","1950","8434.0650803","0.8931999704","0.515977685","1836.3303946","60.365553134","27.690222887","7.9941116957","57.767319935","53.853430186","73.898773273","61.642604011","47.850011457",na,"31.64033517","95.021059091","9113.8072046","9173.8190347","10576.40502","22078.709149","9988.2503259","19.331539711","65.924922543","13.104642709","26.200404435",na > > Notice the error has remained in the second column and it doesn't > matter what value it holds or what the data type is. > > I am at a complete loss. Thank you again for your help. > > -Eric > > > > > On 4/14/06, Bruce Momjian <pgman@candle.pha.pa.us> wrote: >> E. Matthew Finnin wrote: >> > Thank you both for replying. When I wrote ' " ' I meant '"', I was >> > trying to make it easier to read for an email, but all I did was >> > confuse things. Anyway, I tried dropping the quote and delimiter >> > comments from the command, but I still received a missing data error >> > on line 1 of the csv file. The data I'm using is all OECD countries >> > selected from the Penn World Tables 6.1 website >> > (http://pwt.econ.upenn.edu/php_site/pwt61_form.php) and cutting and >> > pasting the output into a csv file. >> > >> > Bruce you mentioned there is a cvs backpatch for version 8.1.x. Maybe >> > this is my problem. How do I check if this is installed and, assuming >> > it isn't, how do I go about installing it? I've come across scripts >> > in the mail lists related to something like this, but I wasn't sure if >> > they were final or even for anyone other than developers. If its of >> > any interest to anyone, I installed Postgresql 8.1.x as an Kubuntu >> > package. >> >> OK, got it. If you remove the first line, does it work? Also, those >> "n/a" are not going to work because we don't allow the NULL indicator to >> be in double quotes. Try change "n/a" to n/a and see if that helps. >> Also try removing the first line to see if that helps. Also, show use >> the table definition you are trying to load into, and the error message, >> and a line generating an error. >> >> I don't think you need anything backpatched. >> >> -- >> Bruce Momjian http://candle.pha.pa.us >> EnterpriseDB http://www.enterprisedb.com >> >> + If your life is a hard drive, Christ can be your backup. + >> > > ---------------------------(end of broadcast)--------------------------- > TIP 6: explain analyze is your friend >
Thank you to everyone who replied. After removing double quotes from all NULL values and switching to my superuser, I could copy just fine. Apparently my error was my use of stdin as a regular user. I knew I couldn't use COPY pwt61_test FROM '/home/emf/pwt61_test' # (that is, FROM 'file') as a regular user and from what I read I thought stdin was to be used when you are not a superuser, as in: COPY pwt61_test FROM stdin WITH DELIMITER ',' CSV QUOTE AS '"' NULL AS 'na'; Enter data to be copied followed by a newline. End with a backslash and a period on a line by itself. >> /home/emf/pwt61_test.csv >> \. ERROR: missing data for column "isocode" CONTEXT: COPY pwt61_oecd, line 1: "/home/emf/pwt61_test.csv" It wasn't until I saw Tom Lane's post that I realized that I was using stdin wrong. Now its obvious why I was getting the error I was getting. It was expecting to see data, not a file. Apparently, you can't use the COPY command to copy a whole file without being a super user? I didn't understand this from the errors I received. I apologize to everyone. I didn't mean to waste your time. -Eric On 4/18/06, Florian Reiser <florian.reiser@ra-bc.de> wrote: > Hello Matthew, > > I've noticed that the numeric fields are quoted with ". > Try it after you removed the " from the numeric fields. > > Is it now working? > > With kind regards > > Florian Reiser > > -- > http://www.ra-bc.de > RA Unternehmensberatung > Führen durch präzise Daten > > ""E. Matthew Finnin"" <emf.storage@gmail.com> schrieb im Newsbeitrag > news:c939097c0604180006r5bead672t199985c7b8858f71@mail.gmail.com... > > Currently I have removed all double quotations from my null values > > ("na" has been changed to na). I have also removed the header line of > > the csv file I am trying to copy into the table. I have tried various > > forms of null values and everything else suggested to me, with no > > luck. > > > > Here are the table definitions. I was trying to make it as simple as > > possible: > > emf=> \d pwt61_oecd > > Table "public.pwt61_oecd" > > Column | Type | Modifiers > > ---------+---------+----------- > > country | text | > > isocode | text | > > year | numeric | > > pop | numeric | > > xrat | numeric | > > ppp | numeric | > > cgdp | numeric | > > cc | numeric | > > ci | numeric | > > cg | numeric | > > p | numeric | > > pc | numeric | > > pg | numeric | > > pi | numeric | > > openc | numeric | > > cgnp | numeric | > > csave | numeric | > > y | numeric | > > rgdpl | numeric | > > rgdpch | numeric | > > rgdpeqa | numeric | > > rgdpwok | numeric | > > rgdptt | numeric | > > openk | numeric | > > kc | numeric | > > kg | numeric | > > ki | numeric | > > grgdpch | numeric | > > > > This is the error I get after using the copy command: > > ERROR: missing data for column "isocode" > > CONTEXT: COPY pwt61_oecd, line 1: > > > > This is the first line of the csv file, where the error is occuring: > > "Australia","AUS","1950","8434.0650803","0.8931999704","0.515977685","1836.3303946","60.365553134","27.690222887","7.9941116957","57.767319935","53.853430186","73.898773273","61.642604011","47.850011457",na,"31.64033517","95.021059091","9113.8072046","9173.8190347","10576.40502","22078.709149","9988.2503259","19.331539711","65.924922543","13.104642709","26.200404435",na > > > > If I drop the column 'isocode' from the table and from the csv file, I > > receive the following error: > > ERROR: missing data for column "year" > > CONTEXT: COPY pwt61_test, line 1: > > > > Here the first line of the csv file is: > > "Australia","AUS","1950","8434.0650803","0.8931999704","0.515977685","1836.3303946","60.365553134","27.690222887","7.9941116957","57.767319935","53.853430186","73.898773273","61.642604011","47.850011457",na,"31.64033517","95.021059091","9113.8072046","9173.8190347","10576.40502","22078.709149","9988.2503259","19.331539711","65.924922543","13.104642709","26.200404435",na > > > > Notice the error has remained in the second column and it doesn't > > matter what value it holds or what the data type is. > > > > I am at a complete loss. Thank you again for your help. > > > > -Eric > > > > > > > > > > On 4/14/06, Bruce Momjian <pgman@candle.pha.pa.us> wrote: > >> E. Matthew Finnin wrote: > >> > Thank you both for replying. When I wrote ' " ' I meant '"', I was > >> > trying to make it easier to read for an email, but all I did was > >> > confuse things. Anyway, I tried dropping the quote and delimiter > >> > comments from the command, but I still received a missing data error > >> > on line 1 of the csv file. The data I'm using is all OECD countries > >> > selected from the Penn World Tables 6.1 website > >> > (http://pwt.econ.upenn.edu/php_site/pwt61_form.php) and cutting and > >> > pasting the output into a csv file. > >> > > >> > Bruce you mentioned there is a cvs backpatch for version 8.1.x. Maybe > >> > this is my problem. How do I check if this is installed and, assuming > >> > it isn't, how do I go about installing it? I've come across scripts > >> > in the mail lists related to something like this, but I wasn't sure if > >> > they were final or even for anyone other than developers. If its of > >> > any interest to anyone, I installed Postgresql 8.1.x as an Kubuntu > >> > package. > >> > >> OK, got it. If you remove the first line, does it work? Also, those > >> "n/a" are not going to work because we don't allow the NULL indicator to > >> be in double quotes. Try change "n/a" to n/a and see if that helps. > >> Also try removing the first line to see if that helps. Also, show use > >> the table definition you are trying to load into, and the error message, > >> and a line generating an error. > >> > >> I don't think you need anything backpatched. > >> > >> -- > >> Bruce Momjian http://candle.pha.pa.us > >> EnterpriseDB http://www.enterprisedb.com > >> > >> + If your life is a hard drive, Christ can be your backup. + > >> > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 6: explain analyze is your friend > > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org >
"Eric Matthew Finnin" <emf.storage@gmail.com> writes: > Apparently, you can't use the COPY command to copy a whole file > without being a super user? You can, but not like that. The point of the restriction is that the database may be running on a different machine from where the user is, and non-superusers shouldn't have access to the filesystem that the database can see. "COPY FROM 'filename'" is a server-side operation and reads the server's filesystem, so it's allowed only to superusers. However, psql (and probably most other client-side tools) have a way to read from the client's filesystem and shove the data across the network to a COPY FROM STDIN command. In psql what you want is to use the \copy command --- see the psql reference page for details. regards, tom lane