Thread: Importing data from csv
Hi Folks,<br /><br />sorry if this is a duplicate post, i've been tryin to find a solution of importing data into postgresfrom a csv file. The problem is, I have a database which consists of columns which contain newline characters (macand unix). now when i export these files to a csv format, there are some line breaks (mixed unix and mac) in the datawhich breaks the copy procedure. <br /><br />I also tried using the script posted in one of the previous posts..<br clear="all"/><br />#! /usr/bin/perl<br />$inquotes = 0;<br />while (<>){<br /> # Chop the crlf<br /> chop ($_);<br/> chop ($_);<br /><br /> # this first bit goes through and replaces <br /> # all the commas that renot in quotes with tildes<br /> for ($i=0 ; $i < length($_) ; $i++){<br /> $char=substr($_,$i,1);<br /> if ($char eq '"' ){<br /> $inquotes = not($inquotes); <br /> }else{<br /> if ((!$inquotes) && ($char eq ",") ){<br /> substr($_,$i,1)="~";<br /> }<br /> }<br/> }<br /> # this replaces any quotes<br /> s/"//g;<br /> print "$_\n";<br />}<br /><br /><br />catdata_file | perl scriptname.pl > outputfile.dat<br /><br />and when i run the copy command i get messages like datamissing for xyz column.<br />any possible hints....... <br /><br />--<br />Thanks,<br />Sumeet
<div class="Section1"><p class="MsoNormal"><font color="navy" face="Arial" size="2"><span style="font-size: 10.0pt;font-family:Arial;color:navy">I recently did this by parsing the data through a VB program that appended a “\” infront of any Char(10) and/or Char(13) characters which tells Postgres to accept the next character as a literal part ofthe column value I believe – must do because it worked! I also quoted the whole column as part of the VB prog…</span></font><pclass="MsoNormal"><font color="navy" face="Arial" size="2"><span style="font-size: 10.0pt;font-family:Arial;color:navy"> </span></font><p class="MsoNormal"><font color="navy" face="Arial" size="2"><span style="font-size: 10.0pt;font-family:Arial;color:navy">Worked for me but I’m not sure the exact science behind it so someone else might beable to be of some more detailed help.</span></font><p class="MsoNormal"><font color="navy" face="Arial" size="2"><spanstyle="font-size: 10.0pt;font-family:Arial;color:navy"> </span></font><p class="MsoNormal"><font color="navy" face="Arial" size="2"><span style="font-size: 10.0pt;font-family:Arial;color:navy">Cheers,</span></font><p class="MsoNormal"><font color="navy" face="Arial" size="2"><spanstyle="font-size: 10.0pt;font-family:Arial;color:navy">-p</span></font><p class="MsoNormal"><font color="navy" face="Arial" size="2"><spanstyle="font-size: 10.0pt;font-family:Arial;color:navy"> </span></font><p class="MsoNormal" style="margin-left:36.0pt"><font face="Tahoma" size="2"><spanlang="EN-US" style="font-size:10.0pt;font-family:Tahoma">-----Original Message-----<br /><b><span style="font-weight:bold">From:</span></b>pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-owner@postgresql.org] <b><spanstyle="font-weight:bold">On Behalf Of </span></b>Sumeet<br /><b><span style="font-weight:bold">Sent:</span></b> </span></font><fontface="Tahoma" size="2"><span lang="EN-US" style="font-size:10.0pt;font-family:Tahoma">Friday, 25 August2006</span></font><font face="Tahoma" size="2"><span lang="EN-US" style="font-size:10.0pt;font-family:Tahoma"> </span></font><fontface="Tahoma" size="2"><span lang="EN-US" style="font-size:10.0pt;font-family:Tahoma">00:48</span></font><fontface="Tahoma" size="2"><span lang="EN-US" style="font-size:10.0pt;font-family:Tahoma"><br/><b><span style="font-weight:bold">To:</span></b> pgsql-sql@postgresql.org<br/><b><span style="font-weight:bold">Subject:</span></b> [SQL] Importing data from csv</span></font><pclass="MsoNormal" style="margin-left:36.0pt"><font face="Times New Roman" size="3"><span style="font-size:12.0pt"> </span></font><pclass="MsoNormal" style="margin-left:36.0pt"><font face="Times New Roman" size="3"><spanstyle="font-size:12.0pt">Hi Folks,<br /><br /> sorry if this is a duplicate post, i've been tryin to find asolution of importing data into postgres from a csv file. The problem is, I have a database which consists of columns whichcontain newline characters (mac and unix). now when i export these files to a csv format, there are some line breaks(mixed unix and mac) in the data which breaks the copy procedure. <br /><br /> I also tried using the script postedin one of the previous posts..<br clear="all" /><br /> #! /usr/bin/perl<br /> $inquotes = 0;<br /> while (<>){<br/> # Chop the crlf<br /> chop ($_);<br /> chop ($_);<br /><br /> # this first bit goesthrough and replaces <br /> # all the commas that re not in quotes with tildes<br /> for ($i=0 ; $i < length($_); $i++){<br /> $char=substr($_,$i,1);<br /> if ($char eq '"' ){<br /> $inquotes= not($inquotes); <br /> }else{<br /> if ( (!$inquotes) && ($char eq ",") ){<br /> substr($_,$i,1)="~";<br /> }<br /> }<br /> }<br /> # this replaces anyquotes<br /> s/"//g;<br /> print "$_\n";<br /> }<br /><br /><br /> cat data_file | perl scriptname.pl > outputfile.dat<br/><br /> and when i run the copy command i get messages like data missing for xyz column.<br /> any possiblehints....... <br /><br /> --<br /> Thanks,<br /> Sumeet </span></font></div><br /><p><b>*******************Confidentialityand Privilege Notice*******************</b><p> The material contained in this messageis privileged and confidential to the addressee. If you are not the addressee indicated in this message or responsiblefor delivery of the message to such person, you may not copy or deliver this message to anyone, and you shoulddestroy it and kindly notify the sender by reply email. <p> Information in this message that does not relate to theofficial business of Weatherbeeta must be treated as neither given nor endorsed by Weatherbeeta. Weatherbeeta, its employees,contractors or associates shall not be liable for direct, indirect or consequential loss arising from transmissionof this message or any attachments <br />
A newline in CSV parlance denotes the end of a record....unless that newline is contained with quotes...<br /><br /> PhillipSmith wrote: <blockquote cite="mid004a01c6c7d3$60de8320$9b0014ac@ITPhil" type="cite"><style> <!--/* Font Definitions */@font-face{font-family:Tahoma;panose-1:2 11 6 4 3 5 4 4 2 4;}/* Style Definitions */p.MsoNormal,li.MsoNormal, div.MsoNormal{margin:0cm;margin-bottom:.0001pt;font-size:12.0pt;font-family:"Times New Roman";} a:link, span.MsoHyperlink{color:blue;text-decoration:underline;} a:visited, span.MsoHyperlinkFollowed{color:purple;text-decoration:underline;} span.EmailStyle17{font-family:Arial;color:navy;} @page Section1{size:595.3pt 841.9pt;margin:72.0pt 90.0pt 72.0pt 90.0pt;} div.Section1{page:Section1;} --> </style><div class="Section1"><p class="MsoNormal"><font color="navy" face="Arial" size="2"><span style="font-size: 10pt;font-family: Arial; color: navy;">I recently did this by parsing the data through a VB program that appended a “\” infront of any Char(10) and/or Char(13) characters which tells Postgres to accept the next character as a literal part ofthe column value I believe – must do because it worked! I also quoted the whole column as part of the VB prog…</span></font><pclass="MsoNormal"><font color="navy" face="Arial" size="2"><span style="font-size: 10pt; font-family:Arial; color: navy;"> </span></font><p class="MsoNormal"><font color="navy" face="Arial" size="2"><span style="font-size:10pt; font-family: Arial; color: navy;">Worked for me but I’m not sure the exact science behind it so someoneelse might be able to be of some more detailed help.</span></font><p class="MsoNormal"><font color="navy" face="Arial"size="2"><span style="font-size: 10pt; font-family: Arial; color: navy;"> </span></font><p class="MsoNormal"><fontcolor="navy" face="Arial" size="2"><span style="font-size: 10pt; font-family: Arial; color: navy;">Cheers,</span></font><pclass="MsoNormal"><font color="navy" face="Arial" size="2"><span style="font-size: 10pt; font-family:Arial; color: navy;">-p</span></font><p class="MsoNormal"><font color="navy" face="Arial" size="2"><span style="font-size:10pt; font-family: Arial; color: navy;"> </span></font><p class="MsoNormal" style="margin-left: 36pt;"><fontface="Tahoma" size="2"><span lang="EN-US" style="font-size: 10pt; font-family: Tahoma;">-----Original Message-----<br/><b><span style="font-weight: bold;">From:</span></b> <a class="moz-txt-link-abbreviated" href="mailto:pgsql-sql-owner@postgresql.org">pgsql-sql-owner@postgresql.org</a>[<a class="moz-txt-link-freetext" href="mailto:pgsql-sql-owner@postgresql.org">mailto:pgsql-sql-owner@postgresql.org</a>]<b><span style="font-weight: bold;">OnBehalf Of </span></b>Sumeet<br /><b><span style="font-weight: bold;">Sent:</span></b> </span></font><font face="Tahoma"size="2"><span lang="EN-US" style="font-size: 10pt; font-family: Tahoma;">Friday, 25 August 2006</span></font><fontface="Tahoma" size="2"><span lang="EN-US" style="font-size: 10pt; font-family: Tahoma;"> </span></font><fontface="Tahoma" size="2"><span lang="EN-US" style="font-size: 10pt; font-family: Tahoma;">00:48</span></font><fontface="Tahoma" size="2"><span lang="EN-US" style="font-size: 10pt; font-family: Tahoma;"><br/><b><span style="font-weight: bold;">To:</span></b> <a class="moz-txt-link-abbreviated" href="mailto:pgsql-sql@postgresql.org">pgsql-sql@postgresql.org</a><br/><b><span style="font-weight: bold;">Subject:</span></b>[SQL] Importing data from csv</span></font><p class="MsoNormal" style="margin-left: 36pt;"><fontface="Times New Roman" size="3"><span style="font-size: 12pt;"> </span></font><p class="MsoNormal" style="margin-left:36pt;"><font face="Times New Roman" size="3"><span style="font-size: 12pt;">Hi Folks,<br /><br /> sorryif this is a duplicate post, i've been tryin to find a solution of importing data into postgres from a csv file. Theproblem is, I have a database which consists of columns which contain newline characters (mac and unix). now when i exportthese files to a csv format, there are some line breaks (mixed unix and mac) in the data which breaks the copy procedure.<br /><br /> I also tried using the script posted in one of the previous posts..<br clear="all" /><br /> #! /usr/bin/perl<br/> $inquotes = 0;<br /> while (<>){<br /> # Chop the crlf<br /> chop ($_);<br /> chop($_);<br /><br /> # this first bit goes through and replaces <br /> # all the commas that re not in quoteswith tildes<br /> for ($i=0 ; $i < length($_) ; $i++){<br /> $char=substr($_,$i,1);<br /> if ($char eq '"' ){<br /> $inquotes = not($inquotes); <br /> }else{<br /> if ((!$inquotes) && ($char eq ",") ){<br /> substr($_,$i,1)="~";<br /> }<br /> }<br /> }<br /> # this replaces any quotes<br /> s/"//g;<br /> print "$_\n";<br /> }<br /><br/><br /> cat data_file | perl scriptname.pl > outputfile.dat<br /><br /> and when i run the copy command i get messageslike data missing for xyz column.<br /> any possible hints....... <br /><br /> --<br /> Thanks,<br /> Sumeet </span></font></div><br/><p><b>*******************Confidentiality and Privilege Notice*******************</b><p>The materialcontained in this message is privileged and confidential to the addressee. If you are not the addressee indicatedin this message or responsible for delivery of the message to such person, you may not copy or deliver this messageto anyone, and you should destroy it and kindly notify the sender by reply email. <p>Information in this message thatdoes not relate to the official business of Weatherbeeta must be treated as neither given nor endorsed by Weatherbeeta.Weatherbeeta, its employees, contractors or associates shall not be liable for direct, indirect or consequentialloss arising from transmission of this message or any attachments <br /></blockquote><br /><pre class="moz-signature"cols="72">-- Scot P. Floess 27 Lake Royale Louisburg, NC 27549 252-478-8087 (Home) 919-754-4592 (Work) Chief Architect JPlate <a class="moz-txt-link-freetext" href="http://sourceforge.net/projects/jplate">http://sourceforge.net/projects/jplate</a> Chief Architect JavaPIM <a class="moz-txt-link-freetext" href="http://sourceforge.net/projects/javapim">http://sourceforge.net/projects/javapim</a> </pre>
And if its contained with quotes...its considered a field<br /><br /> Scot P. Floess wrote: <blockquote cite="mid44EE3CD9.5000300@mindspring.com"type="cite"> A newline in CSV parlance denotes the end of a record....unless thatnewline is contained with quotes...<br /><br /> Phillip Smith wrote: <blockquote cite="mid004a01c6c7d3$60de8320$9b0014ac@ITPhil"type="cite"><style> <!--/* Font Definitions */@font-face{font-family:Tahoma;panose-1:2 11 6 4 3 5 4 4 2 4;}/* Style Definitions */p.MsoNormal,li.MsoNormal, div.MsoNormal{margin:0cm;margin-bottom:.0001pt;font-size:12.0pt;font-family:"Times New Roman";} a:link, span.MsoHyperlink{color:blue;text-decoration:underline;} a:visited, span.MsoHyperlinkFollowed{color:purple;text-decoration:underline;} span.EmailStyle17{font-family:Arial;color:navy;} @page Section1{size:595.3pt 841.9pt;margin:72.0pt 90.0pt 72.0pt 90.0pt;} div.Section1{page:Section1;} --> </style><div class="Section1"><p class="MsoNormal"><font color="navy" face="Arial" size="2"><span style="font-size: 10pt;font-family: Arial; color: navy;">I recently did this by parsing the data through a VB program that appended a “\” infront of any Char(10) and/or Char(13) characters which tells Postgres to accept the next character as a literal part ofthe column value I believe – must do because it worked! I also quoted the whole column as part of the VB prog…</span></font><pclass="MsoNormal"><font color="navy" face="Arial" size="2"><span style="font-size: 10pt; font-family:Arial; color: navy;"> </span></font><p class="MsoNormal"><font color="navy" face="Arial" size="2"><span style="font-size:10pt; font-family: Arial; color: navy;">Worked for me but I’m not sure the exact science behind it so someoneelse might be able to be of some more detailed help.</span></font><p class="MsoNormal"><font color="navy" face="Arial"size="2"><span style="font-size: 10pt; font-family: Arial; color: navy;"> </span></font><p class="MsoNormal"><fontcolor="navy" face="Arial" size="2"><span style="font-size: 10pt; font-family: Arial; color: navy;">Cheers,</span></font><pclass="MsoNormal"><font color="navy" face="Arial" size="2"><span style="font-size: 10pt; font-family:Arial; color: navy;">-p</span></font><p class="MsoNormal"><font color="navy" face="Arial" size="2"><span style="font-size:10pt; font-family: Arial; color: navy;"> </span></font><p class="MsoNormal" style="margin-left: 36pt;"><fontface="Tahoma" size="2"><span lang="EN-US" style="font-size: 10pt; font-family: Tahoma;">-----Original Message-----<br/><b><span style="font-weight: bold;">From:</span></b> <a class="moz-txt-link-abbreviated" href="mailto:pgsql-sql-owner@postgresql.org">pgsql-sql-owner@postgresql.org</a>[<a class="moz-txt-link-freetext" href="mailto:pgsql-sql-owner@postgresql.org">mailto:pgsql-sql-owner@postgresql.org</a>]<b><span style="font-weight: bold;">OnBehalf Of </span></b>Sumeet<br /><b><span style="font-weight: bold;">Sent:</span></b> </span></font><font face="Tahoma"size="2"><span lang="EN-US" style="font-size: 10pt; font-family: Tahoma;">Friday, 25 August 2006</span></font><fontface="Tahoma" size="2"><span lang="EN-US" style="font-size: 10pt; font-family: Tahoma;"> </span></font><fontface="Tahoma" size="2"><span lang="EN-US" style="font-size: 10pt; font-family: Tahoma;">00:48</span></font><fontface="Tahoma" size="2"><span lang="EN-US" style="font-size: 10pt; font-family: Tahoma;"><br/><b><span style="font-weight: bold;">To:</span></b> <a class="moz-txt-link-abbreviated" href="mailto:pgsql-sql@postgresql.org">pgsql-sql@postgresql.org</a><br/><b><span style="font-weight: bold;">Subject:</span></b>[SQL] Importing data from csv</span></font><p class="MsoNormal" style="margin-left: 36pt;"><fontface="Times New Roman" size="3"><span style="font-size: 12pt;"> </span></font><p class="MsoNormal" style="margin-left:36pt;"><font face="Times New Roman" size="3"><span style="font-size: 12pt;">Hi Folks,<br /><br /> sorryif this is a duplicate post, i've been tryin to find a solution of importing data into postgres from a csv file. Theproblem is, I have a database which consists of columns which contain newline characters (mac and unix). now when i exportthese files to a csv format, there are some line breaks (mixed unix and mac) in the data which breaks the copy procedure.<br /><br /> I also tried using the script posted in one of the previous posts..<br clear="all" /><br /> #! /usr/bin/perl<br/> $inquotes = 0;<br /> while (<>){<br /> # Chop the crlf<br /> chop ($_);<br /> chop($_);<br /><br /> # this first bit goes through and replaces <br /> # all the commas that re not in quoteswith tildes<br /> for ($i=0 ; $i < length($_) ; $i++){<br /> $char=substr($_,$i,1);<br /> if ($char eq '"' ){<br /> $inquotes = not($inquotes); <br /> }else{<br /> if ((!$inquotes) && ($char eq ",") ){<br /> substr($_,$i,1)="~";<br /> }<br /> }<br /> }<br /> # this replaces any quotes<br /> s/"//g;<br /> print "$_\n";<br /> }<br /><br/><br /> cat data_file | perl scriptname.pl > outputfile.dat<br /><br /> and when i run the copy command i get messageslike data missing for xyz column.<br /> any possible hints....... <br /><br /> --<br /> Thanks,<br /> Sumeet </span></font></div><br/><p><b>*******************Confidentiality and Privilege Notice*******************</b><p>The materialcontained in this message is privileged and confidential to the addressee. If you are not the addressee indicatedin this message or responsible for delivery of the message to such person, you may not copy or deliver this messageto anyone, and you should destroy it and kindly notify the sender by reply email. <p>Information in this message thatdoes not relate to the official business of Weatherbeeta must be treated as neither given nor endorsed by Weatherbeeta.Weatherbeeta, its employees, contractors or associates shall not be liable for direct, indirect or consequentialloss arising from transmission of this message or any attachments <br /></blockquote><br /><pre class="moz-signature"cols="72">-- Scot P. Floess 27 Lake Royale Louisburg, NC 27549 252-478-8087 (Home) 919-754-4592 (Work) Chief Architect JPlate <a class="moz-txt-link-freetext" href="http://sourceforge.net/projects/jplate">http://sourceforge.net/projects/jplate</a> Chief Architect JavaPIM <a class="moz-txt-link-freetext" href="http://sourceforge.net/projects/javapim">http://sourceforge.net/projects/javapim</a></pre></blockquote><br /><pre class="moz-signature"cols="72">-- Scot P. Floess 27 Lake Royale Louisburg, NC 27549 252-478-8087 (Home) 919-754-4592 (Work) Chief Architect JPlate <a class="moz-txt-link-freetext" href="http://sourceforge.net/projects/jplate">http://sourceforge.net/projects/jplate</a> Chief Architect JavaPIM <a class="moz-txt-link-freetext" href="http://sourceforge.net/projects/javapim">http://sourceforge.net/projects/javapim</a> </pre>
<div class="Section1"><p class="MsoNormal"><font color="navy" face="Arial" size="2"><span style="font-size: 10.0pt;font-family:Arial;color:navy">There you go – it was the quotes that did it, not the back-slashes. I knew someone elsewould shed some better light! </span></font><font color="navy" face="Wingdings" size="2"><span style="font-size:10.0pt;font-family:Wingdings;color:navy">J</span></font><pclass="MsoNormal"><font color="navy" face="Arial"size="2"><span style="font-size: 10.0pt;font-family:Arial;color:navy"> </span></font><p class="MsoNormal"><font color="navy" face="Arial" size="2"><span style="font-size: 10.0pt;font-family:Arial;color:navy">Cheers,</span></font><p class="MsoNormal"><font color="navy" face="Arial" size="2"><spanstyle="font-size: 10.0pt;font-family:Arial;color:navy">-p</span></font><p class="MsoNormal"><font color="navy" face="Arial" size="2"><spanstyle="font-size: 10.0pt;font-family:Arial;color:navy"> </span></font><p class="MsoNormal" style="margin-left:36.0pt"><font color="black" face="Tahoma"size="2"><span lang="EN-US" style="font-size:10.0pt;font-family:Tahoma; color:windowtext">-----Original Message-----<br /><b><span style="font-weight:bold">From:</span></b> pgsql-sql-owner@postgresql.org[mailto:pgsql-sql-owner@postgresql.org] <b><span style="font-weight:bold">On Behalf Of </span></b>ScotP. Floess<br /><b><span style="font-weight:bold">Sent:</span></b> Friday, 25 August 2006 10:00<br /><b><spanstyle="font-weight:bold">To:</span></b> floess@mindspring.com<br /><b><span style="font-weight:bold">Cc:</span></b>Phillip Smith; pgsql-sql@postgresql.org<br /><b><span style="font-weight:bold">Subject:</span></b>Re: [SQL] Importing data from csv</span></font><p class="MsoNormal" style="margin-left:36.0pt"><fontcolor="black" face="Times New Roman" size="3"><span style="font-size:12.0pt"> </span></font><pclass="MsoNormal" style="margin-left:36.0pt"><font color="black" face="Times NewRoman" size="3"><span style="font-size:12.0pt">And if its contained with quotes...its considered a field<br /><br /> ScotP. Floess wrote: </span></font><p class="MsoNormal" style="margin-left:36.0pt"><font color="black" face="Times New Roman"size="3"><span style="font-size:12.0pt">A newline in CSV parlance denotes the end of a record....unless that newlineis contained with quotes...<br /><br /> Phillip Smith wrote: </span></font><p class="MsoNormal" style="margin-left:36.0pt"><fontcolor="navy" face="Arial" size="2"><span style="font-size:10.0pt;font-family:Arial;color:navy">Irecently did this by parsing the data through a VB program that appendeda “\” in front of any Char(10) and/or Char(13) characters which tells Postgres to accept the next character as aliteral part of the column value I believe – must do because it worked! I also quoted the whole column as part of the VBprog…</span></font><p class="MsoNormal" style="margin-left:36.0pt"><font color="navy" face="Arial" size="2"><span style="font-size:10.0pt;font-family:Arial;color:navy"> </span></font><pclass="MsoNormal" style="margin-left:36.0pt"><fontcolor="navy" face="Arial" size="2"><span style="font-size:10.0pt;font-family:Arial;color:navy">Workedfor me but I’m not sure the exact science behind it so someoneelse might be able to be of some more detailed help.</span></font><p class="MsoNormal" style="margin-left:36.0pt"><fontcolor="navy" face="Arial" size="2"><span style="font-size:10.0pt;font-family:Arial;color:navy"> </span></font><pclass="MsoNormal" style="margin-left:36.0pt"><fontcolor="navy" face="Arial" size="2"><span style="font-size:10.0pt;font-family:Arial;color:navy">Cheers,</span></font><pclass="MsoNormal" style="margin-left:36.0pt"><fontcolor="navy" face="Arial" size="2"><span style="font-size:10.0pt;font-family:Arial;color:navy">-p</span></font><pclass="MsoNormal" style="margin-left:36.0pt"><fontcolor="navy" face="Arial" size="2"><span style="font-size:10.0pt;font-family:Arial;color:navy"> </span></font><pclass="MsoNormal" style="margin-left:72.0pt"><fontcolor="black" face="Tahoma" size="2"><span lang="EN-US" style="font-size:10.0pt;font-family:Tahoma">-----OriginalMessage-----<br /><b><span style="font-weight:bold">From:</span></b><a href="mailto:pgsql-sql-owner@postgresql.org">pgsql-sql-owner@postgresql.org</a>[<a href="mailto:pgsql-sql-owner@postgresql.org">mailto:pgsql-sql-owner@postgresql.org</a>]<b><span style="font-weight:bold">OnBehalf Of </span></b>Sumeet<br /><b><span style="font-weight:bold">Sent:</span></b> Friday, 25August 2006 00:48<br /><b><span style="font-weight:bold">To:</span></b> <a href="mailto:pgsql-sql@postgresql.org">pgsql-sql@postgresql.org</a><br/><b><span style="font-weight:bold">Subject:</span></b>[SQL] Importing data from csv</span></font><p class="MsoNormal" style="margin-left:72.0pt"><fontcolor="black" face="Times New Roman" size="3"><span style="font-size:12.0pt"> </span></font><pclass="MsoNormal" style="margin-left:72.0pt"><font color="black" face="Times NewRoman" size="3"><span style="font-size:12.0pt">Hi Folks,<br /><br /> sorry if this is a duplicate post, i've been tryinto find a solution of importing data into postgres from a csv file. The problem is, I have a database which consistsof columns which contain newline characters (mac and unix). now when i export these files to a csv format, thereare some line breaks (mixed unix and mac) in the data which breaks the copy procedure. <br /><br /> I also tried usingthe script posted in one of the previous posts..<br clear="all" /><br /> #! /usr/bin/perl<br /> $inquotes = 0;<br />while (<>){<br /> # Chop the crlf<br /> chop ($_);<br /> chop ($_);<br /><br /> # this firstbit goes through and replaces <br /> # all the commas that re not in quotes with tildes<br /> for ($i=0 ;$i < length($_) ; $i++){<br /> $char=substr($_,$i,1);<br /> if ($char eq '"' ){<br /> $inquotes = not($inquotes); <br /> }else{<br /> if ( (!$inquotes) && ($char eq",") ){<br /> substr($_,$i,1)="~";<br /> }<br /> }<br /> }<br /> # thisreplaces any quotes<br /> s/"//g;<br /> print "$_\n";<br /> }<br /><br /><br /> cat data_file | perl scriptname.pl> outputfile.dat<br /><br /> and when i run the copy command i get messages like data missing for xyz column.<br/> any possible hints....... <br /><br /> --<br /> Thanks,<br /> Sumeet </span></font><p class="MsoNormal" style="margin-left:36.0pt"><fontcolor="black" face="Times New Roman" size="3"><span style="font-size:12.0pt"> </span></font><pstyle="margin-left:36.0pt"><b><font color="black" face="Times New Roman" size="3"><spanstyle="font-size:12.0pt;font-weight:bold">*******************Confidentiality and Privilege Notice*******************</span></font></b><pstyle="margin-left:36.0pt"><font color="black" face="Times New Roman" size="3"><spanstyle="font-size:12.0pt">The material contained in this message is privileged and confidential to the addressee.If you are not the addressee indicated in this message or responsible for delivery of the message to such person,you may not copy or deliver this message to anyone, and you should destroy it and kindly notify the sender by replyemail. </span></font><p style="margin-left:36.0pt"><font color="black" face="Times New Roman" size="3"><span style="font-size:12.0pt">Informationin this message that does not relate to the official business of Weatherbeeta must betreated as neither given nor endorsed by Weatherbeeta. Weatherbeeta, its employees, contractors or associates shall notbe liable for direct, indirect or consequential loss arising from transmission of this message or any attachments </span></font><pclass="MsoNormal" style="margin-left:36.0pt"><font color="black" face="Times New Roman" size="3"><span style="font-size:12.0pt"><br/><br /></span></font><pre style="margin-left:36.0pt"><font color="black" face="Courier New"size="2"><span style="font-size:10.0pt">-- </span></font></pre><pre style="margin-left:36.0pt"><font color="black" face="CourierNew" size="2"><span style="font-size:10.0pt">Scot P. Floess</span></font></pre><pre style="margin-left:36.0pt"><fontcolor="black" face="Courier New" size="2"><span style="font-size:10.0pt">27 Lake Royale</span></font></pre><prestyle="margin-left:36.0pt"><font color="black" face="Courier New" size="2"><span style="font-size:10.0pt">Louisburg,NC 27549</span></font></pre><pre style="margin-left:36.0pt"><font color="black" face="CourierNew" size="2"><span style="font-size:10.0pt"> </span></font></pre><pre style="margin-left: 36.0pt"><font color="black" face="Courier New" size="2"><span style="font-size: 10.0pt">252-478-8087 (Home)</span></font></pre><pre style="margin-left:36.0pt"><font color="black" face="Courier New" size="2"><spanstyle="font-size:10.0pt">919-754-4592 (Work)</span></font></pre><pre style="margin-left:36.0pt"><font color="black"face="Courier New" size="2"><span style="font-size:10.0pt"> </span></font></pre><pre style="margin-left: 36.0pt"><font color="black" face="Courier New" size="2"><span style="font-size: 10.0pt">Chief Architect JPlate <a href="http://sourceforge.net/projects/jplate">http://sourceforge.net/projects/jplate</a></span></font></pre><pre style="margin-left:36.0pt"><fontcolor="black" face="Courier New" size="2"><span style="font-size:10.0pt">Chief ArchitectJavaPIM <a href="http://sourceforge.net/projects/javapim">http://sourceforge.net/projects/javapim</a></span></font></pre><pre style="margin-left:36.0pt"><fontcolor="black" face="Courier New" size="2"><span style="font-size:10.0pt"> </span></font></pre><pclass="MsoNormal" style="margin-left:36.0pt"><font color="black" face="Times New Roman" size="3"><spanstyle="font-size:12.0pt"><br /><br /></span></font><pre style="margin-left:36.0pt"><font color="black" face="CourierNew" size="2"><span style="font-size:10.0pt">-- </span></font></pre><pre style="margin-left:36.0pt"><font color="black"face="Courier New" size="2"><span style="font-size:10.0pt">Scot P. Floess</span></font></pre><pre style="margin-left:36.0pt"><fontcolor="black" face="Courier New" size="2"><span style="font-size:10.0pt">27 Lake Royale</span></font></pre><prestyle="margin-left:36.0pt"><font color="black" face="Courier New" size="2"><span style="font-size:10.0pt">Louisburg,NC 27549</span></font></pre><pre style="margin-left:36.0pt"><font color="black" face="CourierNew" size="2"><span style="font-size:10.0pt"> </span></font></pre><pre style="margin-left: 36.0pt"><font color="black" face="Courier New" size="2"><span style="font-size: 10.0pt">252-478-8087 (Home)</span></font></pre><pre style="margin-left:36.0pt"><font color="black" face="Courier New" size="2"><spanstyle="font-size:10.0pt">919-754-4592 (Work)</span></font></pre><pre style="margin-left:36.0pt"><font color="black"face="Courier New" size="2"><span style="font-size:10.0pt"> </span></font></pre><pre style="margin-left: 36.0pt"><font color="black" face="Courier New" size="2"><span style="font-size: 10.0pt">Chief Architect JPlate <a href="http://sourceforge.net/projects/jplate">http://sourceforge.net/projects/jplate</a></span></font></pre><pre style="margin-left:36.0pt"><fontcolor="black" face="Courier New" size="2"><span style="font-size:10.0pt">Chief ArchitectJavaPIM <a href="http://sourceforge.net/projects/javapim">http://sourceforge.net/projects/javapim</a></span></font></pre></div><br /><p><b>*******************Confidentialityand Privilege Notice*******************</b><p> The material contained in this messageis privileged and confidential to the addressee. If you are not the addressee indicated in this message or responsiblefor delivery of the message to such person, you may not copy or deliver this message to anyone, and you shoulddestroy it and kindly notify the sender by reply email. <p> Information in this message that does not relate to theofficial business of Weatherbeeta must be treated as neither given nor endorsed by Weatherbeeta. Weatherbeeta, its employees,contractors or associates shall not be liable for direct, indirect or consequential loss arising from transmissionof this message or any attachments <br />
Well, being that there isn't a RFC for CSV...other than "defacto" definitions...I am pretty sure that is widely agreed upon;)<br /><br /> Phillip Smith wrote: <blockquote cite="mid000501c6c7da$15a01b10$9b0014ac@ITPhil" type="cite"><style> <!--/* Font Definitions */@font-face{font-family:Wingdings;panose-1:5 0 0 0 0 0 0 0 0 0;} @font-face{font-family:Tahoma;panose-1:2 11 6 4 3 5 4 4 2 4;}/* Style Definitions */p.MsoNormal, li.MsoNormal, div.MsoNormal{margin:0cm;margin-bottom:.0001pt;font-size:12.0pt;font-family:"TimesNew Roman";color:black;} a:link, span.MsoHyperlink{color:blue;text-decoration:underline;} a:visited, span.MsoHyperlinkFollowed{color:purple;text-decoration:underline;} p{margin-right:0cm;margin-left:0cm;font-size:12.0pt;font-family:"Times New Roman";color:black;} pre{margin:0cm;margin-bottom:.0001pt;font-size:10.0pt;font-family:"Courier New";color:black;} span.emailstyle17{font-family:Arial;color:navy;} span.EmailStyle20{font-family:Arial;color:navy;} @page Section1{size:595.3pt 841.9pt;margin:72.0pt 90.0pt 72.0pt 90.0pt;} div.Section1{page:Section1;} --> </style><div class="Section1"><p class="MsoNormal"><font color="navy" face="Arial" size="2"><span style="font-size: 10pt;font-family: Arial; color: navy;">There you go – it was the quotes that did it, not the back-slashes. I knew someoneelse would shed some better light! </span></font><font color="navy" face="Wingdings" size="2"><span style="font-size:10pt; font-family: Wingdings; color: navy;">J</span></font><p class="MsoNormal"><font color="navy" face="Arial"size="2"><span style="font-size: 10pt; font-family: Arial; color: navy;"> </span></font><p class="MsoNormal"><fontcolor="navy" face="Arial" size="2"><span style="font-size: 10pt; font-family: Arial; color: navy;">Cheers,</span></font><pclass="MsoNormal"><font color="navy" face="Arial" size="2"><span style="font-size: 10pt; font-family:Arial; color: navy;">-p</span></font><p class="MsoNormal"><font color="navy" face="Arial" size="2"><span style="font-size:10pt; font-family: Arial; color: navy;"> </span></font><p class="MsoNormal" style="margin-left: 36pt;"><fontcolor="black" face="Tahoma" size="2"><span lang="EN-US" style="font-size: 10pt; font-family: Tahoma; color: windowtext;">-----OriginalMessage-----<br /><b><span style="font-weight: bold;">From:</span></b> <a class="moz-txt-link-abbreviated"href="mailto:pgsql-sql-owner@postgresql.org">pgsql-sql-owner@postgresql.org</a> [<a class="moz-txt-link-freetext"href="mailto:pgsql-sql-owner@postgresql.org">mailto:pgsql-sql-owner@postgresql.org</a>] <b><spanstyle="font-weight: bold;">On Behalf Of </span></b>Scot P. Floess<br /><b><span style="font-weight: bold;">Sent:</span></b>Friday, 25 August 2006 10:00<br /><b><span style="font-weight: bold;">To:</span></b> <a class="moz-txt-link-abbreviated"href="mailto:floess@mindspring.com">floess@mindspring.com</a><br /><b><span style="font-weight:bold;">Cc:</span></b> Phillip Smith; <a class="moz-txt-link-abbreviated" href="mailto:pgsql-sql@postgresql.org">pgsql-sql@postgresql.org</a><br/><b><span style="font-weight: bold;">Subject:</span></b>Re: [SQL] Importing data from csv</span></font><p class="MsoNormal" style="margin-left: 36pt;"><fontcolor="black" face="Times New Roman" size="3"><span style="font-size: 12pt;"> </span></font><p class="MsoNormal"style="margin-left: 36pt;"><font color="black" face="Times New Roman" size="3"><span style="font-size: 12pt;">Andif its contained with quotes...its considered a field<br /><br /> Scot P. Floess wrote: </span></font><p class="MsoNormal"style="margin-left: 36pt;"><font color="black" face="Times New Roman" size="3"><span style="font-size: 12pt;">Anewline in CSV parlance denotes the end of a record....unless that newline is contained with quotes...<br /><br />Phillip Smith wrote: </span></font><p class="MsoNormal" style="margin-left: 36pt;"><font color="navy" face="Arial" size="2"><spanstyle="font-size: 10pt; font-family: Arial; color: navy;">I recently did this by parsing the data through aVB program that appended a “\” in front of any Char(10) and/or Char(13) characters which tells Postgres to accept the nextcharacter as a literal part of the column value I believe – must do because it worked! I also quoted the whole columnas part of the VB prog…</span></font><p class="MsoNormal" style="margin-left: 36pt;"><font color="navy" face="Arial"size="2"><span style="font-size: 10pt; font-family: Arial; color: navy;"> </span></font><p class="MsoNormal"style="margin-left: 36pt;"><font color="navy" face="Arial" size="2"><span style="font-size: 10pt; font-family:Arial; color: navy;">Worked for me but I’m not sure the exact science behind it so someone else might be ableto be of some more detailed help.</span></font><p class="MsoNormal" style="margin-left: 36pt;"><font color="navy" face="Arial"size="2"><span style="font-size: 10pt; font-family: Arial; color: navy;"> </span></font><p class="MsoNormal"style="margin-left: 36pt;"><font color="navy" face="Arial" size="2"><span style="font-size: 10pt; font-family:Arial; color: navy;">Cheers,</span></font><p class="MsoNormal" style="margin-left: 36pt;"><font color="navy"face="Arial" size="2"><span style="font-size: 10pt; font-family: Arial; color: navy;">-p</span></font><p class="MsoNormal"style="margin-left: 36pt;"><font color="navy" face="Arial" size="2"><span style="font-size: 10pt; font-family:Arial; color: navy;"> </span></font><p class="MsoNormal" style="margin-left: 72pt;"><font color="black" face="Tahoma"size="2"><span lang="EN-US" style="font-size: 10pt; font-family: Tahoma;">-----Original Message-----<br /><b><spanstyle="font-weight: bold;">From:</span></b> <a href="mailto:pgsql-sql-owner@postgresql.org">pgsql-sql-owner@postgresql.org</a>[<a href="mailto:pgsql-sql-owner@postgresql.org">mailto:pgsql-sql-owner@postgresql.org</a>]<b><span style="font-weight: bold;">OnBehalf Of </span></b>Sumeet<br /><b><span style="font-weight: bold;">Sent:</span></b> Friday, 25 August 2006 00:48<br/><b><span style="font-weight: bold;">To:</span></b> <a href="mailto:pgsql-sql@postgresql.org">pgsql-sql@postgresql.org</a><br/><b><span style="font-weight: bold;">Subject:</span></b>[SQL] Importing data from csv</span></font><p class="MsoNormal" style="margin-left: 72pt;"><fontcolor="black" face="Times New Roman" size="3"><span style="font-size: 12pt;"> </span></font><p class="MsoNormal"style="margin-left: 72pt;"><font color="black" face="Times New Roman" size="3"><span style="font-size: 12pt;">HiFolks,<br /><br /> sorry if this is a duplicate post, i've been tryin to find a solution of importing data intopostgres from a csv file. The problem is, I have a database which consists of columns which contain newline characters(mac and unix). now when i export these files to a csv format, there are some line breaks (mixed unix and mac)in the data which breaks the copy procedure. <br /><br /> I also tried using the script posted in one of the previousposts..<br clear="all" /><br /> #! /usr/bin/perl<br /> $inquotes = 0;<br /> while (<>){<br /> # Chop thecrlf<br /> chop ($_);<br /> chop ($_);<br /><br /> # this first bit goes through and replaces <br /> # all the commas that re not in quotes with tildes<br /> for ($i=0 ; $i < length($_) ; $i++){<br /> $char=substr($_,$i,1);<br /> if ($char eq '"' ){<br /> $inquotes = not($inquotes); <br /> }else{<br /> if ( (!$inquotes) && ($char eq ",") ){<br /> substr($_,$i,1)="~";<br/> }<br /> }<br /> }<br /> # this replaces any quotes<br /> s/"//g;<br/> print "$_\n";<br /> }<br /><br /><br /> cat data_file | perl scriptname.pl > outputfile.dat<br /><br/> and when i run the copy command i get messages like data missing for xyz column.<br /> any possible hints.......<br /><br /> --<br /> Thanks,<br /> Sumeet </span></font><p class="MsoNormal" style="margin-left: 36pt;"><fontcolor="black" face="Times New Roman" size="3"><span style="font-size: 12pt;"> </span></font><p style="margin-left:36pt;"><b><font color="black" face="Times New Roman" size="3"><span style="font-size: 12pt; font-weight:bold;">*******************Confidentiality and Privilege Notice*******************</span></font></b><p style="margin-left:36pt;"><font color="black" face="Times New Roman" size="3"><span style="font-size: 12pt;">The materialcontained in this message is privileged and confidential to the addressee. If you are not the addressee indicatedin this message or responsible for delivery of the message to such person, you may not copy or deliver this messageto anyone, and you should destroy it and kindly notify the sender by reply email. </span></font><p style="margin-left:36pt;"><font color="black" face="Times New Roman" size="3"><span style="font-size: 12pt;">Informationin this message that does not relate to the official business of Weatherbeeta must be treated as neithergiven nor endorsed by Weatherbeeta. Weatherbeeta, its employees, contractors or associates shall not be liable fordirect, indirect or consequential loss arising from transmission of this message or any attachments </span></font><p class="MsoNormal"style="margin-left: 36pt;"><font color="black" face="Times New Roman" size="3"><span style="font-size: 12pt;"><br/><br /></span></font><pre style="margin-left: 36pt;"><font color="black" face="Courier New" size="2"><span style="font-size:10pt;">-- </span></font></pre> <pre style="margin-left: 36pt;"><font color="black" face="Courier New" size="2"><spanstyle="font-size: 10pt;">Scot P. Floess</span></font></pre> <pre style="margin-left: 36pt;"><font color="black"face="Courier New" size="2"><span style="font-size: 10pt;">27 Lake Royale</span></font></pre> <pre style="margin-left:36pt;"><font color="black" face="Courier New" size="2"><span style="font-size: 10pt;">Louisburg, NC 27549</span></font></pre><pre style="margin-left: 36pt;"><font color="black" face="Courier New" size="2"><span style="font-size:10pt;"> </span></font></pre> <pre style="margin-left: 36pt;"><font color="black" face="Courier New" size="2"><spanstyle="font-size: 10pt;">252-478-8087 (Home)</span></font></pre> <pre style="margin-left: 36pt;"><font color="black"face="Courier New" size="2"><span style="font-size: 10pt;">919-754-4592 (Work)</span></font></pre> <pre style="margin-left:36pt;"><font color="black" face="Courier New" size="2"><span style="font-size: 10pt;"> </span></font></pre><pre style="margin-left: 36pt;"><font color="black" face="Courier New" size="2"><span style="font-size:10pt;">Chief Architect JPlate <a href="http://sourceforge.net/projects/jplate">http://sourceforge.net/projects/jplate</a></span></font></pre><pre style="margin-left:36pt;"><font color="black" face="Courier New" size="2"><span style="font-size: 10pt;">Chief ArchitectJavaPIM <a href="http://sourceforge.net/projects/javapim">http://sourceforge.net/projects/javapim</a></span></font></pre><pre style="margin-left:36pt;"><font color="black" face="Courier New" size="2"><span style="font-size: 10pt;"> </span></font></pre><pclass="MsoNormal" style="margin-left: 36pt;"><font color="black" face="Times New Roman" size="3"><spanstyle="font-size: 12pt;"><br /><br /></span></font><pre style="margin-left: 36pt;"><font color="black" face="CourierNew" size="2"><span style="font-size: 10pt;">-- </span></font></pre> <pre style="margin-left: 36pt;"><font color="black"face="Courier New" size="2"><span style="font-size: 10pt;">Scot P. Floess</span></font></pre> <pre style="margin-left:36pt;"><font color="black" face="Courier New" size="2"><span style="font-size: 10pt;">27 Lake Royale</span></font></pre><pre style="margin-left: 36pt;"><font color="black" face="Courier New" size="2"><span style="font-size:10pt;">Louisburg, NC 27549</span></font></pre> <pre style="margin-left: 36pt;"><font color="black" face="CourierNew" size="2"><span style="font-size: 10pt;"> </span></font></pre> <pre style="margin-left: 36pt;"><font color="black"face="Courier New" size="2"><span style="font-size: 10pt;">252-478-8087 (Home)</span></font></pre> <pre style="margin-left:36pt;"><font color="black" face="Courier New" size="2"><span style="font-size: 10pt;">919-754-4592 (Work)</span></font></pre><pre style="margin-left: 36pt;"><font color="black" face="Courier New" size="2"><span style="font-size:10pt;"> </span></font></pre> <pre style="margin-left: 36pt;"><font color="black" face="Courier New" size="2"><spanstyle="font-size: 10pt;">Chief Architect JPlate <a href="http://sourceforge.net/projects/jplate">http://sourceforge.net/projects/jplate</a></span></font></pre><pre style="margin-left:36pt;"><font color="black" face="Courier New" size="2"><span style="font-size: 10pt;">Chief ArchitectJavaPIM <a href="http://sourceforge.net/projects/javapim">http://sourceforge.net/projects/javapim</a></span></font></pre></div><br /><p><b>*******************Confidentialityand Privilege Notice*******************</b><p>The material contained in this messageis privileged and confidential to the addressee. If you are not the addressee indicated in this message or responsiblefor delivery of the message to such person, you may not copy or deliver this message to anyone, and you shoulddestroy it and kindly notify the sender by reply email. <p>Information in this message that does not relate to theofficial business of Weatherbeeta must be treated as neither given nor endorsed by Weatherbeeta. Weatherbeeta, its employees,contractors or associates shall not be liable for direct, indirect or consequential loss arising from transmissionof this message or any attachments <br /></blockquote><br /><pre class="moz-signature" cols="72">-- Scot P. Floess 27 Lake Royale Louisburg, NC 27549 252-478-8087 (Home) 919-754-4592 (Work) Chief Architect JPlate <a class="moz-txt-link-freetext" href="http://sourceforge.net/projects/jplate">http://sourceforge.net/projects/jplate</a> Chief Architect JavaPIM <a class="moz-txt-link-freetext" href="http://sourceforge.net/projects/javapim">http://sourceforge.net/projects/javapim</a> </pre>
On Thu, Aug 24, 2006 at 08:19:58PM -0400, Scot P. Floess wrote: > Well, being that there isn't a RFC for CSV...other than "defacto" > definitions...I am pretty sure that is widely agreed upon ;) RFC 4180 Common Format and MIME Type for Comma-Separated Values (CSV) Files ftp://ftp.rfc-editor.org/in-notes/rfc4180.txt "While there are various specifications and implementations for the CSV format (for ex. [4], [5], [6] and [7]), there is no formal specification in existence, which allows for a wide variety of interpretations of CSV files. This section documents the format that seems to be followed by most implementations:" -- Michael Fuhr
On 8/24/06, Sumeet <asumeet@gmail.com> wrote:
Sounds like you are checking the file in as binary when it should be checked in as ASCII. You may want to consider changing that in your repository so this does not continue to be an issue.
==================================================================
Aaron Bono
Aranya Software Technologies, Inc.
http://www.aranya.com
http://codeelixir.com
==================================================================
Hi Folks,
sorry if this is a duplicate post, i've been tryin to find a solution of importing data into postgres from a csv file. The problem is, I have a database which consists of columns which contain newline characters (mac and unix). now when i export these files to a csv format, there are some line breaks (mixed unix and mac) in the data which breaks the copy procedure.
==================================================================
Aaron Bono
Aranya Software Technologies, Inc.
http://www.aranya.com
http://codeelixir.com
==================================================================