Re: Creating 2D arrays for pg_copy_from, reading tab-delimted textfile that contains comma and double quotes - Mailing list pgsql-general

From rob stone
Subject Re: Creating 2D arrays for pg_copy_from, reading tab-delimted textfile that contains comma and double quotes
Date
Msg-id c1802beea93a82f7e1fb49c198b4d6c2863e4101.camel@gmail.com
Whole thread Raw
In response to Creating 2D arrays for pg_copy_from, reading tab-delimted text filethat contains comma and double quotes  (<s400t@yahoo.co.jp>)
Responses Re: Creating 2D arrays for pg_copy_from, reading tab-delimited textfile that contains comma and double quotes
List pgsql-general
Hello,

On Mon, 2018-12-17 at 21:34 +0900, s400t@yahoo.co.jp wrote:
> Hello Good People of the Forum!
> 
> I am trying to insert some data into a PostgreSQL database using PHP
> and struggling to create an array so that pg_copy_from function will
> accept and process the data.
> 
> I can insert data but not the way I want- my data this case contains
> comma, space, double quotes and unpaired double quote.
> I need to use Excel to create the data and save it as tab delimited
> text file, and then convert encoding to UTF-8 (from Excel's ANSI)
> before passing it to the PHP. 
> 
> Here is my dummy data in Excel:
> rec_no    item1    item2    item3    item4    item5
> 1    Denny's    orange juice    1,500 yen    "Dear John"    32" TV
> 
> Explanation: the first row is header.
> The second row is data for each column.
> I tried to create variation like apostrophe, space between words,
> comma, double quotes and unpaired double quote (32" TV).
> 
> When I save that Excel as a tab delimited text file, I get this:
> rec_no    item1    item2    item3    item4    item5
> 1    Denny's    orange juice    "1,500 yen"    """Dear John"""   
> "32"" TV"
> (As seen when I opened that file with Notepad)
> 
> Because my data also contains non-ascii, I saved the file with UTF-8
> encoding using the Notepad.
> 
> Then I created a two-dimensional array with PHP:
> 
> 1.$file = 'test.txt'; //tab delimited file 
> 2.$fileRead = fopen($file, 'r');
>             
> 3.$row = 1;
> 4.$twoDarray = array();
> 5.while (($line = fgetcsv($fileRead, 0, "\t")) !== FALSE) { //0 means
> I can read row whatever its length
> 6.    if($row == 1){ $row++; continue; } //skip header
> 7.    $line = implode(" ",$line). "\n";
> 8.    $twoDarray[] = $line;
> 9.}
> 10.fclose($fileRead);
> 
> Then I passed that twoDarray to pg_copy_from.
> 
> $con=pg_connect("host=$host dbname=$dbname port=5432 user=$user
> password=$password");
> 
> 11.if (!$con) {
> 12.    die("Couldn't open..<br>\n");
> 13.}
> 
> 14.if (pg_copy_from($con, $tableName, $twoDarray) !== FALSE) {
> 15.    print "OK!";
> 16.}
> 17.else{
> 18.    print "Not OK.";    
> 19.}
> 
> When I run the program, I have this error:
> Warning: pg_copy_from(): Copy command failed: 
> ERROR: value too long for type character varying(32) CONTEXT: COPY
> test_table, line 1, column rec_no: "1 Denny's orange juice 1,500 yen
> "Dear John" 32" TV" in testProgram.php line xx.
> 
> My table definition is:
> CREATE TABLE test_table (
> rec_no VARCHAR(32) PRIMARY KEY NOT NULL,
> item1 VARCHAR(255),..item2 .. until item5.);
> 
> Obviously, my program thinks everything in the data row is for the
> first field.
> No, no.
> 
> How to make it think that 
> 1 is for the 'rec_no' field,
> Denny's is for the 'item1' field, 
> orange juice is for the 'item2' field,
> 1,500 yen is for the 'item3' field,
> "Dear John" is for the 'item4' field and 
> 32" TV is for the 'item5' field?
> 
> When I tried removing '0' from line 5, that is, 
> while (($line = fgetcsv($fileRead, "\t")) !== FALSE) { //without read
> length
> 
> I can see data written in the database, but with some extra double
> quotes and a missing comma!
> That is data was saved as
> Denny's,    orange juice,    "1 500 yen",    """Dear John""", and 
>     "32"" TV"
> into the respective fields.
> I cannot have those extra double quotes, and I cannot have missing
> comma in my data. 



fgetscsv returns an array from reading a record from a text file.
So  $line = fgetcsv($fileRead, 0, "\t", '"') would use tab as the
delimiter and remove any enclosure character equal to ".

I don't use Excel. With Libreoffice you can set the delimiter to the
pipe character and tell it not to use enclosures. So I don't know how
to massage your Excel file so that 32" TV is rendered correctly. 

Secondly, in PHP implode takes an array and turns it into a string with
a specified character used to delimit the values from the array.

I can only suggest that you read the PHP manual, as well as do some
searches for the use of pg_copy_from, although I doubt anything useful
will turn up. 

Cheers,
Rob



pgsql-general by date:

Previous
From: Vijaykumar Jain
Date:
Subject: Re: [External] Re: simple query on why a merge join plan got selected
Next
From: Jerry Sievers
Date:
Subject: Re: conditionally terminate psql script