Thread: Unknown character on copy

Unknown character on copy

From
"THOMPSON, JARED (ATTBAPCO)"
Date:

I have noticed that on some of my imports with COPY, I get a funny character now and then in the data.  When it shows up, it is always the first character in the string.  I ran a simple test to try to find it.

 

Here is my setup:

Table:

-- Table: pigdig

-- DROP TABLE pigdig;

CREATE TABLE pigdig
(
  "name" character varying
)
WITH (
  OIDS=FALSE
);
ALTER TABLE pigdig OWNER TO postgres;

Data in forimport.txt:

mary
tary
lary

COPY command being used:

COPY pigdig (name)
FROM E'\\Documents and Settings\\username\\My Documents\\postgres\\Pg Daily Inventory\\forimport.txt'
WITH DELIMITER AS ','
NULL AS 'NULL'

Now, when looking at the data in the table, I get:

"mary"  (but this row has a box in front of "mary"; the box has a solid border with a clear center)
"tary"
"lary"

Problem: why did this come into the table, and what is the character?

I pasted the data into Excel and isolated the character; in Excel it shows up as a box with a question mark.  I put =CHAR(ROW()) in rows 1 to 255, with the offending character in E1, then compared it against each of the characters in rows 1 through 255.  It did not match any of them.

So I do not know what the character is or how it made its way into the import process.

That in and of itself is an issue.  Plus, since the character is now in Postgres, a query on 'mary' does not match the row.

Further, when I view the table data in pgAdmin III and then paste that "mary" into a query window, the character in question is no longer a box but a light, bullet-point-looking character.

 

 

Any ideas on:

How and why this came into the import process and got into my data?
What the character is?
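
One way to answer "what is the character" is to print the code point of the first character of the affected value. The Python sketch below is only an illustration: the sample string "\ufeffmary" is an assumption based on the byte order mark diagnosis given later in the thread, and describe_first_char is a hypothetical helper, not part of any tool mentioned here. What is actually stored depends on the database and client encodings.

```python
import unicodedata


def describe_first_char(value):
    """Return (code point, Unicode name, general category) of the first character."""
    ch = value[0]
    code_point = "U+%04X" % ord(ch)
    # Some characters have no name, so fall back to a placeholder.
    name = unicodedata.name(ch, "<unnamed>")
    return code_point, name, unicodedata.category(ch)


# Assumed sample: a value whose first character is U+FEFF.
suspect = "\ufeffmary"
clean = "mary"

print(describe_first_char(suspect))
print(describe_first_char(clean))

# U+FEFF is 65279 in decimal, which is outside the 1..255 range,
# so a comparison against characters 1 through 255 cannot match it.
print(ord(suspect[0]))
```

If the value really does start with U+FEFF, this would also be consistent with the Excel test finding no match among characters 1 through 255.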

 

 

Re: Unknown character on copy

From
Tom Lane
Date:
"THOMPSON, JARED (ATTBAPCO)" <JT060b@ATT.COM> writes:
> I have noticed on some of my imports with copy I will get a funny
> character now and then in some of the data.  When it shows up it will
> always be the first character in the string.

Is it also in the first line of the COPY data file?

If so, it's probably a Unicode byte order mark.  That's not really
legitimate in UTF8 data, but some Windows text editors are broken enough
to insert one at the start of a file anyway.

            regards, tom lane
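
A quick way to test this suggestion is to look at the raw bytes at the start of the data file, since a byte order mark is invisible in most editors. The following is a minimal Python sketch, assuming the file was saved as UTF-8; write_sample is a hypothetical helper that only creates demo files, and the sample contents mimic the three-line file from the original post.

```python
import codecs
import os
import tempfile


def starts_with_utf8_bom(path):
    """Return True if the file's first three bytes are the UTF-8 BOM (EF BB BF)."""
    with open(path, "rb") as f:  # binary mode, so no decoding hides the bytes
        return f.read(3) == codecs.BOM_UTF8


def write_sample(data):
    """Hypothetical helper: write bytes to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return path


# Simulate the three-line import file, saved with and without a BOM.
with_bom = write_sample(codecs.BOM_UTF8 + b"mary\r\ntary\r\nlary\r\n")
without_bom = write_sample(b"mary\r\ntary\r\nlary\r\n")

print(starts_with_utf8_bom(with_bom))     # True
print(starts_with_utf8_bom(without_bom))  # False

os.remove(with_bom)
os.remove(without_bom)
```

This check only covers the UTF-8 form of the mark. A file saved as UTF-16 would start with different bytes (codecs.BOM_UTF16_LE or codecs.BOM_UTF16_BE).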

Re: Unknown character on copy

From
"THOMPSON, JARED (ATTBAPCO)"
Date:
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: Friday, July 22, 2011 10:25 AM
> Is it also in the first line of the COPY data file?
>
> If so, it's probably a Unicode byte order mark.  That's not really
> legitimate in UTF8 data, but some Windows text editors are broken enough
> to insert one at the start of a file anyway.
>
>             regards, tom lane

Nope, I verified that there were no extraneous spaces in the COPY file.
I will try installing gedit on Windows and see if it happens again.


Re: Unknown character on copy

From
"THOMPSON, JARED (ATTBAPCO)"
Date:
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
>
> Is it also in the first line of the COPY data file?
>
> If so, it's probably a Unicode byte order mark.  That's not really
> legitimate in UTF8 data, but some Windows text editors are broken enough
> to insert one at the start of a file anyway.
>
>             regards, tom lane


Tom, thanks for the idea about the Windows text editor. It was
(apparently) the cause of the offending character.

I dropped the table and redid everything, and the character showed up
again.

So I dropped the table once more, keeping all my processes the same with
the exception of using gedit instead of Windows Notepad.
The character did not show up.

Thanks again,

Jared
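
For cases where switching editors is not an option, the mark could instead be removed from the file before running COPY. The Python sketch below is one possible approach under the assumption that the file is UTF-8; strip_utf8_bom and remove_bom_from_file are hypothetical helper names, and the temporary files stand in for forimport.txt.

```python
import codecs
import os
import tempfile


def strip_utf8_bom(data):
    """Return the bytes without a leading UTF-8 BOM; other data is unchanged."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def remove_bom_from_file(src_path, dst_path):
    """Copy src_path to dst_path, dropping a leading UTF-8 BOM if present."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(strip_utf8_bom(data))


# Demo: a stand-in for the import file, saved with a BOM.
workdir = tempfile.mkdtemp()
raw_path = os.path.join(workdir, "forimport_raw.txt")
clean_path = os.path.join(workdir, "forimport_clean.txt")

with open(raw_path, "wb") as f:
    f.write(codecs.BOM_UTF8 + b"mary\r\ntary\r\nlary\r\n")

remove_bom_from_file(raw_path, clean_path)

with open(clean_path, "rb") as f:
    print(f.read())  # b'mary\r\ntary\r\nlary\r\n'
```

The cleaned file would then be the one named in the COPY command. Rows that were already imported with the extra character would still need to be fixed or reloaded separately.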