Thread: Problems with ñ and tildes / CSV import problems in PostgreSQL 9.1
I'm fairly new to PostgreSQL 9.1 but I need it, so here I am.
This is a similar question to this one, so I have encoded a database with LATIN-1 as suggested, but I can't copy a CSV file into a table within the database.
ERROR: invalid byte sequence for encoding "UTF8": 0xe17371
Googling doesn't get me anywhere and I am working with Spanish characters.
Thanks again all,
Zach Seaman
Re: [NOVICE] Problems with ñ and tildes / CSV import problems in PostgreSQL 9.1
From
Gurjeet Singh
Date:
On Wed, Feb 6, 2013 at 7:56 PM, Zach Seaman <znseaman@gmail.com> wrote:
> I'm fairly new to PostgreSQL 9.1 but I need it, so here I am.
> This is a similar question to this one, so I have encoded a database with LATIN-1 as suggested but can't copy a CSV file into a table within the database.
> ERROR: invalid byte sequence for encoding "UTF8": 0xe17371
> Googling doesn't get me anywhere and I am working with Spanish characters.
I think the encoding of the data in your CSV file should match the client_encoding parameter.
What is your client_encoding parameter set to?
show client_encoding;
--
Gurjeet Singh
http://gurjeet.singh.im/
Re: [NOVICE] Problems with ñ and tildes / CSV import problems in PostgreSQL 9.1
From
Jaime Casanova
Date:
On Wed, Feb 6, 2013 at 7:56 PM, Zach Seaman <znseaman@gmail.com> wrote:
> I'm fairly new to PostgreSQL 9.1 but I need it, so here I am.
>
> This is a similar question to this one, so I have encoded a database with
> LATIN-1 as suggested but can't copy a CSV file into a table within the
> database.
>
well, that mail is from 2005... what version of postgres are you running at?
> ERROR: invalid byte sequence for encoding "UTF8": 0xe17371
>
run:
SET client_encoding TO UTF8;
before running the copy command, or maybe set to LATIN1
--
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
Phone: +593 4 5107566 Cell: +593 987171157
Re: Re: [NOVICE] Problems with ñ and tildes / CSV import problems in PostgreSQL 9.1
From
Ken Benson
Date:
I think the problem may be that specific character translation.
The chart I typically use is here: http://www.utf8-chartable.de/unicode-utf8-table.pl
The 'valid' UTF-8 codes jump from 0x e0 bf bf (at the bottom of this page: http://www.utf8-chartable.de/unicode-utf8-table.pl?start=3840 )
to 0x e1 80 80 (at the top of this page: http://www.utf8-chartable.de/unicode-utf8-table.pl?start=4096 ).
So the problem may be that 0x e1 73 71 truly is not a valid UTF-8 character, in the current iteration of PostgreSQL or at all.
Just my thoughts.
Ken
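To make the byte ranges from the chart concrete, here is a minimal Python sketch (Python is not used anywhere in this thread; it is only a convenient way to inspect bytes). It checks the three bytes from the error message against the UTF-8 rule that a lead byte in the range 0xE0-0xEF must be followed by two continuation bytes in the range 0x80-0xBF. The helper name `is_continuation` is made up for this illustration.

```python
# Bytes reported in the error message: 0xe17371
data = bytes.fromhex("e17371")


def is_continuation(byte: int) -> bool:
    """Hypothetical helper: True if the byte has the form 10xxxxxx (0x80-0xBF)."""
    return 0x80 <= byte <= 0xBF


lead = data[0]
# A lead byte of the form 1110xxxx (0xE0-0xEF) announces a 3-byte sequence.
is_three_byte_lead = 0xE0 <= lead <= 0xEF
# Both bytes after the lead byte would have to be continuation bytes.
followers_ok = all(is_continuation(b) for b in data[1:3])

print(is_three_byte_lead)  # True: 0xE1 starts a 3-byte sequence
print(followers_ok)        # False: 0x73 and 0x71 are plain ASCII letters
```

So 0xE1 is an acceptable first byte, but the two bytes after it are not continuation bytes, which is why the sequence as a whole is rejected.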
On 2/7/2013 7:03 AM, Jaime Casanova wrote:
> On Wed, Feb 6, 2013 at 7:56 PM, Zach Seaman <znseaman@gmail.com> wrote:
>> I'm fairly new to PostgreSQL 9.1 but I need it, so here I am.
>> This is a similar question to this one, so I have encoded a database with LATIN-1 as suggested but can't copy a CSV file into a table within the database.
> well, that mail is from 2005... what version of postgres are you running at?
>> ERROR: invalid byte sequence for encoding "UTF8": 0xe17371
> run:
> SET client_encoding TO UTF8;
> before running the copy command, or maybe set to LATIN1
Re: [NOVICE] Problems with ñ and tildes / CSV import problems in PostgreSQL 9.1
From
Zach Seaman
Date:
I'm running PostgreSQL 9.1
On Thu, Feb 7, 2013 at 9:03 AM, Jaime Casanova <jaime@2ndquadrant.com> wrote:
> On Wed, Feb 6, 2013 at 7:56 PM, Zach Seaman <znseaman@gmail.com> wrote:
>> I'm fairly new to PostgreSQL 9.1 but I need it, so here I am.
>>
>> This is a similar question to this one, so I have encoded a database with
>> LATIN-1 as suggested but can't copy a CSV file into a table within the
>> database.
> well, that mail is from 2005... what version of postgres are you running at?
>> ERROR: invalid byte sequence for encoding "UTF8": 0xe17371
> run:
> SET client_encoding TO UTF8;
> before running the copy command, or maybe set to LATIN1
--
Zach Seaman
GIS Expert, IRRI-México
Master of Regional & Community Planning
m 55.2247.1740 (México)
m 01.913.4860.832 (U.S.)
Re: Re: [NOVICE] Problems with ñ and tildes / CSV import problems in PostgreSQL 9.1
From
Tom Lane
Date:
Ken Benson <ken@infowerks.com> writes:
> So - the problem may be that truly 0x e1 73 71 is not a valid UTF-8
> character in the current iteration of PostgreSQL - or at all.
Of course it isn't, which is why Postgres is complaining. Presumably what that data really is is three characters (looks like "ásq") in LATIN1. But Postgres is trying to interpret it in UTF8. As mentioned upthread, the solution is to adjust the client_encoding setting before running the COPY command.
regards, tom lane
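The point about the same bytes meaning different things in different encodings can be checked outside the database. The following is a small Python sketch, not something from the thread, that decodes the reported bytes both ways; it only uses the standard `bytes.decode` and `str.encode` methods.

```python
raw = bytes.fromhex("e17371")  # bytes from the first error message

# Read as LATIN1 (ISO 8859-1), every byte maps to exactly one character.
as_latin1 = raw.decode("latin-1")
print(as_latin1)  # ásq

# Read as UTF-8, the same bytes are rejected.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
print(utf8_ok)  # False

# Properly encoded UTF-8 represents "á" as two bytes instead of one.
as_utf8 = as_latin1.encode("utf-8")
print(as_utf8.hex())  # c3a17371
```

This matches the reading given above: the file content is LATIN1 text, and the error appears only because it is being interpreted as UTF-8.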
Re: [NOVICE] Re: [NOVICE] Problems with ñ and tildes / CSV import problems in PostgreSQL 9.1
From
Zach Seaman
Date:
I changed from LATIN1, set my database to UTF8, and my client_encoding is UTF8.
ERROR: invalid byte sequence for encoding "UTF8": 0xe17320
ás[space]
Is it a trial and error type problem now?
On Thu, Feb 7, 2013 at 10:15 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Ken Benson <ken@infowerks.com> writes:
>> So - the problem may be that truly 0x e1 73 71 is not a valid UTF-8
>> character in the current iteration of PostgreSQL - or at all.
> Of course it isn't, which is why Postgres is complaining. Presumably
> what that data really is is three characters (looks like "ásq") in
> LATIN1. But Postgres is trying to interpret it in UTF8. As mentioned
> upthread, the solution is to adjust the client_encoding setting before
> running the COPY command.
> regards, tom lane
> --
> Sent via pgsql-novice mailing list (pgsql-novice@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-novice
Re: [NOVICE] Re: [NOVICE] Problems with ñ and tildes / CSV import problems in PostgreSQL 9.1
From
Zach Seaman
Date:
Keeping the names intact would be helpful. Whatever I change it to, I receive the same error because of the first entry.
I've encoded the CSV as UTF8 using Notepad++ and still no luck.
On Thu, Feb 7, 2013 at 10:51 AM, Zach Seaman <znseaman@gmail.com> wrote:
> I changed from LATIN1, set my database to UTF8, and my client_encoding is UTF8.
> ERROR: invalid byte sequence for encoding "UTF8": 0xe17320
> ás[space]
> Is it a trial and error type problem now?
> On Thu, Feb 7, 2013 at 10:15 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Ken Benson <ken@infowerks.com> writes:
>>> So - the problem may be that truly 0x e1 73 71 is not a valid UTF-8
>>> character in the current iteration of PostgreSQL - or at all.
>> Of course it isn't, which is why Postgres is complaining. Presumably
>> what that data really is is three characters (looks like "ásq") in
>> LATIN1. But Postgres is trying to interpret it in UTF8. As mentioned
>> upthread, the solution is to adjust the client_encoding setting before
>> running the COPY command.
>> regards, tom lane
Re: Re: [NOVICE] Re: [NOVICE] Problems with ñ and tildes / CSV import problems in PostgreSQL 9.1
From
Tom Lane
Date:
Zach Seaman <znseaman@gmail.com> writes:
> I changed from LATIN1, set my database to UTF8, and my client_encoding is
> UTF8.
> ERROR: invalid byte sequence for encoding "UTF8": 0xe17320
> ás[space]
No, the client encoding needs to be LATIN1 to read this file.
regards, tom lane
Re: [NOVICE] Re: [NOVICE] Re: [NOVICE] Problems with ñ and tildes / CSV import problems in PostgreSQL 9.1
From
Zach Seaman
Date:
OK, client encoding is back to LATIN1.
Do I have to sacrifice the readability of these names, or is there a way to work around this invalid byte sequence problem?
On Thu, Feb 7, 2013 at 11:51 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Zach Seaman <znseaman@gmail.com> writes:
>> I changed from LATIN1, set my database to UTF8, and my client_encoding is
>> UTF8.
>> ERROR: invalid byte sequence for encoding "UTF8": 0xe17320
>> ás[space]
> No, the client encoding needs to be LATIN1 to read this file.
> regards, tom lane
Re: [NOVICE] Re: [NOVICE] Re: [NOVICE] Problems with ñ and tildes / CSV import problems in PostgreSQL 9.1
From
Michael Swierczek
Date:
On Thu, Feb 7, 2013 at 12:05 PM, Zach Seaman <znseaman@gmail.com> wrote:
> Keeping the names intact would be helpful. Whatever I change it to, I receive the same error because of the first entry.
>
> I've encoded the csv using Notepad++ to UTF8 and still no luck.
>
> I think "á" followed by the next 2 characters causes the problem. Is there a better encoding for special characters? Is this possible in WIN-1252?
Zach,
I've been bitten by this misunderstanding myself. Changing the file encoding in Notepad++ just changes a few bytes at the very beginning of the file to indicate that it's supposed to be read as your new encoding. It does not automatically go through the file converting characters like "à" from their 224 (decimal) value in LATIN1 encoding to the UTF-8 encoding of U+00E0. Maybe some other text editors support actually re-encoding the characters in the file for you, I don't know.
Good luck,
-Mike Swierczek
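For anyone who does want to re-encode the file rather than change client_encoding, here is a minimal Python sketch of what an actual conversion looks like. It assumes the file really is LATIN1, as suggested upthread. The helper name `reencode_latin1_to_utf8` and the file names are made up for this example, and the demonstration runs on a throwaway file rather than the real CSV.

```python
from pathlib import Path
import tempfile


def reencode_latin1_to_utf8(src: Path, dst: Path) -> None:
    """Hypothetical helper: read src as LATIN1 and write the same text as UTF-8."""
    text = src.read_bytes().decode("latin-1")
    dst.write_bytes(text.encode("utf-8"))


# Demonstration on a throwaway file; the file names are placeholders.
workdir = Path(tempfile.mkdtemp())
src = workdir / "names_latin1.csv"
dst = workdir / "names_utf8.csv"
src.write_bytes("à,ñ\n".encode("latin-1"))

reencode_latin1_to_utf8(src, dst)

print(src.read_bytes().hex())  # e02cf10a
print(dst.read_bytes().hex())  # c3a02cc3b10a
```

In the output, "à" goes from the single byte 0xE0 (224 decimal) to the two bytes 0xC3 0xA0, and "ñ" from 0xF1 to 0xC3 0xB1. If the file is left as LATIN1 instead, the alternative given upthread still applies: set client_encoding to LATIN1 before running COPY.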