Re: Maintaining accents with "COPY" ? - Mailing list pgsql-general

From Erik Wienhold
Subject Re: Maintaining accents with "COPY" ?
Date
Msg-id 564260964.222854.1685018238693@office.mailbox.org
In response to Re: Maintaining accents with "COPY" ?  (Laura Smith <n5d9xq3ti233xiyif2vp@protonmail.ch>)
List pgsql-general
> On 25/05/2023 12:08 CEST Laura Smith <n5d9xq3ti233xiyif2vp@protonmail.ch> wrote:
>
> > Looks like an encoding issue and a mismatch between database encoding and
> > client encoding. You can check both with:
> >
> > SHOW server_encoding;
> > SHOW client_encoding;
> >
> > Then either set the client encoding or use COPY's encoding option to match
> > the database encoding (I assume utf8 in this example):
> >
> > SET client_encoding = 'utf8';
> > COPY (...) TO '/tmp/bar.csv' DELIMITER ',' CSV HEADER ENCODING 'utf8';
>
> Hi Erik,
>
> Looks like you could well be right about encoding:
>
> postgres=# SHOW server_encoding;
>  server_encoding
> -----------------
>  UTF8
> (1 row)
>
> postgres=# SHOW client_encoding;
>  client_encoding
> -----------------
>  SQL_ASCII
> (1 row)
>
> I will try your suggestion...

The client encoding is not the problem here.  With client_encoding set to
SQL_ASCII, no conversion is performed, so the client effectively gets the data
in the server encoding (UTF8 in your case).  SQL_ASCII basically means
uninterpreted bytes.

From https://www.postgresql.org/docs/15/multibyte.html#id-1.6.11.5.7:

"If the client character set is defined as SQL_ASCII, encoding conversion is
 disabled, regardless of the server's character set. (However, if the server's
 character set is not SQL_ASCII, the server will still check that incoming data
 is valid for that encoding; so the net effect is as though the client character
 set were the same as the server's.) Just as for the server, use of SQL_ASCII is
 unwise unless you are working with all-ASCII data."
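
To make "uninterpreted bytes" concrete, here is a minimal Python sketch (not
PostgreSQL code, and not taken from this thread).  It assumes a UTF8 server
that hands its bytes to the client unchanged, and uses a made-up sample string.
The Latin-1 reader is only an illustration of what a mismatched reader does
with those bytes, not a diagnosis of the export in question.

```python
# Hypothetical sample value; any accented text behaves the same way.
text = "Éric Müller, café"

# Assumption: this models what a UTF8 server emits when no client-side
# conversion happens (client_encoding = SQL_ASCII).
raw = text.encode("utf-8")

# A reader that treats the bytes as UTF-8 recovers the original text.
as_utf8 = raw.decode("utf-8")

# A reader that treats the same bytes as Latin-1 shows garbled accents,
# because each accented character occupies two bytes in UTF-8.
as_latin1 = raw.decode("latin-1")

print(as_utf8)
print(repr(as_latin1))
```

The bytes are identical in both cases; only the interpretation on the reading
side differs.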

--
Erik


