Re: BUG #17142: COPY ignores client_encoding for octal digit characters - Mailing list pgsql-bugs

From Heikki Linnakangas
Subject Re: BUG #17142: COPY ignores client_encoding for octal digit characters
Msg-id d06c9ac0-1e22-7247-8b98-3c13d550d43d@iki.fi
In response to BUG #17142: COPY ignores client_encoding for octal digit characters  (PG Bug reporting form <noreply@postgresql.org>)
Responses Re: BUG #17142: COPY ignores client_encoding for octal digit characters  (vilarion@illarion.org)
List pgsql-bugs
On 12/08/2021 00:24, PG Bug reporting form wrote:
> Characters in octal digits should be possible as per
> https://www.postgresql.org/docs/13/sql-copy.html
> When using characters directly (char buffer[] = "\304\366\337") the expected
> output is displayed.
> 
> My apologies if I misunderstood something.

The code is pretty clear that the \123 and \x12 escapes are evaluated 
after encoding conversion. That means the escapes are interpreted in 
the database encoding, regardless of the client encoding. The documentation 
doesn't say anything about that, though, so we should fix the docs. How 
does the attached patch look?
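
To illustrate the ordering, here is a rough Python sketch, not the actual server code. The function names are made up for this example, only octal escapes are handled, and escaped backslashes are ignored. It assumes client_encoding is LATIN1 and the database encoding is UTF8, which appears to match the report.

```python
import re

# Hypothetical helper: replace \NNN octal escapes with the raw byte they name.
# Simplified on purpose: no \xNN, no escaped backslashes, no other escapes.
def unescape_octal(data: bytes) -> bytes:
    return re.sub(
        rb"\\([0-7]{1,3})",
        lambda m: bytes([int(m.group(1), 8) & 0xFF]),
        data,
    )

# Hypothetical model of the order of operations for one COPY text field:
# 1. convert the incoming bytes from the client encoding to the database encoding
# 2. only then evaluate the backslash escapes
def simulate_copy_field(wire: bytes, client_encoding: str, database_encoding: str) -> bytes:
    converted = wire.decode(client_encoding).encode(database_encoding)
    return unescape_octal(converted)

# Raw LATIN1 bytes go through the conversion and arrive as valid UTF-8.
raw = simulate_copy_field(b"\xc4\xf6\xdf", "latin-1", "utf-8")

# The same bytes written as octal escapes are plain ASCII during the
# conversion, so they are inserted afterwards as-is, in the database encoding.
escaped = simulate_copy_field(rb"\304\366\337", "latin-1", "utf-8")

print(raw)      # converted to UTF-8
print(escaped)  # untouched bytes, not a valid UTF-8 sequence
```

In this model the escaped form only works if the escapes spell out the bytes of the character in the database encoding, e.g. \303\204 for U+00C4 in a UTF8 database.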

You could get weird results if you use the escapes for only some of the 
bytes in a multi-byte character. Mostly you'd get invalid byte sequence 
errors, but I think with the right combination of client and database 
encodings, it could get stranger. I think the wording in the attached 
docs patch is enough to cover that, though.

- Heikki
