[PATCH] Reject ENCODING option for COPY TO FORMAT JSON - Mailing list pgsql-hackers

From Ayush Tiwari
Subject [PATCH] Reject ENCODING option for COPY TO FORMAT JSON
Msg-id CAJTYsWVTarpEyFyx=ivXO88ORcAM35FNwgO35nnNSNGdEhN1aw@mail.gmail.com
List pgsql-hackers

Hi hackers,

COPY TO FORMAT JSON silently accepts the ENCODING option but doesn't
perform encoding conversion.  CopyToJsonOneRow() sends the output of
composite_to_json() via CopySendData() without calling
pg_server_to_any(), unlike the text and CSV paths.

  COPY t TO '/tmp/out.json' WITH (FORMAT json, ENCODING 'LATIN1');

On a UTF-8 server this produces UTF-8 output, not LATIN1.
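
To make the mismatch concrete, here is a small sketch outside the
server.  Python's codecs stand in for the conversion that
pg_server_to_any() would do on the text and CSV paths; the row content
is made up for illustration and is not taken from the patch or its
tests.

```python
# Illustration only: Python's codecs stand in for server-side conversion.
# Hypothetical row, roughly as composite_to_json() might render it.
row_json = '{"name":"café"}'

utf8_bytes = row_json.encode("utf-8")      # what a UTF-8 server writes today
latin1_bytes = row_json.encode("latin-1")  # what ENCODING 'LATIN1' asks for

# The two encodings disagree on every non-ASCII character:
# 'é' is two bytes (0xC3 0xA9) in UTF-8 and one byte (0xE9) in LATIN1.
differs = utf8_bytes != latin1_bytes

# A consumer that trusts the requested encoding misreads the file.
misread = utf8_bytes.decode("latin-1")
```

A reader that opens the file as LATIN1, as the COPY command implied,
sees mojibake in place of every non-ASCII character.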

RFC 8259 requires UTF-8 for JSON text exchanged outside a closed
ecosystem, so arguably JSON output should never be converted.  But even
under that interpretation, silently accepting the option and ignoring it
looks wrong: the user explicitly asked for LATIN1 and got something
else.  The same issue
also affects COPY TO STDOUT when client_encoding differs from the
server encoding, since the default file_encoding is the client
encoding and CopyToJsonOneRow() never checks need_transcoding.

The attached patch rejects the explicit ENCODING option for JSON
mode, consistent with how DELIMITER, NULL, DEFAULT, and HEADER are
already rejected.  The implicit client_encoding case is a separate
design question (should COPY TO JSON always emit UTF-8 regardless
of client_encoding?) that we should probably address separately, not
as part of v19.
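
The shape of the check can be modeled in a few lines.  This is a
hypothetical sketch of the validation rule, not the patch itself: the
function name, the option container, and the error wording are all
assumptions made for illustration.

```python
# Hypothetical model of COPY option validation; not PostgreSQL source.
# Options named in this thread as rejected in JSON mode, plus ENCODING.
JSON_REJECTED = ("DELIMITER", "NULL", "DEFAULT", "HEADER", "ENCODING")


def check_copy_options(fmt, options):
    """Raise ValueError if a JSON-mode COPY names an unsupported option.

    fmt is the FORMAT value; options maps explicitly given option names
    to their values.  Other formats pass through unchanged.
    """
    if fmt.lower() != "json":
        return
    for name in options:
        if name.upper() in JSON_REJECTED:
            # Error wording is an assumption, not the server's message.
            raise ValueError(f"cannot specify {name.upper()} in JSON mode")
```

Only an explicitly given ENCODING is rejected in this model; the
default taken from client_encoding never appears in the option list, so
the implicit case is left untouched.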

The behavior was introduced by 7dadd38cda9 (json format for COPY TO).
Thoughts?
