Re: copying json data and backslashes - Mailing list pgsql-general
From | Alastair McKinley |
---|---|
Subject | Re: copying json data and backslashes |
Date | |
Msg-id | PAXPR02MB7600A3361EE070A23F226481E30D9@PAXPR02MB7600.eurprd02.prod.outlook.com Whole thread Raw |
In response to | Re: copying json data and backslashes ("pbj@cmicdo.com" <pbj@cmicdo.com>) |
Responses |
Re: copying json data and backslashes
("Peter J. Holzer" <hjp-pgsql@hjp.at>)
|
List | pgsql-general |
> From: pbj@cmicdo.com <pbj@cmicdo.com> > Sent: 22 November 2022 15:30 > To: Alastair McKinley <a.mckinley@analyticsengines.com>; pgsql-general@lists.postgresql.org <pgsql-general@lists.postgresql.org>;Erik Wienhold <ewie@ewie.name> > Subject: Re: copying json data and backslashes > > > > > On Tuesday, November 22, 2022 at 10:16:11 AM EST, Erik Wienhold <ewie@ewie.name> wrote: > > > > > > > On 22/11/2022 15:23 CET Alastair McKinley <a.mckinley@analyticsengines.com> wrote: > > > > > > Hi all, > > > > > > I have come across this apparently common issue COPY-ing json and wondering if > > > there is potentially a better solution. > > > > > > I am copying data into a jsonb column originating from a 3rd party API. The > > > data may have literal \r,\t,\n and also double backslashes. > > > > > > I discovered that I can cast this data to a jsonb value directly but I can't > > > COPY the data without pre-processing. > > > > > > > Is there any other solution with COPY that doesn't require manual > > > implementation of search/replace to handle these edge cases? > > > Why does ::jsonb work but COPY doesn't? It seems a bit inconsistent. > > > > COPY handles special backslash sequences[1]. The \r in your sample JSON, > > although properly escaped according to JSON, is replaced with an actual > > carriage return by COPY before casting to jsonb. The error results from JSON > > prohibiting unescaped control characters in strings[2]. > > > > You must double escape to pass those characters through COPY. > > > > See how COPY outputs backslash sequences: > > > > -- Actual carriage return: > > copy (select e'\r') to stdout; > > \r > > > > -- Backslash sequence for carriage return: > > copy (select '\r') to stdout; > > > > \\r > > I have been able to get around this problem by using the following method: > > \copy footable from 'input.json' (format csv, escape '^B', delimieter '^C', quote '^E') > > where the control characters are the actual control char, not the caret-letter, and it requires no escaping escapes. Irealize this won't work for all > situations. > > PJ Hi PJ, Thanks for the suggestion, this is interesting to me to try but I am not quite sure how this works. As far as I understand, escape/quote/delimiter have to be a single character, and CTRL-C etc. are multiple characters. What way do you input each of the escape/quote/delimiter characters? Best regards, Alastair > > > > > [1] > > https://www.postgresql.org/docs/current/sql-copy.html#id-1.9.3.55.9.2 > > [2] https://www.json.org/json-en.html > > > > -- > > Erik >
pgsql-general by date: