Home > mailing lists

copying json data and backslashes - Mailing list pgsql-general

From	Alastair McKinley
Subject	copying json data and backslashes
Date	November 22, 2022 14:23:16
Msg-id	PAXPR02MB7600EDF4AA4898094EBD8B10E30D9@PAXPR02MB7600.eurprd02.prod.outlook.com Whole thread Raw
Responses	Re: copying json data and backslashes
List	pgsql-general

Tree view

Hi all,

I have come across this apparently common issue COPY-ing json and wondering if there is potentially a better solution.

I am copying data into a jsonb column originating from a 3^rd party API. The data may have literal \r,\t,\n and also double backslashes.

I discovered that I can cast this data to a jsonb value directly but I can't COPY the data without pre-processing.

The example below illustrates my issue (only with \r, but the problem extends to other \X combinations).

do $$
lines=[r'{"test" : "\r this data has a carriage return"}']

with open("/tmp/test1.json","w") as f:
for line in lines:
f.write(line.strip() + "\n")

$$ language plpython3u;

create temp table testing (data jsonb);

-- this works
insert into testing (data)
select l::jsonb
from pg_read_file('/tmp/test1.json') f,
lateral regexp_split_to_table(f,'\n') l where l <> '';

-- fails
copy testing (data) from '/tmp/test1.json';

-- works
copy testing (data) from program $c$ sed -e 's/\\r/\\\\u000a/g' /tmp/test1.json $c$;

Is there any other solution with COPY that doesn't require manual implementation of search/replace to handle these edge cases?

Why does ::jsonb work but COPY doesn't? It seems a bit inconsistent.

Best regards,

Alastair

pgsql-general by date:

From: Laurenz Albe
Date: 22 November 2022, 08:43:04
Subject: Re: system variable can be edited by all user?

From: Erik Wienhold
Date: 22 November 2022, 15:15:57
Subject: Re: copying json data and backslashes

copying json data and backslashes - Mailing list pgsql-general

Previous

Next