Re: COPY TO STDOUT WITH (FORMAT CSV, HEADER), and embedded newlines - Mailing list pgsql-general

From Francisco Olarte
Subject Re: COPY TO STDOUT WITH (FORMAT CSV, HEADER), and embedded newlines
Msg-id CA+bJJbxZGr6EJwEzhuk8krWJ0T5isCzrv5u1jxWsRupvmwY-iw@mail.gmail.com
In response to Re: COPY TO STDOUT WITH (FORMAT CSV, HEADER), and embedded newlines  (Dominique Devienne <ddevienne@gmail.com>)
List pgsql-general
Dominique:

On Fri, 11 Mar 2022 at 21:13, Dominique Devienne <ddevienne@gmail.com> wrote:
> But sure, if TEXT does the kind of pseudo-CSV I need, I'd change it to use it.

Text, the original format for COPY, is much easier to handle than CSV.
You can split the whole input on newlines to get records, split each
record on tabs to get fields, then unescape each field.
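
A minimal sketch of that split/split/unescape sequence, assuming C++17
and input already read into memory. The names (split, unescape,
parse_copy_text) are made up for illustration, and the \digits octal
and \xdigits hex escapes that COPY accepts on input are not decoded
here, only the single-letter ones:

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

using Field = std::optional<std::string>;  // nullopt stands for SQL NULL
using Record = std::vector<Field>;

// Split `s` on `sep`, keeping empty pieces ("a\t\tb" gives three pieces).
std::vector<std::string_view> split(std::string_view s, char sep) {
    std::vector<std::string_view> out;
    std::size_t start = 0;
    for (;;) {
        assert(start <= s.size());
        std::size_t pos = s.find(sep, start);
        if (pos == std::string_view::npos) {
            out.push_back(s.substr(start));
            return out;
        }
        out.push_back(s.substr(start, pos - start));
        start = pos + 1;
    }
}

// Decode one escaped field; a field that is exactly \N is NULL.
Field unescape(std::string_view f) {
    if (f == "\\N") return std::nullopt;
    std::string out;
    out.reserve(f.size());
    for (std::size_t i = 0; i < f.size(); ++i) {
        if (f[i] != '\\' || i + 1 == f.size()) { out += f[i]; continue; }
        switch (f[++i]) {
            case 'b': out += '\b'; break;
            case 'f': out += '\f'; break;
            case 'n': out += '\n'; break;
            case 'r': out += '\r'; break;
            case 't': out += '\t'; break;
            case 'v': out += '\v'; break;
            default:  out += f[i]; break;  // "\\" and any other char: literal
        }
    }
    return out;
}

// Records are newline-terminated, fields are tab-separated.
std::vector<Record> parse_copy_text(std::string_view data) {
    std::vector<Record> records;
    if (data.empty()) return records;
    // Drop the final terminator so it does not yield a phantom record.
    if (data.back() == '\n') data.remove_suffix(1);
    for (std::string_view line : split(data, '\n')) {
        Record rec;
        for (std::string_view f : split(line, '\t')) rec.push_back(unescape(f));
        records.push_back(std::move(rec));
    }
    return records;
}
```

Splitting before unescaping is safe because literal tabs and newlines
inside the data are always escaped, so a raw tab or newline can only be
a separator.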

In C++ you can easily read it a char at a time and build along the way
or, if you have a whole line, unescape it in place and build a
vector<char*> pointing into the buffer. If you are testing, the
split-on-newline/split-on-tab approach gives you a list of escaped
strings that are easily compared to escaped patterns.
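
The in-place variant could look roughly like the sketch below. It
relies on the decoded text never being longer than the escaped text, so
the write cursor never overtakes the read cursor. The function name is
hypothetical, a NULL field (\N) is reported as a nullptr entry, which is
my choice and not something the format dictates, and octal/hex escapes
are again left out:

```cpp
#include <cassert>
#include <vector>

// Map the letter after a backslash to the character it stands for.
static char decode_escape(char c) {
    switch (c) {
        case 'b': return '\b';
        case 'f': return '\f';
        case 'n': return '\n';
        case 'r': return '\r';
        case 't': return '\t';
        case 'v': return '\v';
        default:  return c;  // "\\" and anything else: taken literally
    }
}

// Unescape one line in place. `line` must be a writable, NUL-terminated
// buffer holding a single record (a trailing '\n' is allowed). Returns
// pointers to the NUL-terminated fields inside that same buffer.
std::vector<char*> split_unescape_in_place(char* line) {
    assert(line != nullptr);
    std::vector<char*> fields;
    char* wr = line;       // write cursor, always <= rd
    bool at_start = true;  // true while no char of the field was read
    fields.push_back(wr);
    for (char* rd = line; *rd != '\0' && *rd != '\n'; ++rd) {
        if (*rd == '\t') {             // field separator
            *wr++ = '\0';
            fields.push_back(wr);
            at_start = true;
            continue;
        }
        if (at_start && rd[0] == '\\' && rd[1] == 'N' &&
            (rd[2] == '\t' || rd[2] == '\n' || rd[2] == '\0')) {
            fields.back() = nullptr;   // whole field is \N, i.e. NULL
            ++rd;                      // skip the 'N'
            at_start = false;
            continue;
        }
        at_start = false;
        if (*rd == '\\' && rd[1] != '\0' && rd[1] != '\n') {
            *wr++ = decode_escape(*++rd);
        } else {
            *wr++ = *rd;
        }
    }
    *wr = '\0';
    return fields;
}
```

The pointers stay valid only as long as the buffer does, and decoded
fields cannot contain NUL bytes (which PostgreSQL text values cannot
hold anyway).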

I've never had problems with it in decades, and in fact I use an
extension of it, with a \E code (similar to the \N trick for nulls)
for 0-element lines. Those are not useful in db dumps, but I need
them: "\n" decodes to {""} and I need to express {}, which I emit as
"\\E\n". It is an old problem: "join by tabs, join by newlines" makes
things "prettier" but can lead to no final newline and no way to
express empty sets, while "terminate with tabs, terminate with
newlines" leads to uglier, harder to read lines but can express them.
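
To make the \E idea concrete, here is a small sketch of an encoder and
decoder for one line. This is only an illustration of the scheme as
described above, not the actual code: encode_line and decode_line are
made-up names, only tab, newline, carriage return and backslash are
escaped, and \N nulls are left out to keep it short:

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Encode a list of strings as one tab-separated, newline-terminated
// line. The empty list is written as \E so it differs from {""}.
std::string encode_line(const std::vector<std::string>& fields) {
    if (fields.empty()) return "\\E\n";
    std::string out;
    for (std::size_t i = 0; i < fields.size(); ++i) {
        if (i != 0) out += '\t';
        for (char c : fields[i]) {
            switch (c) {
                case '\t': out += "\\t";  break;
                case '\n': out += "\\n";  break;
                case '\r': out += "\\r";  break;
                case '\\': out += "\\\\"; break;
                default:   out += c;      break;
            }
        }
    }
    out += '\n';
    return out;
}

// Decode one line produced by encode_line.
std::vector<std::string> decode_line(std::string_view line) {
    if (!line.empty() && line.back() == '\n') line.remove_suffix(1);
    if (line == "\\E") return {};            // the 0-element line
    std::vector<std::string> fields(1);      // "" decodes to {""}
    for (std::size_t i = 0; i < line.size(); ++i) {
        char c = line[i];
        if (c == '\t') {
            fields.emplace_back();
        } else if (c == '\\' && i + 1 < line.size()) {
            char e = line[++i];
            fields.back() += (e == 'n') ? '\n'
                           : (e == 't') ? '\t'
                           : (e == 'r') ? '\r'
                           : e;              // "\\" and others: literal
        } else {
            fields.back() += c;
        }
    }
    return fields;
}
```

A field whose content is literally a backslash followed by E is written
with a doubled backslash, so it cannot be mistaken for the marker.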

Francisco Olarte.


