Re: raw output from copy - Mailing list pgsql-hackers

From	Pavel Stehule
Subject	Re: raw output from copy
Date	March 31, 2016 06:13:13
Msg-id	CAFj8pRCUa8QMKmqbfVmsE5zDyH9rdSVL0i=Hku9+nRUVqsYK8w@mail.gmail.com
In response to	Re: raw output from copy (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: raw output from copy
List	pgsql-hackers

2016-03-29 20:59 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:

Pavel Stehule <pavel.stehule@gmail.com> writes:
> I am writing a few lines as a summary:

> 1. invent RAW_TEXT and RAW_BINARY
> 2. for RAW_BINARY: PQbinaryTuples() returns 1 and PQfformat() returns 1
> 3.a for RAW_TEXT: PQbinaryTuples() returns 0 and PQfformat() returns 0, but
> the client should check PQcopyFormat() so that it does not print "\n" at the end
> 3.b for RAW_TEXT: PQbinaryTuples() returns 1 and PQfformat() returns 1, but
> the output function is used; no client modification is necessary
> 4. PQcopyFormat() returns 0 for text, 1 for binary, 2 for RAW_TEXT, 3 for
> RAW_BINARY
> 5. create tests for ecpg

3.b certainly seems completely wrong. PQfformat==1 would imply binary
data.

I suggest that PQcopyFormat should be understood as defining the format
of the copy data encapsulation, not the individual fields. So it would go
like 0 = traditional text format, 1 = traditional binary format, 2 = raw
(no encapsulation). You'd need to also look at PQfformat to distinguish
raw text from raw binary. But if we do it as you suggest above, we've
locked ourselves into only ever having two field format codes, which
is something the existing design is specifically intended to allow
expansion in.
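
The split suggested above can be sketched roughly as follows. This is only an illustration: PQcopyFormat() is the function proposed in this thread and does not exist in libpq, and the enum and function names below are made up for the example. The only existing behavior assumed is that PQfformat() reports 0 for text and 1 for binary, with other codes reserved.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical return codes of the proposed PQcopyFormat(): they describe
 * the encapsulation of the copy data, not the format of the fields. */
enum copy_encapsulation { COPY_TEXT = 0, COPY_BINARY = 1, COPY_RAW = 2 };

/* Per-field format codes as PQfformat() reports them: 0 text, 1 binary. */
enum field_format { FIELD_TEXT = 0, FIELD_BINARY = 1 };

/* Combine both codes into a description of what the client receives. */
const char *
describe_copy(int copy_format, int field_format)
{
    switch (copy_format) {
    case COPY_TEXT:
        return "traditional text";
    case COPY_BINARY:
        return "traditional binary";
    case COPY_RAW:
        /* Raw has no encapsulation, so the field format decides. */
        if (field_format == FIELD_TEXT)
            return "raw text";
        if (field_format == FIELD_BINARY)
            return "raw binary";
        /* A field format code added later still fits this scheme. */
        return "raw, unrecognized field format";
    default:
        return "unrecognized copy format";
    }
}

/* Small self-check showing the intended use. */
void
check_describe_copy(void)
{
    assert(strcmp(describe_copy(COPY_RAW, FIELD_TEXT), "raw text") == 0);
    assert(strcmp(describe_copy(COPY_RAW, FIELD_BINARY), "raw binary") == 0);
}
```

The point of keeping the two codes separate is that a new field format code would not require a new copy format code.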

I have written a concept of the raw_text and raw_binary modes.

I tried to pass the text data the same way the text format does, but that is not practical for RAW_TEXT. Text passing is designed for single-line data; for multiline data it enforces escaping, which we do not want in RAW mode. I would have to skip the escaping, and the code is not nice.

So I propose a different scheme: RAW_TEXT uses text values (it goes through the input/output functions) and enforces encoding conversion from/to the client encoding, but the data are passed to the client in binary mode, so I don't need to read the content line by line. PQbinaryTuples() returns 1 for both RAW_TEXT and RAW_BINARY; in these cases the data are passed as one binary value. PQfformat() returns 2 for RAW_TEXT and 3 for RAW_BINARY.
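
As a rough sketch of what a client would observe under this proposal (all identifiers below are illustrative and none of them exist in libpq; the values for the two traditional modes are assumed to stay at the usual 0/0 and 1/1):

```c
#include <assert.h>

/* Illustrative mode names only; not part of libpq or the patch. */
enum copy_mode { MODE_TEXT, MODE_BINARY, MODE_RAW_TEXT, MODE_RAW_BINARY };

/* What a client would see for one COPY result under the proposal. */
struct copy_report {
    int binary_tuples;  /* value PQbinaryTuples() would return */
    int fformat;        /* value PQfformat() would return */
};

struct copy_report
report_for_mode(enum copy_mode mode)
{
    struct copy_report r = {0, 0};  /* traditional text: 0 / 0 */

    assert(mode >= MODE_TEXT && mode <= MODE_RAW_BINARY);

    switch (mode) {
    case MODE_TEXT:
        break;
    case MODE_BINARY:
        r.binary_tuples = 1;
        r.fformat = 1;
        break;
    case MODE_RAW_TEXT:
        r.binary_tuples = 1;  /* passed to the client in binary mode */
        r.fformat = 2;        /* but the value itself is text */
        break;
    case MODE_RAW_BINARY:
        r.binary_tuples = 1;
        r.fformat = 3;
        break;
    }
    return r;
}

/* Raw modes deliver the data as one value, so no line-by-line reading. */
int
is_single_value(struct copy_report r)
{
    return r.binary_tuples == 1 && (r.fformat == 2 || r.fformat == 3);
}

/* Of the two raw modes, only RAW_TEXT is subject to encoding conversion. */
int
raw_needs_encoding_conversion(struct copy_report r)
{
    return r.fformat == 2;
}
```

With this mapping a client can tell the raw modes apart from PQfformat() alone, at the cost of giving field format codes 2 and 3 a copy-specific meaning.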

Any objections to this design?

Regards

Pavel

regards, tom lane

Attachment

copy-raw-format-20160331-04.patch
