Re: Fixing backslash dot for COPY FROM...CSV - Mailing list pgsql-hackers
From | Sutou Kouhei |
---|---|
Subject | Re: Fixing backslash dot for COPY FROM...CSV |
Date | |
Msg-id | 20240719.151031.622666269892202885.kou@clear-code.com Whole thread Raw |
In response to | Re: Fixing backslash dot for COPY FROM...CSV (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Fixing backslash dot for COPY FROM...CSV
Re: Fixing backslash dot for COPY FROM...CSV |
List | pgsql-hackers |
Hi, I'm reviewing patches in Commitfest 2024-07 from top to bottom: https://commitfest.postgresql.org/48/ This is the 4th patch: https://commitfest.postgresql.org/48/4710/ FYI: https://commitfest.postgresql.org/48/4681/ is my patch. In <2726138.1712462833@sss.pgh.pa.us> "Re: Fixing backslash dot for COPY FROM...CSV" on Sun, 07 Apr 2024 00:07:13 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Here's the problem: if some client-side code thinks it's okay to > quote "." as "\.", what is likely to happen when it's presented > a data value consisting only of "."? It could very easily fall > into the trap of producing an end-of-data marker. > > If we get rid of the whole concept of end-of-data markers, then > it'd be totally reasonable to accept "\." as ".". But as long > as we still have end-of-data markers, I think it's unwise to allow > "\." to appear as anything but an end-of-data marker. It'd just > add camouflage to the booby trap. I read through this thread. It seems that this thread discuss 2 things: 1. \. in CSV mode 2. \. in non-CSV mode Recent messages discussed mainly 2. but how about create a separated thread for 2.? Because the original mail focused on 1. and it seems that we can handle them separately. Here are comments for the latest v4 patch: ---- diff --git a/src/bin/psql/copy.c b/src/bin/psql/copy.c index 961ae32694..a39818b193 100644 --- a/src/bin/psql/copy.c +++ b/src/bin/psql/copy.c @@ -620,20 +620,32 @@ handleCopyIn(PGconn *conn, FILE *copystream, bool isbinary, PGresult **res) ... + if (copystream == pset.cur_cmd_source) + { + *fgresult='\0'; + buflen -= linelen; ---- Spaces around "=" are missing for the fgresult line. BTW, here is a diff after pgindent: ---- diff --git a/src/bin/psql/copy.c b/src/bin/psql/copy.c index a39818b1933..4ee8481998a 100644 --- a/src/bin/psql/copy.c +++ b/src/bin/psql/copy.c @@ -621,13 +621,13 @@ handleCopyIn(PGconn *conn, FILE *copystream, bool isbinary, PGresult **res) if (buf[buflen - 1] == '\n') { /* - * When at the beginning of the line, check for EOF marker. - * If the marker is found and the data is inlined, + * When at the beginning of the line, check for EOF + * marker. If the marker is found and the data is inlined, * we must stop at this point. If not, the \. line can be - * sent to the server, and we let it decide whether - * it's an EOF or not depending on the format: in - * basic TEXT, \. is going to be interpreted as an EOF, in - * CSV, it will not. + * sent to the server, and we let it decide whether it's + * an EOF or not depending on the format: in basic TEXT, + * \. is going to be interpreted as an EOF, in CSV, it + * will not. */ if (at_line_begin && copystream == pset.cur_cmd_source) { @@ -635,15 +635,16 @@ handleCopyIn(PGconn *conn, FILE *copystream, bool isbinary, PGresult **res) (linelen == 4 && memcmp(fgresult, "\\.\r\n", 4) == 0)) { copydone = true; + /* - * Remove the EOF marker from the data sent. - * In the case of CSV, the EOF marker must be + * Remove the EOF marker from the data sent. In + * the case of CSV, the EOF marker must be * removed, otherwise it would be interpreted by * the server as valid data. */ if (copystream == pset.cur_cmd_source) { - *fgresult='\0'; + *fgresult = '\0'; buflen -= linelen; } } ---- I also confirmed that the updated server and non-updated psql compatibility problem (the end-of-data "\." is inserted). It seems that it's difficult to solve without introducing incompatibility. How about introducing a new COPY option that controls whether "\." is ignored or not instead of this approach? Here is a migration idea: 1. Add the new COPY option to v18 2. psql detects whether a server has the new COPY option 2.1. If it's available, psql uses the option and doesn't send "\." 2.2. If it's not available, psql doesn't use the option and sends "\." 3. psql always doesn't sends "\." after v17 or earlier reach EOL Thanks, -- kou
pgsql-hackers by date: