Re: [BUG] psql: Make \copy from 'text' and 'csv' formats fail on NUL bytes - Mailing list pgsql-hackers
From | Joel Jacobson |
---|---|
Subject | Re: [BUG] psql: Make \copy from 'text' and 'csv' formats fail on NUL bytes |
Date | |
Msg-id | 327d8dac-2321-4147-909f-6f18474ff4e1@app.fastmail.com Whole thread Raw |
In response to | Re: [BUG] psql: Make \copy from 'text' and 'csv' formats fail on NUL bytes (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
On Sun, Nov 10, 2024, at 23:14, Tom Lane wrote: > "Joel Jacobson" <joel@compiler.org> writes: >> On Sun, Nov 10, 2024, at 22:37, Tom Lane wrote: >>> That seems like a hack, as it also changes the behavior w.r.t. >>> prompts and EOF-mark detection, neither for the better. > >> Hmm, in what way does it change the behavior? >> I ran almost all the tests, including the TAP ones, and none of them failed with >> this fix. Is there some behavior that is currently not covered by the test suite? > > Of course. (Our regression tests are very far from covering 100% of > the behavioral space.) > > In the case at hand, this would break the > expected line-by-line prompting for "\copy ... from '/dev/tty'". Yes, of course, just wondered what kind of behavior in this area that wasn't tested, thanks for explaining it. > Playing with this just now, I notice that the prompt you get > still claims that \. works to end input, although right now > it does not: > > regression=# \copy int8_tbl from '/dev/tty' > Enter data to be copied followed by a newline. > End with a backslash and a period on a line by itself, or an EOF signal. >>> 1 3 >>> 5 6 >>> \. >>> ^D > COPY 2 > > I'm inclined to think that the prompt still describes what should > happen, at least in non-binary mode, although in binary mode we > probably ought to just say (and do) "End with an EOF signal". > > So perhaps the if-test to choose the code path could be > > if ((isbinary || copystream != pset.cur_cmd_source) && !showprompt) Thanks for guidance, that seems to fix, for 6/8 cases I've figured out how to test. > which would allow dropping the vestigial prompt logic in the first > path, I assume you mean that due to "&& !showprompt" the "if (showprompt)" becomes unreachable and can therefore be dropped? > and we would also need to change the test in the second path > that decides if we should check for \. (Likely this should be > refactored a bit to make it more understandable. An intermediate > flag saying whether we intend to check for \. might help.) Maybe check_dot_command? const bool check_dot_command = (copystream == pset.cur_cmd_source); I haven't tried yet to refactor the code, except than replacing the two "copystream == pset.cur_cmd_source" occurrences with the new check_dot_command flag. First want to understand if the two remaining cases are valid, and if they can be tested: > Anyway, my point is that we need to think through the desirable > behavior for each possible combination of showprompt, isbinary, and > copystream != pset.cur_cmd_source, because all 8 cases are reachable. I guess these are the 8 cases? +--------+-------------+----------+------------------+ | CASE | showprompt | isbinary | check_dot_command | +--------+-------------+----------+------------------+ | 1 | false | false | false | | 2 | false | false | true | | 3 | false | true | false | | 4 | false | true | true | | 5 | true | false | false | | 6 | true | false | true | | 7* | true | true | false | | 8* | true | true | true | +--------+-------------+----------+------------------+ * Cases 7 and 8 not tested yet With the changed if-test, case 1-6 works, and for case 1, then binary mode branch is taken instead of the text mode branch, whereas cases 2-6 take the same branch as before. joel@Joels-MBP psql_tester % git diff --no-index -U100 /tmp/psql.log.HEAD /tmp/psql.log diff --git a/tmp/psql.log.HEAD b/tmp/psql.log index 5e44e30..1f48ac9 100644 --- a/tmp/psql.log.HEAD +++ b/tmp/psql.log @@ -1,6 +1,6 @@ -COPY case: 1 TEXT MODE +COPY case: 1 BINARY MODE COPY case: 2 TEXT MODE COPY case: 3 BINARY MODE COPY case: 4 BINARY MODE COPY case: 5 TEXT MODE COPY case: 6 TEXT MODE Here is how I tested each case: # CASE 1: # showprompt: false # isbinary: false # check_dot_command: false psql -c "\copy int8_tbl from '/tmp/int8_tbl.data'" # CASE 2: # showprompt: false # isbinary: false # check_dot_command: true psql -f /tmp/copy_stdin_text.sql # CASE 3: # showprompt: false # isbinary: true # check_dot_command: false psql -c "\copy int8_tbl from '/tmp/int8_tbl.bin' (format binary)" # CASE 4: # showprompt: false # isbinary: true # check_dot_command: true printf '\\copy int8_tbl from stdin (format binary)\n' >/tmp/copy_stdin_binary.sql cat /tmp/int8_tbl.bin >>/tmp/copy_stdin_binary.sql psql -f copy_stdin_binary.sql # CASE 5: # showprompt: true # isbinary: false # check_dot_command: false psql \copy int8_tbl from '/dev/tty' 17 18 19 20 \. # Send EOF (Ctrl+D) # CASE 6: # showprompt: true # isbinary: false # check_dot_command: true psql \copy int8_tbl from stdin 21 22 23 24 \. # CASE 7: # showprompt: true # isbinary: true # check_dot_command: false CASE 7 would be like CASE 5, with the addition of (format binary) that is: \copy int8_tbl from '/dev/tty' (format binary) but I wonder if this is a valid case? Could we really copy/paste or by some other means give psql binary data here? I tried to wrap psql and send the binary content of my '/tmp/int8_tbl.bin' file, and tried sending the EOF control code, and I see I get PGCOPY-message, but the txn isn't committed, so not sure what's happening. Before I continue trying to figure this one out, I just wanted to make sure this is a valid case, and how it is supposed to be used if so? # CASE 8: # showprompt: true # isbinary: true # check_dot_command: true CASE 8 would be like CASE 6, with the addition of (format binary) that is: \copy int8_tbl from stdin (format binary) I've had the same problem with this one, as with CASE 7. Many thanks for guidance. /Joel
pgsql-hackers by date: