Re: COPY FROM WHEN condition - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: COPY FROM WHEN condition |
Date | |
Msg-id | CAD21AoCHM6FU0aErWUq+WhWQqx89ppZmbJAR3B2w_HLmaiPLMA@mail.gmail.com Whole thread Raw |
In response to | Re: COPY FROM WHEN condition (Surafel Temesgen <surafel3000@gmail.com>) |
Responses |
Re: COPY FROM WHEN condition
|
List | pgsql-hackers |
On Tue, Oct 30, 2018 at 11:47 PM Surafel Temesgen <surafel3000@gmail.com> wrote: > > > Hi, > Thank you for looking at it . > On Sun, Oct 28, 2018 at 7:19 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: >> >> >> 1) I think this deserves at least some regression tests. Plenty of tests >> already use COPY, but there's no coverage for the new piece. So let's >> add a new test suite, or maybe add a couple of tests into copy2.sql. >> >> >> 2) In copy.sqml, the new item is defined like this >> >> <term><literal>WHEN Clause</literal></term> >> >> I suggest we use just <term><literal>WHEN</literal></term>, that's what >> the other items do (see ENCODING). >> >> The other thing is that this does not say what expressions are allowed >> in the WHEN clause. It seems pretty close to WHEN clause for triggers, >> which e.g. mentions that subselects are not allowed. I'm pretty sure >> that's true here too, because it fails like this (118 = T_SubLink): >> >> test=# copy t(a,b,c) from '/tmp/t.data' when ((select 1) < 10); >> ERROR: unrecognized node type: 118 >> >> So, the patch needs to detect this, produce a reasonable error message >> and document the limitations in copy.sqml, just like we do for CREATE >> TRIGGER. > > fixed >> >> >> 3) For COPY TO, the WHEN clause is accepted but ignored, leading to >> confusing cases like this: >> >> test=# copy t(a,b,c) to '/tmp/t.data' when ((select 100) < 10); >> COPY 151690 >> >> So, it contains subselect, but unlike COPY FROM it does not fail >> (because we never execute it). The fun part is that the expression is >> logically false, so a user might expect it to filter rows, yet we copy >> everything. >> >> IMHO we need to either error-out in these cases, complaining about WHEN >> not being supported for COPY TO, or make it work (effectively treating >> it as a simpler alternative to COPY (subselect) TO). > > English is not my first language but I chose error-out because WHEN condition for COPY TO seems to me semantically incorrect >> >> >> AFAICS we could just get rid of the extra when_cluase variable and mess >> with the cstate->whenClause directly, depending on how (3) gets fixed. > > I did it this way because CopyState structure memory allocate and initialize in BeginCopyFrom but the analysis done beforeit >> >> >> 5) As I mentioned, the CREATE TRIGGER already has WHEN clause, but it >> requires it to be 'WHEN (expr)'. I suggest we do the same thing here, >> requiring the parentheses. >> >> >> 6) The skip logic in CopyFrom() seems to be slightly wrong. It does >> work, but the next_record label is defined after CHECK_FOR_INTERRUPTS() >> so a COPY will not respond to Ctrl-C unless it finds a row matching the >> WHEN condition. If you have a highly selective condition, that's a bit >> inconvenient. >> >> It also skips >> >> MemoryContextSwitchTo(GetPerTupleMemoryContext(estate)); >> >> so I wonder what the heap_form_tuple() right after the next_record label >> will use for tuples right after a skipped one. I'd bet it'll use the >> oldcontext (essentially the long-lived context), essentially making it >> a memory leak. >> >> So I suggest to get rid of the next_record label, and use 'continue' >> instead of the 'goto next_record'. >> > fixed >> I've looked at this patch and tested. When I use a function returning string in WHEN clause I got the following error: =# copy test from '/tmp/aaa.csv' (format 'csv') when (lower(t) = 'hello'); ERROR: could not determine which collation to use for lower() function HINT: Use the COLLATE clause to set the collation explicitly. CONTEXT: COPY hoge, line 1: "1,hoge,2018-01-01" And then although I specified COLLATE I got an another error (127 = T_CollateExpr): =# copy test from '/tmp/aaa.csv' (format 'csv') when (lower(t) collate "en_US" = 'hello'); ERROR: unrecognized node type: 127 This error doesn't happen if I put the similar condition in WHEN clause for triggers. I think the patch needs to produce a reasonable error message. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
pgsql-hackers by date: