Re: COPY FROM WHEN condition - Mailing list pgsql-hackers

From Surafel Temesgen
Subject Re: COPY FROM WHEN condition
Date
Msg-id CALAY4q_9bs-e5Ypx6Q_HTrsWXG-znV4eb0hzJgxoOBB5UcP0yA@mail.gmail.com
Whole thread Raw
In response to Re: COPY FROM WHEN condition  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: COPY FROM WHEN condition  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers

Hi,
Thank you for looking at it .
On Sun, Oct 28, 2018 at 7:19 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: 

1) I think this deserves at least some regression tests. Plenty of tests
already use COPY, but there's no coverage for the new piece. So let's
add a new test suite, or maybe add a couple of tests into copy2.sql.


2) In copy.sqml, the new item is defined like this

    <term><literal>WHEN Clause</literal></term>

I suggest we use just <term><literal>WHEN</literal></term>, that's what
the other items do (see ENCODING).

The other thing is that this does not say what expressions are allowed
in the WHEN clause. It seems pretty close to WHEN clause for triggers,
which e.g. mentions that subselects are not allowed. I'm pretty sure
that's true here too, because it fails like this (118 = T_SubLink):

    test=# copy t(a,b,c) from '/tmp/t.data' when ((select 1) < 10);
    ERROR:  unrecognized node type: 118

So, the patch needs to detect this, produce a reasonable error message
and document the limitations in copy.sqml, just like we do for CREATE
TRIGGER.
fixed 

3) For COPY TO, the WHEN clause is accepted but ignored, leading to
confusing cases like this:

    test=# copy t(a,b,c) to '/tmp/t.data' when ((select 100) < 10);
    COPY 151690

So, it contains subselect, but unlike COPY FROM it does not fail
(because we never execute it). The fun part is that the expression is
logically false, so a user might expect it to filter rows, yet we copy
everything.

IMHO we need to either error-out in these cases, complaining about WHEN
not being supported for COPY TO, or make it work (effectively treating
it as a simpler alternative to COPY (subselect) TO).

English is not my first language but I chose error-out because WHEN condition for COPY TO seems to me semantically incorrect


AFAICS we could just get rid of the extra when_cluase variable and mess
with the cstate->whenClause directly, depending on how (3) gets fixed.

I did it this way because CopyState structure memory allocate and initialize in BeginCopyFrom but the analysis done before it


5) As I mentioned, the CREATE TRIGGER already has WHEN clause, but it
requires it to be 'WHEN (expr)'. I suggest we do the same thing here,
requiring the parentheses.


6) The skip logic in CopyFrom() seems to be slightly wrong. It does
work, but the next_record label is defined after CHECK_FOR_INTERRUPTS()
so a COPY will not respond to Ctrl-C unless it finds a row matching the
WHEN condition. If you have a highly selective condition, that's a bit
inconvenient.

It also skips

    MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));

so I wonder what the heap_form_tuple() right after the next_record label
will use for tuples right after a skipped one. I'd bet it'll use the
oldcontext (essentially the long-lived context), essentially making it
a memory leak.

So I suggest to get rid of the next_record label, and use 'continue'
instead of the 'goto next_record'.

fixed 
regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment

pgsql-hackers by date:

Previous
From: Sergei Kornilov
Date:
Subject: Re: [HACKERS] generated columns
Next
From: Peter Geoghegan
Date:
Subject: PSA: rr recorder/debugger works well with Postgres + Linux