On 11/23/18 12:14 PM, Surafel Temesgen wrote:
>
>
> On Sun, Nov 11, 2018 at 11:59 PM Tomas Vondra
> <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> wrote:
>
>
> So, what about using FILTER here? We already use it for aggregates when
> filtering rows to process.
>
> i think its good idea and describe its purpose more. Attache is a
> patch that use FILTER instead
Thanks, looks good to me. A couple of minor points:
1) While comparing this to the FILTER clause we already have for
aggregates, I've noticed the aggregate version is
FILTER '(' WHERE a_expr ')'
while here we have
FILTER '(' a_expr ')'
For a while I was thinking that maybe we should use the same syntax
here, but I don't think so. The WHERE bit seems rather unnecessary and
we probably implemented it only because it's required by SQL standard,
which does not apply to COPY. So I think this is fine.
2) The various parser checks emit errors like this:
case EXPR_KIND_COPY_FILTER:
err = _("cannot use subquery in copy from FILTER condition");
break;
I think the "copy from" should be capitalized, to make it clear that it
refers to a COPY FROM command and not a copy of something.
3) I think there should be regression tests for these prohibited things,
i.e. for a set-returning function, for a non-existent column, etc.
4) What might be somewhat confusing for users is that the filter uses a
single snapshot to evaluate the conditions for all rows. That is, one
might do this
create or replace function f() returns int as $$
select count(*)::int from t;
$$ language sql;
and hope that
copy t from '/...' filter (f() <= 100);
only ever imports the first 100 rows - but that's not true, of course,
because f() uses the snapshot acquired at the very beginning. For
example INSERT SELECT does behave differently:
test=# copy t from '/home/user/t.data' filter (f() < 100);
COPY 81
test=# insert into t select * from t where f() < 100;
INSERT 0 19
Obviously, this is not an issue when the filter clause references only
the input row (which I assume will be the primary use case).
Not sure if this is expected / appropriate behavior, and if the patch
needs to do something else here.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services