Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features) - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features) |
Date | |
Msg-id | 20230322175000.qbdctk7bnmifh5an@awork3.anarazel.de Whole thread Raw |
In response to | Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features) (torikoshia <torikoshia@oss.nttdata.com>) |
Responses |
Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
|
List | pgsql-hackers |
Hi, Tom, see below - I wonder if should provide one more piece of infrastructure around the saved error stuff... Have you measured whether this has negative performance effects when *NOT* using the new option? As-is this does not work with FORMAT BINARY - and converting the binary input functions to support soft errors won't happen for 16. So I think you need to raise an error if BINARY and IGNORE_DATATYPE_ERRORS are specified. On 2023-03-22 22:34:20 +0900, torikoshia wrote: > @@ -985,9 +986,28 @@ CopyFrom(CopyFromState cstate) > > ExecClearTuple(myslot); > > + if (cstate->opts.ignore_datatype_errors) > + { > + escontext.details_wanted = true; > + cstate->escontext = escontext; > + } I think it might be worth pulling this out of the loop. That does mean you'd have to reset escontext.error_occurred after an error, but that doesn't seem too bad, you need to do other cleanup anyway. > @@ -956,10 +957,20 @@ NextCopyFrom(CopyFromState cstate, ExprContext *econtext, > values[m] = ExecEvalExpr(defexprs[m], econtext, &nulls[m]); > } > else > - values[m] = InputFunctionCall(&in_functions[m], > - string, > - typioparams[m], > - att->atttypmod); > + /* If IGNORE_DATATYPE_ERRORS is enabled skip rows with datatype errors */ > + if (!InputFunctionCallSafe(&in_functions[m], > + string, > + typioparams[m], > + att->atttypmod, > + (Node *) &cstate->escontext, > + &values[m])) > + { > + cstate->ignored_errors++; > + > + ereport(WARNING, > + errmsg("%s", cstate->escontext.error_data->message)); That isn't right - you loose all the details of the message. As is you'd also leak the error context. I think the best bet for now is to do something like /* adjust elevel so we don't jump out */ cstate->escontext.error_data->elevel = WARNING; /* despite the name, this won't raise an error if elevel < ERROR */ ThrowErrorData(cstate->escontext.error_data); I wonder if we ought to provide a wrapper for this? It could e.g. know to mention the original elevel and such? I don't think NextCopyFrom() is the right place to emit this warning - it e.g. is also called from file_fdw.c, which might want to do something else with the error. From a layering POV it seems cleaner to do this in CopyFrom(). You already have a check for escontext.error_occurred there anyway. > @@ -3378,6 +3378,10 @@ copy_opt_item: > { > $$ = makeDefElem("freeze", (Node *) makeBoolean(true), @1); > } > + | IGNORE_DATATYPE_ERRORS > + { > + $$ = makeDefElem("ignore_datatype_errors", (Node *)makeBoolean(true), @1); > + } > | DELIMITER opt_as Sconst > { > $$ = makeDefElem("delimiter", (Node *) makeString($3), @1); I think we shouldn't add a new keyword for this, but only support this via /* new COPY option syntax */ copy_generic_opt_list: copy_generic_opt_elem Further increasing the size of the grammar with random keywords when we have more generic ways to represent them seems unnecessary. > +-- tests for IGNORE_DATATYPE_ERRORS option > +CREATE TABLE check_ign_err (n int, m int[], k int); > +COPY check_ign_err FROM STDIN WITH IGNORE_DATATYPE_ERRORS; > +1 {1} 1 > +a {2} 2 > +3 {3} 3333333333 > +4 {a, 4} 4 > + > +5 {5} 5 > +\. > +SELECT * FROM check_ign_err; > + I suggest adding a few more tests: - COPY with a datatype error that can't be handled as a soft error - test documenting that COPY FORMAT BINARY is incompatible with IGNORE_DATATYPE_ERRORS - a soft error showing the error context - although that will require some care to avoid the function name + line in the output Greetings, Andres Freund
pgsql-hackers by date: