Re: Make COPY format extendable: Extract COPY TO format implementations - Mailing list pgsql-hackers
From | Sutou Kouhei |
---|---|
Subject | Re: Make COPY format extendable: Extract COPY TO format implementations |
Date | |
Msg-id | 20240205.180515.746623205615816813.kou@clear-code.com Whole thread Raw |
In response to | Re: Make COPY format extendable: Extract COPY TO format implementations (Michael Paquier <michael@paquier.xyz>) |
Responses |
Re: Make COPY format extendable: Extract COPY TO format implementations
|
List | pgsql-hackers |
Hi, In <ZcCKwAeFrlOqPBuN@paquier.xyz> "Re: Make COPY format extendable: Extract COPY TO format implementations" on Mon, 5 Feb 2024 16:14:08 +0900, Michael Paquier <michael@paquier.xyz> wrote: > So, I've looked at all that today, and finished by applying two > patches as of 2889fd23be56 and 95fb5b49024a to get some of the > weirdness with the workhorse routines out of the way. Thanks! > As this is called within the OneRow routine, I can live with that. If > there is an opposition to that, we could just attach it within the > routines. I don't object the approach. > I am attaching a v12 which is close to what I want it to be, with > much more documentation and comments. There are two things that I've > changed compared to the previous versions though: > 1) I have added a callback to set up the input and output functions > rather than attach that in the Start callback. I'm OK with this. I just don't use them in Apache Arrow COPY FORMAT extension. > - No need for plugins to think about how to allocate this data. v11 > and other versions were doing things the wrong way by allocating this > stuff in the wrong memory context as we switch to the COPY context > when we are in the Start routines. Oh, sorry. I missed it when I moved them. > 2) I have backpedaled on the postpare callback, which did not bring > much in clarity IMO while being a CSV-only callback. Note that we > have in copyfromparse.c more paths that are only for CSV but the past > versions of the patch never cared about that. This makes the text and > CSV implementations much closer to each other, as a result. Ah, sorry. I forgot to eliminate cstate->opts.csv_mode in CopyReadLineText(). The postpare callback is for optimization. If it doesn't improve performance, we don't need to introduce it. We may want to try eliminating cstate->opts.csv_mode in CopyReadLineText() for performance. But we don't need to do this in introducing CopyFromRoutine. We can defer it. So I don't object removing the postpare callback. > Let me know if you have > comments about all that. Here are some comments for the patch: + /* + * Called when COPY FROM is started to set up the input functions + * associated to the relation's attributes writing to. `fmgr_info` can be fmgr_info -> finfo + * optionally filled to provide the catalog information of the input + * function. `typioparam` can be optinally filled to define the OID of optinally -> optionally + * the type to pass to the input function. `atttypid` is the OID of data + * type used by the relation's attribute. + */ + void (*CopyFromInFunc) (Oid atttypid, FmgrInfo *finfo, + Oid *typioparam); How about passing CopyFromState cstate too like other callbacks for consistency? + /* + * Copy one row to a set of `values` and `nulls` of size tupDesc->natts. + * + * 'econtext' is used to evaluate default expression for each column that + * is either not read from the file or is using the DEFAULT option of COPY or is -> or (I'm not sure...) + * FROM. It is NULL if no default values are used. + * + * Returns false if there are no more tuples to copy. + */ + bool (*CopyFromOneRow) (CopyFromState cstate, ExprContext *econtext, + Datum *values, bool *nulls); +typedef struct CopyToRoutine +{ + /* + * Called when COPY TO is started to set up the output functions + * associated to the relation's attributes reading from. `fmgr_info` can fmgr_info -> finfo + * be optionally filled. `atttypid` is the OID of data type used by the + * relation's attribute. + */ + void (*CopyToOutFunc) (Oid atttypid, FmgrInfo *finfo); How about passing CopyToState cstate too like other callbacks for consistency? @@ -200,4 +204,10 @@ extern void ReceiveCopyBinaryHeader(CopyFromState cstate); extern int CopyReadAttributesCSV(CopyFromState cstate); extern int CopyReadAttributesText(CopyFromState cstate); +/* Callbacks for CopyFromRoutine->OneRow */ CopyFromRoutine->OneRow -> CopyFromRoutine->CopyFromOneRow +extern bool CopyFromTextOneRow(CopyFromState cstate, ExprContext *econtext, + Datum *values, bool *nulls); +extern bool CopyFromBinaryOneRow(CopyFromState cstate, ExprContext *econtext, + Datum *values, bool *nulls); + #endif /* COPYFROM_INTERNAL_H */ +/* + * CopyFromTextStart CopyFromTextStart -> CopyFromBinaryStart + * + * Start of COPY FROM for binary format. + */ +static void +CopyFromBinaryStart(CopyFromState cstate, TupleDesc tupDesc) +{ + /* Read and verify binary header */ + ReceiveCopyBinaryHeader(cstate); +} + +/* + * CopyFromTextEnd CopyFromTextEnd -> CopyFromBinaryEnd + * + * End of COPY FROM for binary format. + */ +static void +CopyFromBinaryEnd(CopyFromState cstate) +{ + /* nothing to do */ +} diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list index 91433d439b..d02a7773e3 100644 --- a/src/tools/pgindent/typedefs.list +++ b/src/tools/pgindent/typedefs.list @@ -473,6 +473,7 @@ ConvertRowtypeExpr CookedConstraint CopyDest CopyFormatOptions +CopyFromRoutine CopyFromState CopyFromStateData CopyHeaderChoice @@ -482,6 +483,7 @@ CopyMultiInsertInfo CopyOnErrorChoice CopySource CopyStmt +CopyToRoutine CopyToState CopyToStateData Cost Wow! I didn't know that we need to update typedefs.list when I add a "typedef struct". Thanks, -- kou
pgsql-hackers by date: