RE: [PROPOSAL]a new data type 'bytea' for ECPG - Mailing list pgsql-hackers
From | Matsumura, Ryo |
---|---|
Subject | RE: [PROPOSAL]a new data type 'bytea' for ECPG |
Date | |
Msg-id | 03040DFF97E6E54E88D3BFEE5F5480F737A1CF3E@G01JPEXMBYT04 |
In response to | [PROPOSAL]a new data type 'bytea' for ECPG ("Matsumura, Ryo" <matsumura.ryo@jp.fujitsu.com>) |
List | pgsql-hackers |
Hackers

No one has commented on the proposal, but I'm not discouraged.
I attach a patch. Please review it or comment on the proposal.

Note:
- The patch cannot yet decode escape-format data from the backend.
- [ecpg/test/expected/sql-bytea.stderr] in the patch includes non-ASCII data.

Let me explain the patch briefly.

Preproc:
  Almost the same as varchar.

Ecpglib:
- ecpg_build_params()
  Builds two more arrays, paramlengths and paramformats, for PQexecParams().
  If the input variable type is bytea, it sets paramformats to 1 (= binary)
  and sets paramlengths to the binary data length.

- ecpg_store_input()
  If the input variable type is bytea, copies its binary data directly to an
  ecpg_alloc-ed area.

- ecpg_get_data()
  If the output variable type is bytea, decodes the received result into the
  user area. The encode/decode functions are imported from
  backend/utils/adt/encode.c.

- ECPGset_desc()
  Currently ecpglib saves data to an internal area (struct descriptor_item)
  for the execution phase, but doesn't save the type information that is
  needed in the case of bytea. So I added a member is_binary to the
  descriptor_item structure.

Thank you.

Regards
Ryo Matsumura

> -----Original Message-----
> From: Matsumura, Ryo [mailto:matsumura.ryo@jp.fujitsu.com]
> Sent: Monday, October 1, 2018 5:04 PM
> To: pgsql-hackers@lists.postgresql.org
> Subject: [PROPOSAL]a new data type 'bytea' for ECPG
>
> Hi, Hackers
>
> # This is my first post.
>
> I will try to implement a new data type 'bytea' for ECPG.
> I think that the implementation is not complicated.
> Does anyone need it?
>
>
> * Why do I need bytea?
>
> Currently, an ECPG program can handle binary data for a bytea column with the
> 'char' type of the C language, but it must convert from/to the escaped format
> with PQunescapeBytea()/PQescapeBytea(). This forces users to add unnecessary
> code and to pay the cost of the conversion at runtime.
> # My PoC will not be able to solve the output conversion cost.
>
> I think that setting/putting data for a host variable should be simpler.
> The following is an example of an Oracle Pro*C program for a RAW type column.
>
>   VARCHAR raw_data[20];
>
>   /* preprocessed to the following
>    * struct
>    * {
>    *   unsigned short len;
>    *   unsigned char arr[20];
>    * } raw_data;
>    */
>
>   raw_data.len = 10;
>   memcpy(raw_data.arr, data, 10);
>
>   see also:
>   https://docs.oracle.com/cd/E11882_01/appdev.112/e10825/pc_04dat.htm#i23305
>
> In ECPG, a varchar host variable cannot be used for bytea because it cannot
> treat '\0' as part of the data. If the length is set to 10 and there is a '\0'
> at the 3rd byte, ecpglib truncates the 3rd byte and everything after it at the
> following:
>
>   [src/interfaces/ecpg/ecpglib/execute.c]
>   ecpg_store_input(const int lineno, const bool force_indicator, const struct
>   :
>     switch (var->type)
>     :
>       case ECPGt_varchar:
>         if (!(newcopy = (char *) ecpg_alloc(variable->len + 1, lineno)))
>           return false;
> !!      strncpy(newcopy, variable->arr, variable->len);
>         newcopy[variable->len] = '\0';
>
> I also think that the behavior of the varchar host variable should not be
> changed, for compatibility.
> Therefore, a new type of host variable 'bytea' is needed.
>
> Since ecpglib can distinguish between a C string and binary data, it can send
> binary data to the backend directly by using the 'paramFormats' argument of
> PQexecParams().
> Unfortunately, the conversion of output data cannot be omitted in ecpglib
> because libpq doesn't provide anything like 'paramFormats' for results.
> ('resultFormat' means that *all* data from the backend is formatted as binary
> or not.)
>
>   PQexecParams(PGconn *conn,
>                const char *command,
>                int nParams,
>                const Oid *paramTypes,
>                const char *const *paramValues,
>                const int *paramLengths,
> !!             const int *paramFormats,
>                int resultFormat)
>
>
>
> * How to use new 'bytea'?
>
> ECPG programmers can use it almost the same as 'varchar', but cannot use it
> for names (e.g. connection name, prepared statement name, cursor name and so on).
>
> - Can use in Declare Section.
>
>   exec sql begin declare section;
>     bytea data1[512];
>     bytea data2[DATA_SIZE];                /* can use macro */
>     bytea send_data[DATA_NUM][DATA_SIZE];  /* can use two dimensional array */
>     bytea recv_data[][DATA_SIZE];          /* can use flexible array */
>   exec sql end declare section;
>
> - Can *not* use for name.
>
>   exec sql begin declare section;
>     bytea conn_name[DATA_SIZE];
>   exec sql end declare section;
>
>   exec sql connect to :conn_name;   !! error
>
> - Conversion is not needed in user program.
>
>   exec sql begin declare section;
>     bytea send_buf[DATA_SIZE];
>     bytea recv_buf[DATA_SIZE - 13];
>     int ind_recv;
>   exec sql end declare section;
>
>   exec sql create table test (data1 bytea);
>   exec sql truncate test;
>   exec sql insert into test (data1) values (:send_buf);
>   exec sql select data1 into :recv_buf:ind_recv from test;
>   /* ind_recv is set to 13. */
>
>
>
> * How to preprocess 'bytea'?
>
> 'bytea' is preprocessed almost the same as varchar.
> The declarations below are preprocessed into the structs that follow them.
>
>   exec sql begin declare section;
>     bytea data[DATA_SIZE];
>     bytea send_data[DATA_NUM][DATA_SIZE];
>     bytea recv_data[][DATA_SIZE];
>   exec sql end declare section;
>
>   struct bytea_1 {int len; char arr[DATA_SIZE];} data;
>   struct bytea_2 {int len; char arr[DATA_SIZE];} send_data[DATA_NUM];
>   struct bytea_3 {int len; char arr[DATA_SIZE];} *recv_data;
>
>
> Thank you for your consideration.
>
>
> Regards
> Ryo Matsumura
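
The truncation that the quoted proposal attributes to the ECPGt_varchar branch can be reproduced outside ecpglib. The following is only a minimal sketch and is not taken from the patch: struct host_var, ARR_SIZE, copy_as_string(), copy_as_binary() and copy_is_intact() are hypothetical names chosen for illustration, and the struct merely imitates the len/arr layout shown above. It contrasts a strncpy()-style copy, which stops at the first '\0', with a memcpy()-style copy, which keeps every byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ARR_SIZE 20             /* hypothetical buffer size */

/* Hypothetical stand-in for the len/arr struct that the preprocessor
 * emits for a varchar or bytea host variable. */
struct host_var
{
    int  len;
    char arr[ARR_SIZE];
};

/* String-style copy, imitating the quoted ECPGt_varchar branch:
 * strncpy() stops reading at the first '\0' and zero-fills the rest.
 * dst must have room for v->len + 1 bytes. */
static void
copy_as_string(char *dst, const struct host_var *v)
{
    assert(v->len >= 0 && v->len <= ARR_SIZE);
    strncpy(dst, v->arr, (size_t) v->len);
    dst[v->len] = '\0';
}

/* Binary-style copy: all v->len bytes are copied, embedded '\0' included.
 * dst must have room for v->len bytes. */
static void
copy_as_binary(char *dst, const struct host_var *v)
{
    assert(v->len >= 0 && v->len <= ARR_SIZE);
    memcpy(dst, v->arr, (size_t) v->len);
}

/* Returns 1 if dst holds exactly the first v->len bytes of v->arr. */
static int
copy_is_intact(const char *dst, const struct host_var *v)
{
    return memcmp(dst, v->arr, (size_t) v->len) == 0;
}
```

With len set to 10 and a '\0' at the 3rd byte, copy_as_string() leaves zeros from the 3rd byte onward, while copy_as_binary() preserves all 10 bytes. This is why a length-aware copy is needed for a binary host variable.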