RE: [PROPOSAL]a new data type 'bytea' for ECPG - Mailing list pgsql-hackers

From Matsumura, Ryo
Subject RE: [PROPOSAL]a new data type 'bytea' for ECPG
Date
Msg-id 03040DFF97E6E54E88D3BFEE5F5480F737A1CF3E@G01JPEXMBYT04
In response to [PROPOSAL]a new data type 'bytea' for ECPG  ("Matsumura, Ryo" <matsumura.ryo@jp.fujitsu.com>)
List pgsql-hackers
Hackers

No one has commented on the proposal, but I'm not discouraged.
I have attached a patch. Please review it or comment on the proposal.

Note:
- The patch cannot yet decode escape-format data from the backend.
- [ecpg/test/expected/sql-bytea.stderr] in the patch includes non-ASCII data.
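
Some background on the first note: in text mode the backend returns bytea either in hex format ("\x" followed by two hex digits per byte, the default bytea_output since PostgreSQL 9.0) or in the older escape format. The following is a minimal sketch of decoding the hex format only. It is not code from the patch, and the helper names hex_value() and decode_bytea_hex() are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Convert one hex digit to its value, or return -1 if it is not a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Hypothetical helper (not taken from the patch): decode a hex-format bytea
 * text value such as "\\x0001ff" (C literal for \x0001ff) into dst.
 * Returns the number of decoded bytes, or -1 if the input is not in hex
 * format, is malformed, or does not fit in dst_size bytes.
 */
static long decode_bytea_hex(const char *src, unsigned char *dst, size_t dst_size)
{
    size_t srclen = strlen(src);
    size_t i, n = 0;

    if (srclen < 2 || src[0] != '\\' || src[1] != 'x')
        return -1;              /* escape format or something else */
    if ((srclen - 2) % 2 != 0)
        return -1;              /* odd number of hex digits */

    for (i = 2; i < srclen; i += 2)
    {
        int hi = hex_value(src[i]);
        int lo = hex_value(src[i + 1]);

        if (hi < 0 || lo < 0 || n >= dst_size)
            return -1;
        dst[n++] = (unsigned char) ((hi << 4) | lo);
    }
    return (long) n;
}
```

Escape-format input (no leading "\x") is rejected with -1 here, which corresponds to the case the patch does not handle yet.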


Let me briefly explain the patch.

Preproc:
  Almost the same as varchar.

Ecpglib:
- ecpg_build_params()
  Builds two more arrays, paramlengths and paramformats, for PQexecParams().
  If the input variable type is bytea, sets the paramformats entry to 1 (= binary)
  and sets the paramlengths entry to the binary data length.

- ecpg_store_input()
  If the input variable type is bytea, copies its binary data directly into an ecpg_alloc-ed area.

- ecpg_get_data()
  If the output variable type is bytea, decodes the received result into the user area.
  The encode/decode functions are imported from backend/utils/adt/encode.c.

- ECPGset_desc()
  Currently ecpglib saves data in an internal area (struct descriptor_item) for the execution phase,
  but does not save the type information that is needed in the case of bytea.
  So I added a member, is_binary, to the descriptor_item structure.
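
As a rough illustration of the ecpg_build_params() and ecpg_store_input() items above, here is a minimal sketch (not code from the patch). The names bytea_var, host_type, copy_binary() and set_param_format() are hypothetical stand-ins, and malloc() is used in place of ecpg_alloc(). It shows why the copy uses memcpy() rather than strncpy(), and which values end up in the paramLengths/paramFormats entries passed to PQexecParams().

```c
#include <stdlib.h>
#include <string.h>

#define DATA_SIZE 16

/* Same shape as the struct the preprocessor is described as generating. */
struct bytea_var
{
    int  len;
    char arr[DATA_SIZE];
};

/* Hypothetical stand-in for ecpglib's internal type tags. */
enum host_type { HOST_VARCHAR, HOST_BYTEA };

/*
 * Copy a bytea host variable into freshly allocated memory.  memcpy()
 * keeps every byte, including '\0', whereas strncpy() stops copying at
 * the first '\0'.  Returns NULL on allocation failure.
 */
static char *copy_binary(const struct bytea_var *var)
{
    char *newcopy = malloc(var->len > 0 ? (size_t) var->len : 1);

    if (newcopy != NULL && var->len > 0)
        memcpy(newcopy, var->arr, (size_t) var->len);
    return newcopy;
}

/*
 * Fill one entry each of the arrays that PQexecParams() takes as
 * paramLengths and paramFormats: format 1 (binary) with an explicit
 * length for bytea, format 0 (text) otherwise.  libpq ignores the
 * length of text-format parameters, so 0 is stored there.
 */
static void set_param_format(enum host_type type, int len,
                             int *paramlength, int *paramformat)
{
    if (type == HOST_BYTEA)
    {
        *paramformat = 1;
        *paramlength = len;
    }
    else
    {
        *paramformat = 0;
        *paramlength = 0;
    }
}
```

With a 5-byte value such as 'a', 'b', '\0', 'c', 'd', copy_binary() preserves all five bytes, while a strncpy() of the same data would keep only "ab" and pad the rest with '\0'.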

Thank you.

Regards
Ryo Matsumura

> -----Original Message-----
> From: Matsumura, Ryo [mailto:matsumura.ryo@jp.fujitsu.com]
> Sent: Monday, October 1, 2018 5:04 PM
> To: pgsql-hackers@lists.postgresql.org
> Subject: [PROPOSAL]a new data type 'bytea' for ECPG
> 
> Hi, Hackers
> 
> # This is my first post.
> 
> I will try to implement a new data type 'bytea' for ECPG.
> I think that the implementation is not complicated.
> Does anyone need it ?
> 
> 
> * Why do I need bytea ?
> 
> Currently, an ECPG program can handle binary data for a bytea column with the
> C 'char' type, but it must convert from/to the escaped format with
> PQunescapeBytea()/PQescapeBytea(). This forces users to add unnecessary code
> and to pay the cost of the conversion at runtime.
> # My PoC will not be able to remove the output conversion cost.
> 
> I think that setting data in a host variable should be simpler.
> The following is an example of an Oracle Pro*C program for a RAW type column.
> 
>   VARCHAR   raw_data[20];
> 
>   /* preprocessed to the following
>    * struct
>    * {
>    *    unsigned short  len;
>    *    unsigned char   arr[20];
>    * } raw_data;
>    */
> 
>   raw_data.len = 10;
>   memcpy(raw_data.arr, data, 10);
> 
>   see also:
>
> https://docs.oracle.com/cd/E11882_01/appdev.112/e10825/pc_04dat.htm#i23305
> 
> In ECPG, a varchar host variable cannot be used for bytea because it cannot treat
> '\0' as part of the data. If the length is set to 10 and there is a '\0' at the
> 3rd byte, ecpglib truncates the 3rd byte and everything after it at the following point:
> 
>   [src/interfaces/ecpg/ecpglib/execute.c]
>   ecpg_store_input(const int lineno, const bool force_indicator, const struct
>   :
>       switch (var->type)
>   :
>         case ECPGt_varchar:
>           if (!(newcopy = (char *) ecpg_alloc(variable->len + 1, lineno)))
>             return false;
>   !!      strncpy(newcopy, variable->arr, variable->len);
>           newcopy[variable->len] = '\0';
> 
> I also think that the behavior of the varchar host variable should not be changed,
> for compatibility reasons.
> Therefore, a new host variable type 'bytea' is needed.
> 
> Since ecpglib can then distinguish between C strings and binary data, it can send
> binary data to the backend directly by using the 'paramFormats' argument of
> PQexecParams().
> Unfortunately, the conversion of output data cannot be omitted in ecpglib
> because libpq doesn't provide a per-column counterpart of 'paramFormats' for results.
>  ('resultFormat' applies to *all* data from the backend: it is either all binary
> or all text.)
> 
>   PQexecParams(PGconn *conn,
>              const char *command,
>              int nParams,
>              const Oid *paramTypes,
>              const char *const *paramValues,
>              const int *paramLengths,
>   !!         const int *paramFormats,
>              int resultFormat)
> 
> 
> 
> * How to use new 'bytea' ?
> 
> ECPG programmers can use it in almost the same way as 'varchar', but cannot use it for names
> (e.g. connection name, prepared statement name, cursor name and so on).
> 
>  - Can use in Declare Section.
> 
>   exec sql begin declare section;
>     bytea data1[512];
>     bytea data2[DATA_SIZE];   /* can use macro */
>     bytea send_data[DATA_NUM][DATA_SIZE];  /* can use two dimensional array */
>     bytea recv_data[][DATA_SIZE]; /* can use flexible array */
>   exec sql end declare section;
> 
>  - Can *not* use for name.
> 
>   exec sql begin declare section;
>     bytea conn_name[DATA_SIZE];
>   exec sql end declare section;
> 
>   exec sql connect to :conn_name;   !! error
> 
>  - Conversion is not needed in user program.
> 
>   exec sql begin declare section;
>       bytea send_buf[DATA_SIZE];
>       bytea recv_buf[DATA_SIZE - 13];
>       int ind_recv;
>   exec sql end declare section;
> 
>   exec sql create table test (data1 bytea);
>   exec sql truncate test;
>   exec sql insert into test (data1) values (:send_buf);
>   exec sql select data1 into :recv_buf:ind_recv from test;
>   /* ind_recv is set to 13. */
> 
> 
> 
> * How to preprocess 'bytea' ?
> 
>   'bytea' is preprocessed in almost the same way as varchar.
>   The following declarations are preprocessed into the structs shown after them.
> 
>     exec sql begin declare section;
>       bytea data[DATA_SIZE];
>       bytea send_data[DATA_NUM][DATA_SIZE];
>       bytea recv_data[][DATA_SIZE];
>     exec sql end declare section;
> 
>     struct bytea_1 {int len; char arr[DATA_SIZE];} data;
>     struct bytea_2 {int len; char arr[DATA_SIZE];} send_data[DATA_NUM];
>     struct bytea_3 {int len; char arr[DATA_SIZE];} *recv_data;
> 
> 
> Thank you for your consideration.
> 
> 
> Regards
> Ryo Matsumura
> 
> 

