Thread: BUG #16200: returned data from ESQL/C FETCH is trampling outside assigned memory for CHAR column
BUG #16200: returned data from ESQL/C FETCH is trampling outside assigned memory for CHAR column
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 16200 Logged by: Matthias Apitz Email address: guru@unixarea.de PostgreSQL version: 11.4 Operating system: SuSE Linux SLES 12 SP4 Description: We encounter the following problem with ESQL/C: Imagine a table with two columns: CHAR(16) and DATE In the database the CHAR column can contain not only 16 bytes, but 16 Unicode chars, which are longer than 16 bytes if one or more of the chars is an UTF-8 multibyte encoded char. If one provides in C a host structure to FETCH the data as: EXEC SQL BEGIN DECLARE SECTION; struct r_d02ben_ec { char string[17]; char date[11]; }; typedef struct r_d02ben_ec t_d02ben_ec; t_d02ben_ec *hp_d02ben, hrec_d02ben; EXEC SQL END DECLARE SECTION; and fetches the data with ESQL/C as: EXEC SQL FETCH hc_d02ben INTO :hrec_d02ben; The generated C-code looks like this: ... ECPGdo(__LINE__, 0, 1, NULL, 0, ECPGst_normal, "fetch hc_d02ben", ECPGt_EOIT, ECPGt_char,&(hrec_d02ben.string),(long)17,(long)1,sizeof( struct r_d02ben_ec ), ECPGt_NO_INDICATOR, NULL , 0L, 0L, 0L, ECPGt_char,&(hrec_d02ben.date),(long)11,(long)1,sizeof( struct r_d02ben_ec ), ECPGt_NO_INDICATOR, NULL , 0L, 0L, 0L, ... As you can see for the first element CHAR the length 17 is given as argument to ECPGdo() together with the pointer to where the data should be stored and for the second element DATE the length 11 is ggiven as argument (which is big enough to receive the ASCII string "MM.DD.YYYY" and a trailing \0). What we now see using GDB is that for the first element all UTF-8 data is returned, lets asume only one multibyte char, which gives 17 bytes, not only 16, and the trailing NULL is already placed into the memory for the DATE. Now the function ECPGdo() places the DATE as "MM.DD.YYYY" into the area pointed to for the 2nd argument and with this overwrites the NULL terminator of the string[17] element. Result is later a SIGSEGV because the expected string in string[17] is not NULL terminated anymore :-) I would call it a bug, that ECPGdo() puts more than 17 bytes (16 bytes + NULL) as return into the place pointed to by the host var pointer when the column in the database has more (UTF-8) chars as will fit into 16+1 byte.
Re: BUG #16200: returned data from ESQL/C FETCH is tramplingoutside assigned memory for CHAR column
From
Michael Meskes
Date:
Hi, > We encounter the following problem with ESQL/C: Imagine a table with I assume you mean ECPG, right? ESQL/C would be the Informix compiler. > In the database the CHAR column can contain not only 16 bytes, but 16 > Unicode chars, > which are longer than 16 bytes if one or more of the chars is an UTF- > 8 > multibyte > encoded char. > ... > the DATE. Now the function ECPGdo() places the DATE as "MM.DD.YYYY" > into the area pointed to for the 2nd argument and with this > overwrites > the NULL terminator of the string[17] element. Result is later a > SIGSEGV because the expected string in string[17] is not NULL > terminated anymore :-) > > I would call it a bug, that ECPGdo() puts more than 17 bytes (16 > bytes + > NULL) as return into the place pointed to by the host var pointer > when > the column in the database has more (UTF-8) chars as will fit into > 16+1 byte. Actually I am not sure if this is a bug. I do not remember the standard asking for a null termination at the end of a partial string copy. Please correct me if I am wrong. What it does ask for is setting the indicator accordingly. However, you do not mention any indicator, so I wonder if you checked that one at all. If the string is truncated and the appropriate error action is not taken, that would definitely qualify as a bug. Could you please verify if the indicator is set accordingly? If you have a small test case to reproduce the issue, please send that one. Otherwise I can create my own but probably won't have time before next week. Thanks Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Meskes at (Debian|Postgresql) dot Org Jabber: michael at xmpp dot meskes dot org VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL
Re: BUG #16200: returned data from ESQL/C FETCH is trampling outsideassigned memory for CHAR column
From
Matthias Apitz
Date:
El día viernes, enero 10, 2020 a las 01:43:38p. m. +0100, Michael Meskes escribió: > Hi, > > > We encounter the following problem with ESQL/C: Imagine a table with > > I assume you mean ECPG, right? ESQL/C would be the Informix compiler. Hi. I mean with ESQL/C the general term "Embedded SQL in C". Of course I do know that for PostgreSQL the precompiler is named "ecpg". > > In the database the CHAR column can contain not only 16 bytes, but 16 > > Unicode chars, > > which are longer than 16 bytes if one or more of the chars is an UTF- > > 8 > > multibyte > > encoded char. > > ... > > the DATE. Now the function ECPGdo() places the DATE as "MM.DD.YYYY" > > into the area pointed to for the 2nd argument and with this > > overwrites > > the NULL terminator of the string[17] element. Result is later a > > SIGSEGV because the expected string in string[17] is not NULL > > terminated anymore :-) > > > > I would call it a bug, that ECPGdo() puts more than 17 bytes (16 > > bytes + > > NULL) as return into the place pointed to by the host var pointer > > when > > the column in the database has more (UTF-8) chars as will fit into > > 16+1 byte. > > Actually I am not sure if this is a bug. I do not remember the standard > asking for a null termination at the end of a partial string copy. Neither do I know if there is some written standard, but char strings IMHO should be NULL terminated. Our application expects in this example no more than 16 bytes and that's why we provide a host variable as the struct member of 17 bytes so the NULL fits into. I digged into the sources of the ecpglib and this small fix solved our problem: postgresql-11.4/src/interfaces/ecpg> diff ecpglib/data.c ecpglib/data.c.orig 527,531c527 < /* guru@unixarea.de: strncpy() only varcharsize-1 */ < strncpy(str, pval, varcharsize-1); < /* guru@unixarea.de: and terminate the string */ < str[varcharsize - 1] = '\0'; < ecpg_log("DEBUG ESQL/C: result [%s] len %d\n", str, strlen(str)); --- > strncpy(str, pval, varcharsize); If I now store 16 UTF-8 chars German Umlaut 'ä' into the column, the ECPG log is this: [30481] [10.01.2020 12:29:28:936]: ecpg_get_data on line 3181: RESULT: ääääääääääääääää offset: 1600; array: no [30481] [10.01.2020 12:29:28:936]: DEBUG ESQL/C: varcharsize 17 size 32 [30481] [10.01.2020 12:29:28:936]: DEBUG ESQL/C: result [ääääääää] len 16 (The two lines with DEBUG are added by me in execute.c to understand better the problem). ECPG would return as RESULT as it says 'ääääääääääääääää' which is 32 bytes (16 UTF-8 chars), our varcharsize is only 17 and we now truncate the RESULT to 16 bytes (8 UTF-8 'ä') and a trailing \0 with my proposed change and all is fine. Btw: truncating UTF-8 strings in any case brings the risk, that the resulting string is not longer clean UTF-8 if you hit into the middle of a multibyte char. But such issue should be handled by the application and not by ECPG. > Please correct me if I am wrong. What it does ask for is setting the > indicator accordingly. However, you do not mention any indicator, so I > wonder if you checked that one at all. If the string is truncated and > the appropriate error action is not taken, that would definitely > qualify as a bug. > > Could you please verify if the indicator is set accordingly? If I read the docs here https://www.postgresql.org/docs/11/ecpg-variables.html#ECPG-INDICATORS indicator vars are meant to show null values in the table and not the string truncation. Am I wrong? We do not use indicators at all, but catch the error condition -213. In any case, thanks for your attention. matthias -- Matthias Apitz, ✉ guru@unixarea.de, http://www.unixarea.de/ +49-176-38902045 Public GnuPG key: http://www.unixarea.de/key.pub "Glaube wenig, hinterfrage alles, denke selbst: Wie man Manipulationen durchschaut" "Believe little, scrutinise all, think by your own: How see through manipulations" ISBN-10: 386489218X
Attachment
Re: BUG #16200: returned data from ESQL/C FETCH is tramplingoutside assigned memory for CHAR column
From
Michael Meskes
Date:
> I mean with ESQL/C the general term "Embedded SQL in C". Of course I > do > know that for PostgreSQL the precompiler is named "ecpg". No worries, I was just making sure we do not talk about different things as I have never seen anyone use this abbreviation before. > Neither do I know if there is some written standard, but char strings There is. > IMHO > should be NULL terminated. Our application expects in this example no > more than 16 bytes and that's why we provide a host variable as the > struct member of 17 bytes so the NULL fits into. > > I digged into the sources of the ecpglib and this small fix solved > our problem: > ... Sure, this fixes your use case, but as I said, I'm not sure it's the right thing to do for general use. > > Could you please verify if the indicator is set accordingly? > > If I read the docs here > https://www.postgresql.org/docs/11/ecpg-variables.html#ECPG-INDICATORS > indicator vars are meant to show null values in the table and not the > string truncation. Am I wrong? We do not use indicators at all, but > catch the error condition -213. Sorry, you're right of course. What I meant was the sqlca structure which includes: ... char sqlwarn[8]; /* Element 0: set to 'W' if at least one other is 'W' */ /* 1: if 'W' at least one character string */ /* value was truncated when it was */ /* stored into a host variable. */ ... Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Meskes at (Debian|Postgresql) dot Org Jabber: michael at xmpp dot meskes dot org VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL