Thread: BUG #17421: Core dump in ECPGdo() when calling PostgreSQL API from 32-bit client for RHEL8
BUG #17421: Core dump in ECPGdo() when calling PostgreSQL API from 32-bit client for RHEL8
From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      17421
Logged by:          Masayuki Hirose
Email address:      hirose.masay-01@jp.fujitsu.com
PostgreSQL version: 12.1
Operating system:   Red Hat Enterprise Linux
Description:        Core dump in ECPGdo() when calling PostgreSQL API from 32-bit client for RHEL8

Hello,

I have encountered the core dump in ECPGdo () issue. Have you ever seen this error?

--------
(gdb) where
#0 0xf7163dc7 in strlen () from /usr/lib/libc.so.6
#1 0x08169faf in dopr ()
#2 0x08169c19 in pg_vsnprintf ()
#3 0x08169c6b in pg_snprintf ()
#4 0x08153ec7 in ecpg_raise_backend ()
#5 0x081540bb in ecpg_check_Pqresult ()
#6 0x0814da6a in ecpg_autostart_transaction ()
#7 0x0814ebc6 in ecpg_do ()
#8 0x0814ec79 in ECPGdo ()
#9 0xf7f96e40 in TJVvDatabaseAPI::_ExecSQL (this=0x9fb73f0, ctxnum=0,
--------

I am using the 32-bit client for RHEL8 and it calls PostgreSQL API, thus I encountered this error.

Regards,
Masa
Re: BUG #17421: Core dump in ECPGdo() when calling PostgreSQL API from 32-bit client for RHEL8
From
Michael Paquier
Date:
On Sun, Feb 27, 2022 at 05:21:32AM +0000, PG Bug reporting form wrote:
> Have you ever seen this error?

No such issue has been reported AFAIK. If you really are on 12.1, you may want to update to the latest version of 12.X and retry if the error still shows up.

> --------
> (gdb) where
> #0 0xf7163dc7 in strlen () from /usr/lib/libc.so.6
> #1 0x08169faf in dopr ()
> #2 0x08169c19 in pg_vsnprintf ()
> #3 0x08169c6b in pg_snprintf ()
> #4 0x08153ec7 in ecpg_raise_backend ()
> #5 0x081540bb in ecpg_check_Pqresult ()
> #6 0x0814da6a in ecpg_autostart_transaction ()
> #7 0x0814ebc6 in ecpg_do ()
> #8 0x0814ec79 in ECPGdo ()
> #9 0xf7f96e40 in TJVvDatabaseAPI::_ExecSQL (this=0x9fb73f0, ctxnum=0,
> --------
>
> I am using the 32-bit client for RHEL8 and it calls PostgreSQL API, thus I
> encountered this error.

Hm. Could you isolate that in a self-contained test case? Based on this trace, it looks like "message" is NULL, which may be possible because pqInternalNotice() missed something? I would not bet on errorMessage being NULL, but there may be holes..
--
Michael
Re: BUG #17421: Core dump in ECPGdo() when calling PostgreSQL API from 32-bit client for RHEL8
From
Tom Lane
Date:
Michael Paquier <michael@paquier.xyz> writes:
> Hm. Could you isolate that in a self-contained test case? Based on
> this trace, it looks like "message" is NULL, which may be possible
> because pqInternalNotice() missed something? I would not bet on
> errorMessage being NULL, but there may be holes..

Yeah. It seems likely that this is a longstanding ecpglib bug that was previously masked by platform snprintfs not crashing on printf("%s", NULL). If so, it's masked again in 12.8 and later (cf 3779ac62d), but it's still a bug in that ecpg won't print anything useful when this edge condition --- whatever it is --- happens.

So, could we see a test case?

regards, tom lane
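The masking described above comes down to what a given snprintf implementation does when "%s" receives a NULL pointer: the C standard leaves that undefined, so some platform libraries print a placeholder while others crash inside strlen(), as in the reported backtrace. A minimal sketch of a caller-side guard follows. The helper name format_error and the placeholder text are assumptions made for illustration; neither is part of ecpglib.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not ecpglib code): build "<message> on line <n>"
 * without ever passing a NULL pointer to a "%s" conversion.
 */
int
format_error(char *buf, size_t buflen, const char *message, int line)
{
    /* Substitute a placeholder first; the text itself is an assumption. */
    if (message == NULL)
        message = "(no message available)";

    return snprintf(buf, buflen, "%s on line %d", message, line);
}
```

With a guard of this kind the output no longer depends on how the underlying snprintf treats NULL, although, as noted above, a placeholder still tells the user nothing useful about the failure.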
RE: BUG #17421: Core dump in ECPGdo() when calling PostgreSQL API from 32-bit client for RHEL8
From
"hirose.masay-01@fujitsu.com"
Date:
Hi Tom and Michael,

>Michael Paquier <michael@paquier.xyz> writes:
>> Hm. Could you isolate that in a self-contained test case? Based on
>> this trace, it looks like "message" is NULL, which may be possible
>> because pqInternalNotice() missed something? I would not bet on
>> errorMessage being NULL, but there may be holes..
>
>Yeah. It seems likely that this is a longstanding ecpglib bug that was previously masked by platform snprintfs not crashing on printf("%s", NULL). If so, it's masked again in 12.8 and later (cf 3779ac62d), but it's still a bug in that ecpg won't print anything useful when this edge condition --- whatever it is --- happens.
>
>So, could we see a test case?
>
> regards, tom lane

My test case to reproduce the issue is:
1. The client connects Postgres Database and issues SQL continuously.
2. Switch the Database role from Active to Standby.
The Database is mirrored by the Mirroring Controller between two clustered servers. the Mirroring Controller may be the original feature added by the enterprise.

Please let me know if you have any comments or advice.

Regards,
Re: BUG #17421: Core dump in ECPGdo() when calling PostgreSQL API from 32-bit client for RHEL8
From
Michael Paquier
Date:
On Sat, Mar 05, 2022 at 06:45:21AM +0000, hirose.masay-01@fujitsu.com wrote:
> My test case to reproduce the issue is:
> 1. The client connects Postgres Database and issues SQL continuously.
> 2. Switch the Database role from Active to Standby.
> The Database is mirrored by the Mirroring Controller between two
> clustered servers. the Mirroring Controller may be the original
> feature added by the enterprise.

A self-contained test case enters in the category of an ECPG script that we could use to reproduce the problem. Personally, I have no idea what kind of application stack you are using, and I don't know TJVvDatabaseAPI, which I suspect is a proprietary solution for something related to databases. The information you are providing here is not enough for one to know how to reproduce this problem.
--
Michael
Re: BUG #17421: Core dump in ECPGdo() when calling PostgreSQL API from 32-bit client for RHEL8
From
Tom Lane
Date:
Michael Paquier <michael@paquier.xyz> writes:
> On Sat, Mar 05, 2022 at 06:45:21AM +0000, hirose.masay-01@fujitsu.com wrote:
>> My test case to reproduce the issue is:
>> 1. The client connects Postgres Database and issues SQL continuously.
>> 2. Switch the Database role from Active to Standby.
>> The Database is mirrored by the Mirroring Controller between two
>> clustered servers. the Mirroring Controller may be the original
>> feature added by the enterprise.

> A self-contained test case enters in the category of an ECPG script
> that we could use to reproduce the problem. Personally, I have no
> idea what kind of application stack you are using, and I don't know
> TJVvDatabaseAPI, which I suspect is a proprietary solution for
> something related to databases. The information you are providing
> here is not enough for one to know how to reproduce this problem.

"Switch from active to standby" isn't even possible in community Postgres, so there are definitely moving parts in this recipe that we are not responsible for or familiar with. Perhaps the problem can be reproduced with just stock Postgres, but nobody here is going to expend the effort to try to build a reproducer from this amount of information.

We have a wiki page offering advice about creating actionable problem reports:

https://wiki.postgresql.org/wiki/Guide_to_reporting_problems

regards, tom lane
RE: BUG #17421: Core dump in ECPGdo() when calling PostgreSQL API from 32-bit client for RHEL8
From
"hirose.masay-01@fujitsu.com"
Date:
Hi,

Sorry for the late response.
I report my test result using the client 12.10 (formerly 12.1), and insight from the code trace.

I tested the same scenario to reproduce the issue using the PostgreSQL client 12.10. The issue was reproduced the same as with the previous version 12.1. The frequency is as follows:

Client 12.1 : DB server down 30 times -> Core dump 7 times
Client 12.10: DB server down 30 times -> Core dump 6 times

I cannot tell the detailed condition to reproduce the issue; however, it looks like the issue occurs when the client issues SQL commit and simultaneously the Database server goes down.

Next, I tried to find out where the core is generated in the ecpg_raise_backend code with debug logging inserted.

In case the core is NOT generated - Case (a), ecpg_raise_backend() processed error information (sqlca) normally. In this case, my application retried the transaction.
In case the core is generated - Case (b), ecpg_raise_backend() did not process error information (sqlca) correctly but generated the core file.

The difference between Case (a) and Case (b) is as follows:
In case (a), the client detects the communication error with the Database server 20 seconds after it issued SQL commit. When ecpg_raise_backend is called, "status" in the PGconn context was set to "CONNECTION_BAD(1)".
In case (b), where the core was dumped, the client detects the communication error with the Database server 2 minutes after it issued SQL commit. When ecpg_raise_backend is called, "status" in the PGconn context was set to "CONNECTION_OK(0)".

Please check the following log messages and the code trace in ecpg_raise_backend().
When "status" in the PGconn context is "CONNECTION_OK(0)" - Case (b), the core dump could occur in snprintf of ecpg_raise_backend(). In this scenario, "message" is NULL and the value stays unchanged until snprintf is called.

Please note that Japanese log messages are replaced with "xxx".

Case(a) normal
--------------
Client log:
...
20:20:18 [...]: ecpg_execute on line 941: using PQexec
20:20:18 [...]: ecpg_process_output on line 941: OK: INSERT 0 1
20:20:18 [...]: ECPGtrans on line 716: action "commit work"; connection "0"
20:20:39 [...]: ECPGnoticeReceiver: xxx (18563)
20:20:39 [...]: raising sqlcode 0
20:20:39 [...]: ecpg_check_PQresult on line 941: bad response - server closed the connection unexpectedly
20:20:39 This probably means the server terminated abnormally
20:20:39 before or while processing the request.
20:20:39 [...]: raise_backend start: conn - server closed the connection unexpectedly
20:20:39 This probably means the server terminated abnormally
20:20:39 before or while processing the request.
20:20:39 [...]: raise_backend start: conn - 1
20:20:39 [...]: raise_backend: result not NULL
20:20:39 [...]: raise_backend: sqlstate NULL
20:20:39 [...]: raise_backend: message NULL
20:20:39 [...]: raise_backend: sqlstate INTERNAL_ERROR
20:20:39 [...]: ecpg_raise_backend sqlstate: 57P02 messase: the connection to the server was lost errno: 0
20:20:39 [...]: raising sqlstate 57P02 (sqlcode -400): the connection to the server was lost on line 941
20:20:39 [...]: ECPGtrans on line 768: action "rollback work"; connection "0"
20:20:39 [...]: ecpg_check_PQresult on line 768: no result - no connection to the server
20:20:39 [...]: raise_backend start: conn - no connection to the server
20:20:39 [...]: raise_backend start: conn - 1
20:20:39 [...]: raise_backend: result NULL
20:20:39 [...]: raise_backend: sqlstate INTERNAL_ERROR
20:20:39 [...]: ecpg_raise_backend sqlstate: 57P02 messase: the connection to the server was lost errno: 0
20:20:39 [...]: raising sqlstate 57P02 (sqlcode -400): the connection to the server was lost on line 768
20:20:39 [...]: ecpg_finish: connection 0 closed

DB server log:
...
20:20:31.227 ... PostgreSQL JDBC Driver) WARNING: 57P02: xxx (...)
...

Case(b) core dump
--------------
Client log:
...
18:38:17 [...]: ecpg_execute on line 561: using PQexec
18:38:17 [...]: ecpg_process_output on line 561: OK: CLOSE CURSOR
18:38:17 [...]: deallocate_one on line 562: name stmid
18:38:17 [...]: ECPGtrans on line 716: action "commit work"; connection "0"
18:40:15 [...]: ECPGnoticeReceiver: xxx (...)
18:40:15 [...]: raising sqlcode 0
18:40:15 [...]: ecpg_check_PQresult on line 941: bad response - server closed the connection unexpectedly
18:40:15 This probably means the server terminated abnormally
18:40:15 before or while processing the request.
18:40:15 [...]: raise_backend start: conn - server closed the connection unexpectedly
18:40:15 This probably means the server terminated abnormally
18:40:15 before or while processing the request.
18:40:15 [...]: raise_backend start: conn - 0
18:40:15 [...]: raise_backend: result not NULL
18:40:15 [...]: raise_backend: sqlstate NULL
18:40:15 [...]: raise_backend: message NULL
18:40:15 [...]: raise_backend: sqlstate INTERNAL_ERROR
18:40:16 // No subsequent log messages due to the core dump //

DB server log:
...
18:38:17.952 ... PostgreSQL JDBC Driver) WARNING: 57P02: xxx (...)
...

The ecpg_raise_backend code used with debug log:

void
ecpg_raise_backend(int line, PGresult *result, PGconn *conn, int compat)
{
    struct sqlca_t *sqlca = ECPGget_sqlca();
    char *sqlstate;
    char *message;

    /* debug */
    ecpg_log("raise_backend start: conn - %s", PQerrorMessage(conn));
    ecpg_log("raise_backend start: conn - %d\n", PQstatus(conn));
    ...
    if (result)
    {
        sqlstate = PQresultErrorField(result, PG_DIAG_SQLSTATE);
        /* debug */
        ecpg_log("raise_backend: result not NULL\n");
        /* debug end */
        if (sqlstate == NULL)
        {
            /* debug */
            ecpg_log("raise_backend: sqlstate NULL\n");
            /* debug end */
            sqlstate = ECPG_SQLSTATE_ECPG_INTERNAL_ERROR;
        }
        /* debug */
        message = PQresultErrorField(result, PG_DIAG_MESSAGE_PRIMARY);
        /* debug */
        if (message == NULL)
        {
            ecpg_log("raise_backend: message NULL\n");
        }
        /* debug end */
    }
    else
    {
        /* debug */
        ecpg_log("raise_backend: result NULL\n");
        /* debug end */
        sqlstate = ECPG_SQLSTATE_ECPG_INTERNAL_ERROR;
        message = PQerrorMessage(conn);
    }

    if (strcmp(sqlstate, ECPG_SQLSTATE_ECPG_INTERNAL_ERROR) == 0)
    {
        /*
         * we might get here if the connection breaks down, so let's check for
         * this instead of giving just the generic internal error
         */
        /* debug */
        ecpg_log("raise_backend: sqlstate INTERNAL_ERROR\n");
        /* debug end */
        if (PQstatus(conn) == CONNECTION_BAD)
        {
            sqlstate = "57P02";
            message = ecpg_gettext("the connection to the server was lost");
        }
    }

    /* Debug start */
    ecpg_log("ecpg_raise_backend sqlstate: %s messase: %s errno: %d\n", sqlstate, message, errno);
    /* Debug end */

    /* copy error message */
    snprintf(sqlca->sqlerrm.sqlerrmc, sizeof(sqlca->sqlerrm.sqlerrmc), "%s on line %d", message, line);
    sqlca->sqlerrm.sqlerrml = strlen(sqlca->sqlerrm.sqlerrmc);
    ...

Regards,
Masa
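The difference between Case (a) and Case (b) can be reproduced on paper with a small model of the message selection traced in the report. This is only a sketch: the libpq calls are replaced by plain parameters, and every name below (model_pick_message, MODEL_CONNECTION_OK, and so on) is hypothetical. "YE000" is the internal-error SQLSTATE that appears later in this thread.

```c
#include <stddef.h>
#include <string.h>

/* Stand-ins for libpq state; these names are hypothetical, not libpq's API. */
enum model_conn_status { MODEL_CONNECTION_OK, MODEL_CONNECTION_BAD };

#define MODEL_INTERNAL_ERROR "YE000"

/*
 * Model of how the unpatched code picks the message.
 *   have_result:    nonzero when a result object exists
 *   field_sqlstate: value of the SQLSTATE field, NULL when absent
 *   field_message:  value of the primary-message field, NULL when absent
 *   conn_error:     connection-level error text
 * Returns the pointer that would later be handed to "%s".
 */
const char *
model_pick_message(int have_result,
                   const char *field_sqlstate,
                   const char *field_message,
                   const char *conn_error,
                   enum model_conn_status status)
{
    const char *sqlstate;
    const char *message;

    if (have_result)
    {
        sqlstate = field_sqlstate ? field_sqlstate : MODEL_INTERNAL_ERROR;
        message = field_message;    /* stays NULL if the field is absent */
    }
    else
    {
        sqlstate = MODEL_INTERNAL_ERROR;
        message = conn_error;
    }

    /* Only a connection already marked bad replaces the message. */
    if (strcmp(sqlstate, MODEL_INTERNAL_ERROR) == 0 &&
        status == MODEL_CONNECTION_BAD)
        message = "the connection to the server was lost";

    return message;
}
```

In this model, a result without error fields combined with MODEL_CONNECTION_BAD yields the "connection lost" text, as in Case (a), while the same result with MODEL_CONNECTION_OK returns NULL, which matches the "message NULL" line logged just before the crash in Case (b).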
RE: BUG #17421: Core dump in ECPGdo() when calling PostgreSQL API from 32-bit client for RHEL8
From
"hirose.masay-01@fujitsu.com"
Date:
Hello,

I could not find out the root cause of this case. Instead I propose a workaround fix for ecpglib not to generate the core file in snprintf(). I expect the fix would avoid most of irregular cases. Could you review the fix and include it to the latest sources both of Postgres 12.x and 14.x?

I explain the problem and the fix below:

[Problem]
The potential problem is no value is set to "message" after PQresultErrorField() is called in ecpg_raise_backend().

src\interfaces\ecpg\ecpglib\error.c
----------------------------
    if (result)
    {
        sqlstate = PQresultErrorField(result, PG_DIAG_SQLSTATE);
        if (sqlstate == NULL)
            sqlstate = ECPG_SQLSTATE_ECPG_INTERNAL_ERROR;
        message = PQresultErrorField(result, PG_DIAG_MESSAGE_PRIMARY);
    }
----------------------------

I take another function as an example, ECPGnoticeReceiver() in connect.c, which has similar code. The value "empty message text" is set to "message" just in case. I propose to add the same code to ecpg_raise_backend() as ECPGnoticeReceiver().

src\interfaces\ecpg\ecpglib\connect.c
----------------------------
static void
ECPGnoticeReceiver(void *arg, const PGresult *result)
{
    char *sqlstate = PQresultErrorField(result, PG_DIAG_SQLSTATE);
    char *message = PQresultErrorField(result, PG_DIAG_MESSAGE_PRIMARY);

    if (sqlstate == NULL)
        sqlstate = ECPG_SQLSTATE_ECPG_INTERNAL_ERROR;

    if (message == NULL) /* Shouldn't happen, but need to be sure */
        message = ecpg_gettext("empty message text");
----------------------------

[Fix]
The fix is to add 2 lines with changebar below in ecpg_raise_backend():

src\interfaces\ecpg\ecpglib\error.c (version12.1 line:237)
----------------------------
    if (result)
    {
        sqlstate = PQresultErrorField(result, PG_DIAG_SQLSTATE);
        if (sqlstate == NULL)
            sqlstate = ECPG_SQLSTATE_ECPG_INTERNAL_ERROR;
        message = PQresultErrorField(result, PG_DIAG_MESSAGE_PRIMARY);
|       if (message == NULL)
|           message = ecpg_gettext("empty message text");
    }
----------------------------

[Test result]
With the fix, I confirmed the snprintf() did not generate the core and returned the sqlstate "YE000" as expected.

Regards,
Masa
Re: BUG #17421: Core dump in ECPGdo() when calling PostgreSQL API from 32-bit client for RHEL8
From
Tom Lane
Date:
"hirose.masay-01@fujitsu.com" <hirose.masay-01@fujitsu.com> writes:
> [Problem]
> The potential problem is no value is set to "message" after PQresultErrorField() is called in ecpg_raise_backend().

Ah-hah. You're right, and this explains the symptoms exactly, because libpq-generated error results don't contain broken-down fields, so we'd get a null precisely in cases such as lost connection.

> | if (message == NULL)
> | message = ecpg_gettext("empty message text");

No, that'd be pretty unhelpful. The best response is to substitute PQerrorMessage(conn) in such cases. We do that in, for example, postgres_fdw.

(I don't feel a need to change ECPGnoticeReceiver, because that only deals with NOTICE results which should always have such a field, and PQerrorMessage wouldn't be relevant to a non-ERROR result anyway.)

Will fix, thanks for the report!

regards, tom lane

diff --git a/src/interfaces/ecpg/ecpglib/error.c b/src/interfaces/ecpg/ecpglib/error.c
index cd6c6a6819..26fdcdb69e 100644
--- a/src/interfaces/ecpg/ecpglib/error.c
+++ b/src/interfaces/ecpg/ecpglib/error.c
@@ -229,18 +229,17 @@ ecpg_raise_backend(int line, PGresult *result, PGconn *conn, int compat)
         return;
     }
 
-    if (result)
-    {
-        sqlstate = PQresultErrorField(result, PG_DIAG_SQLSTATE);
-        if (sqlstate == NULL)
-            sqlstate = ECPG_SQLSTATE_ECPG_INTERNAL_ERROR;
-        message = PQresultErrorField(result, PG_DIAG_MESSAGE_PRIMARY);
-    }
-    else
-    {
+    /*
+     * PQresultErrorField will return NULL if "result" is NULL, or if there is
+     * no such field, which will happen for libpq-generated errors. Fall back
+     * to PQerrorMessage in such cases.
+     */
+    sqlstate = PQresultErrorField(result, PG_DIAG_SQLSTATE);
+    if (sqlstate == NULL)
         sqlstate = ECPG_SQLSTATE_ECPG_INTERNAL_ERROR;
+    message = PQresultErrorField(result, PG_DIAG_MESSAGE_PRIMARY);
+    if (message == NULL)
         message = PQerrorMessage(conn);
-    }
 
     if (strcmp(sqlstate, ECPG_SQLSTATE_ECPG_INTERNAL_ERROR) == 0)
     {
RE: BUG #17421: Core dump in ECPGdo() when calling PostgreSQL API from 32-bit client for RHEL8
From
"hirose.masay-01@fujitsu.com"
Date:
Hello,

Thanks a lot for review and patch. With the patch I will test to verify my issue is fixed.
Kindly let me know if you have expected release month for the fix.

Regards,
Masa
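For readers who want to check the patched behaviour without a failing server, the fallback in Tom Lane's patch can be modeled in a few lines. This is a sketch under stated assumptions, not the ecpglib source: the libpq calls are replaced by plain parameters, and the names model_error, model_resolve_error and PATCHED_INTERNAL_ERROR are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* "YE000" is the internal-error SQLSTATE mentioned earlier in the thread. */
#define PATCHED_INTERNAL_ERROR "YE000"

/* Hypothetical pair holding what would be copied into sqlca. */
struct model_error
{
    const char *sqlstate;
    const char *message;
};

/*
 * Model of the patched selection logic.
 *   field_sqlstate / field_message: error fields of the result, NULL when the
 *       result is missing or carries no such field
 *   conn_error: connection-level error text, used as the fallback
 *   connection_bad: nonzero when the connection status is "bad"
 */
struct model_error
model_resolve_error(const char *field_sqlstate,
                    const char *field_message,
                    const char *conn_error,
                    int connection_bad)
{
    struct model_error err;

    err.sqlstate = field_sqlstate ? field_sqlstate : PATCHED_INTERNAL_ERROR;
    /* The fallback: use the connection-level text when the field is absent. */
    err.message = field_message ? field_message : conn_error;

    if (strcmp(err.sqlstate, PATCHED_INTERNAL_ERROR) == 0 && connection_bad)
    {
        err.sqlstate = "57P02";
        err.message = "the connection to the server was lost";
    }
    return err;
}
```

Calling model_resolve_error(NULL, NULL, "server closed the connection unexpectedly", 0) corresponds to Case (b): the SQLSTATE stays "YE000", but the message is now the connection-level text rather than NULL, so the later "%s" conversion has a valid string to print.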