Thread: Is PREPARE of ecpglib thread safe?
Hi I'm afraid that PREPARE of ecpglib is not thread safe. The following global variables are modified without any locking system. Is it unnecessary worry? [interfaces/ecpg/ecpglib/prepare.c] static int nextStmtID = 1; static stmtCacheEntry *stmtCacheEntries = NULL; static struct declared_statement *g_declared_list; Regards Ryo Matsumura
Hello. At Thu, 14 Mar 2019 07:17:08 +0000, "Matsumura, Ryo" <matsumura.ryo@jp.fujitsu.com> wrote in <03040DFF97E6E54E88D3BFEE5F5480F737AC390D@G01JPEXMBYT04> > Hi > > I'm afraid that PREPARE of ecpglib is not thread safe. > The following global variables are modified without any locking system. > Is it unnecessary worry? > > [interfaces/ecpg/ecpglib/prepare.c] > static int nextStmtID = 1; > static stmtCacheEntry *stmtCacheEntries = NULL; > static struct declared_statement *g_declared_list; A connection cannot be concurrently used by multiple threads so the programmer must guard connections using mutex [1] or friends. If it is done by a single mutex (I suppose it is common.), there's no race condition also on the prepared statement storage. I'm not sure it is explicitly aimed but I suppose that there's no problem in a common usage of the library. [1] https://www.postgresql.org/docs/current/ecpg-connect.html > If your application uses multiple threads of execution, they > cannot share a connection concurrently. You must either > explicitly control access to the connection (using mutexes) or > use a connection for each thread. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Horiguchi-san Thank you for your comment. > A connection cannot be concurrently used by multiple threads so > the programmer must guard connections using mutex [1] or > friends. If it is done by a single mutex (I suppose it is > common.), there's no race condition also on the prepared > statement storage. I'm not sure it is explicitly aimed but I > suppose that there's no problem in a common usage of the library. I understand it, but current scope of StatementCache and DeclareStatementList seems not to be limitted within each connection, isn't it? Therefore, I thought the operation on them must be thread safe. For example, scope of DescriptorList in descriptor.c is within thread (not connection) by using pthread_getspecific/ pthread_setspecific(). Regards Ryo Matsumura
At Thu, 14 Mar 2019 09:49:11 +0000, "Matsumura, Ryo" <matsumura.ryo@jp.fujitsu.com> wrote in <03040DFF97E6E54E88D3BFEE5F5480F737AC3AD8@G01JPEXMBYT04> > Horiguchi-san > > Thank you for your comment. > > > A connection cannot be concurrently used by multiple threads so > > the programmer must guard connections using mutex [1] or > > friends. If it is done by a single mutex (I suppose it is > > common.), there's no race condition also on the prepared > > statement storage. I'm not sure it is explicitly aimed but I > > suppose that there's no problem in a common usage of the library. > > I understand it, but current scope of StatementCache and DeclareStatementList seems not > to be limitted within each connection, isn't it? Yes, so I wrote that "if it is done by a single mutex". Feel free to fix that if it is still problematic:) > Therefore, I thought the operation on them must be thread safe. I'm not against that. > For example, scope of DescriptorList in descriptor.c is within thread (not connection) > by using pthread_getspecific/ pthread_setspecific(). It seems like a local cache of server-side data, which is similar to catcache on server side for each process. I don't think prepared statements is in that category. A prepared statement is bonded to a connection, not to a thread. Different threads can execute the same prepared statement on the same connection. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Hi Horiguchi-san, Kuroda-san Horiguchi-san, thank you for your comment. I have a question. A bug of StatementCache is occurred in previous versions. Should a patch be separated? > Horiguchi-san wrote: > It seems like a local cache of server-side data, which is similar > to catcache on server side for each process. I agree. I will fix it with using pthread_setspecific like descriptor.c. > I don't think > prepared statements is in that category. A prepared statement is > bonded to a connection, not to a thread. Different threads can > execute the same prepared statement on the same connection. A namespace of declared statement is not connection independent. Therefore, we must manage the namespce in global and consider about race condition. For example, ecpglib must refer the information of (A) when ecpglib executes (B) in order to occur "double declare" error. (A) exec sql at conn1 declare st1 statement; (B) exec sql at conn2 declare st1 statement; // If ecpglib didn't reject the above, ecpglib cannot judge // which connection the followings should be executed on. exec sql prepare st1 from "select 1"; exec sql execute st1; Kuroda-san, is it right? If it's right, I will fix it with using pthread_lock. Regards Ryo Matsumura
At Fri, 15 Mar 2019 05:27:01 +0000, "Matsumura, Ryo" <matsumura.ryo@jp.fujitsu.com> wrote in <03040DFF97E6E54E88D3BFEE5F5480F737AC3F24@G01JPEXMBYT04> > Hi Horiguchi-san, Kuroda-san > > Horiguchi-san, thank you for your comment. > > I have a question. > A bug of StatementCache is occurred in previous versions. > Should a patch be separated? > > > Horiguchi-san wrote: > > It seems like a local cache of server-side data, which is similar > > to catcache on server side for each process. > > I agree. > I will fix it with using pthread_setspecific like descriptor.c. > > > I don't think > > prepared statements is in that category. A prepared statement is > > bonded to a connection, not to a thread. Different threads can > > execute the same prepared statement on the same connection. > > A namespace of declared statement is not connection independent. > Therefore, we must manage the namespce in global and consider about race condition. Right, and but thread independent. > For example, ecpglib must refer the information of (A) when ecpglib executes (B) > in order to occur "double declare" error. > > (A) exec sql at conn1 declare st1 statement; > (B) exec sql at conn2 declare st1 statement; On an interactive SQL environment like psql, we can declar pareared statements with the same name on different connections. Do you mean you are going to implement different way on ECPG? Actually the current ECPGprepare seems already managing prepared statements separately for each connections. This is naturally guarded by per-connection concurrencly control that applications should do. > this = ecpg_find_prepared_statement(name, con, &prev); What you showed at the beginning of this thread was the sutff for auto prepare, the name of which is generated using the global variable nextStmtID and stored into global stmtCacheEntries. They are not guarded at all and seem to need multithread-protection. > // If ecpglib didn't reject the above, ecpglib cannot judge > // which connection the followings should be executed on. > exec sql prepare st1 from "select 1"; > exec sql execute st1; I'm not sure about ECPG, but according to the documentation, the following statements should work correctly. SQL SET CONNECTION con1; EXEC SQL PREPARE st1 FROM "select 1"; EXEC SQL EXECUTE st1; should succeed and executed on con1. > Kuroda-san, is it right? > If it's right, I will fix it with using pthread_lock. Mmm. Are you saying that prepared statements on ECPG should have names in global namespace and EXECUTE should implicitly choose the underlying connection automatically from the name of a prepared statement? I don't think it is the right direction. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Oops. At Fri, 15 Mar 2019 15:33:50 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in <20190315.153350.226491548.horiguchi.kyotaro@lab.ntt.co.jp> > > // If ecpglib didn't reject the above, ecpglib cannot judge > > // which connection the followings should be executed on. > > exec sql prepare st1 from "select 1"; > > exec sql execute st1; > > I'm not sure about ECPG, but according to the documentation, the > following statements should work correctly. > > SQL SET CONNECTION con1; Of course this is missing prefixing "EXEC". > EXEC SQL PREPARE st1 FROM "select 1"; > EXEC SQL EXECUTE st1; > > should succeed and executed on con1. > > > Kuroda-san, is it right? > > If it's right, I will fix it with using pthread_lock. > > Mmm. Are you saying that prepared statements on ECPG should have > names in global namespace and EXECUTE should implicitly choose > the underlying connection automatically from the name of a > prepared statement? I don't think it is the right direction. -- Kyotaro Horiguchi NTT Open Source Software Center
Horiguchi-san, Kuroda-san > Horiguchi-san wrote: > > A namespace of declared statement is not connection independent. > > Therefore, we must manage the namespce in global and consider about race condition. > > Right, and but thread independent. I was wrong. I understand that DECLARE STATEMENT should be same policy as the combination of PREPARE STATEMENT and SET CONNECTION. We should fix the current implementation of DECLARE STATEMENT. Current: t1:Thread1: exec sql at conn1 declare st1 statement; t2:Thread2: exec sql at conn2 declare st1 statement; // NG ToBe: t1:Thread1: exec sql at conn1 declare st1 statement; t2:Thread2: exec sql at conn2 declare st1 statement; // OK t3:Thread2: exec sql prepared st1 from "select 1"; // OK: prepared on conn2 t4:Thread1: exec sql execute st1; // NG: not prepared t5:Thread2: exec sql execute st1; // OK: executed on conn2 t1:Thread1: exec sql at conn1 declare st1 statement; t2:Thread1: exec sql at conn2 declare st1 statement; // NG Regards Ryo Matsumura
Dear Matsumura-san, Horiguchi-san, We should check the specfication of Pro*c before coding because this follows its feature. I'll test some cases and send you a simple report. Best Regards, Hayato Kuroda Fujitsu LIMITED
Dear all, I apologize for my late reply. I realized that the current implementation could not imitate the oracle's precompiler. The attached example can be accepted by both precompilers, but building the c file made by ecpg fails. For handling this source, we have to refactor some sources related with DECLARE STATEMENT. My draft amendment is converting the sentence from executable to declarative, that is: * change to operate only if a pgc file is precompiled * remove related code from ecpglib directory In this case, the namespace of a SQL identifier is file independent, and sources becomes more simple. I will start making patches. Do you have any comments or suggestions? Best Regards. Hayato Kuroda Fujitsu LIMITED