Thread: Reconnect a single connection used by multiple threads in embedded SQL in C application causes error.

Reconnect a single connection used by multiple threads in embedded SQL in C application causes error.

From
"egashira.yusuke@fujitsu.com"
Date:
There seems to be something wrong with using a single connection in multiple threads in embedded SQL in C.

Our customer is writing an embedded SQL in C application in Postgresql 12, which handles a single connection from
multiplethreads. 
In this application, database operations from each threads are serialized and do not operate on the connection at the
sametime. 
However, reconnecting (DISCONNECT and CONNECT) in one thread causes SQL execution in the other thread to fail with
followingmessage. 

  the connection to the server was lost

An overview of the application process is provided below. Attached is a simple reproduction application.
I checked this with PostgreSQL 12.9.

1. Thread#1 EXEC SQL CONNECT;
2. Thread#2 EXEC SQL PREPARE / EXECUTE;
3. Thread#2 EXEC SQL DISCONNECT;
4. Thread#2 EXEC SQL CONNECT;
5. Thread#1 EXEC SQL PREPARE / EXECUTE;
6. Thread#1 EXEC SQL DISCONNECT;
 -> In step 5, Thread#1 takes above error.

Document "35.2.2. Choosing a Connection" (*) says the following notes on choosing a connection and multithreading.

> SQL statements in embedded SQL programs are by default executed on the current connection, that is, the most recently
openedone. 

> If your application uses multiple threads of execution, they cannot share a connection concurrently. You must either
explicitlycontrol access to the connection using mutexes) or use a connection for each thread. 

The reproduction application conforms to these notes.
Each thread is using a recently opened connection, and they are not using a connection at the same time.
However, the SQL execution in step 5 (Thread#1) does not use the connection opened in step 4 (Thread#2) and returns
"theconnection to the server was lost". 
# And the DISCONNECT in step 6(Thread#1) causes application crash by double free.
#   *** Error in `./app': double free or corruption (fasttop): 0x00007f12bc009780 ***

When adding the connection-name to each CONNECT and executing a statement to switch the current connection (EXEC SQL
SETCONNECTION connection-name;) before step 5,the SQL of step 5 could be executed. 
Do we need to create applications this way?
And, is there limitation that we can't CONNECT or DISCONNECT the DEFAULT connection inside a thread in multithreaded
application?

[*] https://www.postgresql.org/docs/12/ecpg-connect.html#ECPG-SET-CONNECTION


Regards,
Yusuke Egashira

Attachment
On Mon, Feb 21, 2022 at 12:09:46PM +0000, egashira.yusuke@fujitsu.com wrote:
> There seems to be something wrong with using a single connection in multiple threads in embedded SQL in C.
> 
> Our customer is writing an embedded SQL in C application in Postgresql 12, which handles a single connection from
multiplethreads.
 
> In this application, database operations from each threads are serialized and do not operate on the connection at the
sametime.
 
> However, reconnecting (DISCONNECT and CONNECT) in one thread causes SQL execution in the other thread to fail with
followingmessage.
 
> 
>   the connection to the server was lost
> 
> An overview of the application process is provided below. Attached is a simple reproduction application.
> I checked this with PostgreSQL 12.9.
> 
> 1. Thread#1 EXEC SQL CONNECT;
> 2. Thread#2 EXEC SQL PREPARE / EXECUTE;
> 3. Thread#2 EXEC SQL DISCONNECT;
> 4. Thread#2 EXEC SQL CONNECT;
> 5. Thread#1 EXEC SQL PREPARE / EXECUTE;
> 6. Thread#1 EXEC SQL DISCONNECT;
>  -> In step 5, Thread#1 takes above error.
> 
> Document "35.2.2. Choosing a Connection" (*) says the following notes on choosing a connection and multithreading.
> 
> > SQL statements in embedded SQL programs are by default executed on the current connection, that is, the most
recentlyopened one.
 
> 
> > If your application uses multiple threads of execution, they cannot share a connection concurrently. You must
eitherexplicitly control access to the connection using mutexes) or use a connection for each thread.
 
> 
> The reproduction application conforms to these notes.
> Each thread is using a recently opened connection, and they are not using a connection at the same time.
> However, the SQL execution in step 5 (Thread#1) does not use the connection opened in step 4 (Thread#2) and returns
"theconnection to the server was lost".
 
> # And the DISCONNECT in step 6(Thread#1) causes application crash by double free.
> #   *** Error in `./app': double free or corruption (fasttop): 0x00007f12bc009780 ***
> 
> When adding the connection-name to each CONNECT and executing a statement to switch the current connection (EXEC SQL
SETCONNECTION connection-name;) before step 5,the SQL of step 5 could be executed.
 
> Do we need to create applications this way?

For now, yes.

> And, is there limitation that we can't CONNECT or DISCONNECT the DEFAULT connection inside a thread in multithreaded
application?

For now, yes.  This is a bug.  The documentation you quoted is out of date
with respect to the code.  That documentation comes from commit 8f61184
(2003-12).  The behavior of connections shared across threads changed in
commit 757fb0e (2004-03).  Since that change, there's a per-thread record of
the default connection, in addition to a global notion of the default
connection.  I suspect a few use cases work well today:

- One thread does all the CONNECT and DISCONNECT commands.  Other threads just
  use the default connection, with mutual exclusion.
- Each thread does its own CONNECT, DISCONNECT, and other commands.  Each
  thread has its own default connection.  No sharing at all.

Your example does the CONNECT to establish the default connection in one
thread, and it does the DISCONNECT of the default connection in another
thread.  That's buggy today.

Another behavior that may qualify as a bug: ECPGsetconn() updates only the
thread-specific connection, while ECPGconnect() updates both the
thread-specific and global connections.



RE: Reconnect a single connection used by multiple threads in embedded SQL in C application causes error.

From
"egashira.yusuke@fujitsu.com"
Date:
Hi Noah,

Thank you for replying.

> > And, is there limitation that we can't CONNECT or DISCONNECT the DEFAULT connection inside a thread in
multithreadedapplication? 
>
> For now, yes.  This is a bug.  The documentation you quoted is out of date
> with respect to the code.  That documentation comes from commit 8f61184
> (2003-12).  The behavior of connections shared across threads changed in
> commit 757fb0e (2004-03).  Since that change, there's a per-thread record of
> the default connection, in addition to a global notion of the default
> connection.  I suspect a few use cases work well today:
>
> - One thread does all the CONNECT and DISCONNECT commands.  Other threads just
>   use the default connection, with mutual exclusion.
> - Each thread does its own CONNECT, DISCONNECT, and other commands.  Each
>   thread has its own default connection.  No sharing at all.

The documentation from commit 8f61184 seems to have been updated by the commit 0f33ee0 (2017-06).
However, there is still lacking of information about the DEFAULT connection on per-thread, I understood.


> Your example does the CONNECT to establish the default connection in one
> thread, and it does the DISCONNECT of the default connection in another
> thread.  That's buggy today.
>
> Another behavior that may qualify as a bug: ECPGsetconn() updates only the
> thread-specific connection, while ECPGconnect() updates both the
> thread-specific and global connections.

I also understood that my example is impossible by the implementation bugs.

I think it would take some time to fix these bugs.
If it takes long time to fix it, will this limitation(*) be added to the documentation?

[*] Based on your explanation:
    "If we use the default connection inside a thread in multithreaded application,
     CONNECT and DISCONNECT commands are allowed only following cases.
     - One thread does all the CONNECT and DISCONNECT commands.
       Other threads just use the default connection, with mutual exclusion.
     - Each thread does its own CONNECT, DISCONNECT, and other commands.
       Each thread has its own default connection. No sharing at all."

----

An additional report:
I also ran my reproduction application built with rh-postgresql12 libraries (installed from Red Hat Software
Collections), 
and it crashed instead of an error in step 5 in my first email.

* Actually, the error message in my first email came from postgresql built by myself (./configure --prefix=xxx).

The backtrace is following.
(gdb) bt
#0  0x00007fa11a4ddd0f in ecpg_find_prepared_statement () from
/opt/rh/rh-postgresql12/root/usr//lib64/libecpg.so.rh-postgresql12-6
#1  0x00007fa11a4de11b in ecpg_prepared () from /opt/rh/rh-postgresql12/root/usr//lib64/libecpg.so.rh-postgresql12-6
#2  0x00007fa11a4d8b65 in ecpg_do_prologue () from /opt/rh/rh-postgresql12/root/usr//lib64/libecpg.so.rh-postgresql12-6
#3  0x00007fa11a4d8c33 in ecpg_do () from /opt/rh/rh-postgresql12/root/usr//lib64/libecpg.so.rh-postgresql12-6
#4  0x00007fa11a4d8d5b in ECPGdo () from /opt/rh/rh-postgresql12/root/usr//lib64/libecpg.so.rh-postgresql12-6
#5  0x0000000000401143 in thread_main2 ()
#6  0x00007fa11a2bdea5 in start_thread () from /lib64/libpthread.so.0
#7  0x00007fa119fe68cd in clone () from /lib64/libc.so.6

I think this happened because the PGconn referenced by Thread#1 was destroyed in step 4 and then overwritten with
invaliddata. 

Regards,
Yusuke Egashira




On Tue, Feb 22, 2022 at 12:06:23PM +0000, egashira.yusuke@fujitsu.com wrote:
> > > And, is there limitation that we can't CONNECT or DISCONNECT the DEFAULT connection inside a thread in
multithreadedapplication?
 
> > 
> > For now, yes.  This is a bug.  The documentation you quoted is out of date
> > with respect to the code.  That documentation comes from commit 8f61184
> > (2003-12).  The behavior of connections shared across threads changed in
> > commit 757fb0e (2004-03).  Since that change, there's a per-thread record of
> > the default connection, in addition to a global notion of the default
> > connection.  I suspect a few use cases work well today:
> > 
> > - One thread does all the CONNECT and DISCONNECT commands.  Other threads just
> >   use the default connection, with mutual exclusion.
> > - Each thread does its own CONNECT, DISCONNECT, and other commands.  Each
> >   thread has its own default connection.  No sharing at all.
> 
> The documentation from commit 8f61184 seems to have been updated by the commit 0f33ee0 (2017-06).
> However, there is still lacking of information about the DEFAULT connection on per-thread, I understood.

Yes.

> > Your example does the CONNECT to establish the default connection in one
> > thread, and it does the DISCONNECT of the default connection in another
> > thread.  That's buggy today.
> > 
> > Another behavior that may qualify as a bug: ECPGsetconn() updates only the
> > thread-specific connection, while ECPGconnect() updates both the
> > thread-specific and global connections.
> 
> I also understood that my example is impossible by the implementation bugs.
> 
> I think it would take some time to fix these bugs. 

Yes.  I suspect the hard part will be picking a specification that makes your
case work better without breaking other important cases that work today.

> If it takes long time to fix it, will this limitation(*) be added to the documentation?
> 
> [*] Based on your explanation:
>     "If we use the default connection inside a thread in multithreaded application, 
>      CONNECT and DISCONNECT commands are allowed only following cases.
>      - One thread does all the CONNECT and DISCONNECT commands. 
>        Other threads just use the default connection, with mutual exclusion.
>      - Each thread does its own CONNECT, DISCONNECT, and other commands. 
>        Each thread has its own default connection. No sharing at all."

Would you like to propose a patch?

> An additional report: 
> I also ran my reproduction application built with rh-postgresql12 libraries (installed from Red Hat Software
Collections),
 
> and it crashed instead of an error in step 5 in my first email.
> 
> * Actually, the error message in my first email came from postgresql built by myself (./configure --prefix=xxx).
> 
> The backtrace is following.
> (gdb) bt
> #0  0x00007fa11a4ddd0f in ecpg_find_prepared_statement () from
/opt/rh/rh-postgresql12/root/usr//lib64/libecpg.so.rh-postgresql12-6
> #1  0x00007fa11a4de11b in ecpg_prepared () from /opt/rh/rh-postgresql12/root/usr//lib64/libecpg.so.rh-postgresql12-6
> #2  0x00007fa11a4d8b65 in ecpg_do_prologue () from
/opt/rh/rh-postgresql12/root/usr//lib64/libecpg.so.rh-postgresql12-6
> #3  0x00007fa11a4d8c33 in ecpg_do () from /opt/rh/rh-postgresql12/root/usr//lib64/libecpg.so.rh-postgresql12-6
> #4  0x00007fa11a4d8d5b in ECPGdo () from /opt/rh/rh-postgresql12/root/usr//lib64/libecpg.so.rh-postgresql12-6
> #5  0x0000000000401143 in thread_main2 ()
> #6  0x00007fa11a2bdea5 in start_thread () from /lib64/libpthread.so.0
> #7  0x00007fa119fe68cd in clone () from /lib64/libc.so.6
> 
> I think this happened because the PGconn referenced by Thread#1 was destroyed in step 4 and then overwritten with
invaliddata.
 

That makes sense.



RE: Reconnect a single connection used by multiple threads in embedded SQL in C application causes error.

From
"egashira.yusuke@fujitsu.com"
Date:
Hi Noah,

> Would you like to propose a patch?

Sure. I will work on creating a document patch.

However, I found another problem while testing to make sure that the use cases were valid.

>      - Each thread does its own CONNECT, DISCONNECT, and other commands.
>        Each thread has its own default connection. No sharing at all."

In that case, if I execute CONNECT with same connection-string on each threads without connection-name, the later
CONNECTwas skipped. 
Then, ECPGdebug shows following message.

  ECPGconnect: connection identifier (null) is already in use

This seems to indicate that CONNECT need to specify connection-name even if we use the default connection in each
thread.
This also seems to me to be an undesirable behavior in multiple thread.
Or do we need to name all connections if we use multiple connections, even in multiple thread?


Regards,
Yusuke Egashira




Hi Noah,

> > Would you like to propose a patch?
>
> Sure. I will work on creating a document patch.

I reconfirmed the conditions under which this issue occurred.
Since this happens when the current connection is disconnected by another thread in a multiple threaded application,
I added an information about "the current connection in a multiple threaded application" to explain the limitations.

I attached the patch to this mail.
I reported a version of PostgreSQL 12, but this patch is based on the master branch.

Regards,
Yusuke Egashira


Attachment
At Wed, 2 Mar 2022 12:43:34 +0000, "egashira.yusuke@fujitsu.com" <egashira.yusuke@fujitsu.com> wrote in 
> --_002_TYWPR01MB72027307B41182F8A12524ADFF039TYWPR01MB7202jpnp_
> Content-Type: text/plain; charset="iso-2022-jp"
> Content-Transfer-Encoding: quoted-printable
> 
> Hi Noah,=20
> 
> > > Would you like to propose a patch?
> >=20
> > Sure. I will work on creating a document patch.
> 
> I reconfirmed the conditions under which this issue occurred.
> Since this happens when the current connection is disconnected by another t=
> hread in a multiple threaded application,
> I added an information about "the current connection in a multiple threaded=
>  application" to explain the limitations.
> 
> I attached the patch to this mail.
> I reported a version of PostgreSQL 12, but this patch is based on the maste=
> r branch.

+   The most recently opened connection becomes the current connection.
+   If your application uses multiple threads, each thread has its own
+   current connection, and it is the most recently opened connection
+   in each thread.  Threads that have not yet opened the current
+   connection treat the most recently opened connection in the
+   application as the current connection.  The current connection is
+   used by default when an SQL statement is to be executed (see later
+   in this chapter).

It looks like too verbose.  Couldn't it be summarized as something
like this?

> Current connection is managed in a per-thread manner. For the first
> use on a thread, it is the most recently used one in the process.

+     If you execute an SQL statement without connection name in a thread in
+     multiple thread application, a thread must not do
+     <command>DISCONNECT</command> command for a connection which is
+     used as the current connection in other threads.  In particular,
+     you should follow one of the following two use cases.  The first is that
+     one thread does all the <command>CONNECT</command> and
+     <command>DISCONNECT</command> commands.  Other threads just use the
+     current connection, with mutual exclusion.  The second is that each
+     thread does its own <command>CONNECT</command>, <command>DISCONNECT</command>,
+     and other commands.  Each thread has its own current connection.
+     No sharing at all.

This doesn't look like a API documentation but a procedual
documentation.  IMO there's no particular reason that we should
officially concretely suggest how mulitple connectinos are managed on
somebody's application.  And part of the section is already in the
documentation.

|  <para>
|   If your application uses multiple threads of execution, they cannot share a
|   connection concurrently. You must either explicitly control access to the connection
|   (using mutexes) or use a connection for each thread.
|  </para>

So, it would boil down to the caution about disconnection.

> Note that connections are shared among all threads.  Be careful not
> to DISCONNECT a connection that is to be used in another thread.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Hi Horiguchi-san,

Thank you for the comment.
I attached the revised patch.

> It looks like too verbose.  Couldn't it be summarized as something
> like this?
>
> > Current connection is managed in a per-thread manner. For the first
> > use on a thread, it is the most recently used one in the process.

Yes.
I thought that I needed a detailed explanation of the current connection for the note below, but this certainly seems
tobe summable. 


> This doesn't look like a API documentation but a procedual
> documentation.  IMO there's no particular reason that we should
> officially concretely suggest how mulitple connectinos are managed on
> somebody's application.  And part of the section is already in the
> documentation.
>
> |  <para>
> |   If your application uses multiple threads of execution, they cannot share a
> |   connection concurrently. You must either explicitly control access to the connection
> |   (using mutexes) or use a connection for each thread.
> |  </para>
>
> So, it would boil down to the caution about disconnection.
>
> > Note that connections are shared among all threads.  Be careful not
> > to DISCONNECT a connection that is to be used in another thread.

I wanted to make a document that could avoid the use cases that I reported, so I tried to write a note according to the
occurrenceconditions. 
However, this was also too verbose, so I rewrote the note as you suggested.
And since there is no mention of the current connection, I changed the section to add it to "36.2.3. Closing a
Connection".

I think I might be better to add "In the worst case, this may cause an application crash." to this note so that users
canavoid the problem more strongly. 
However, this sentence can also be read as claiming that there is an implementation problem.
Should not I add this sentence?

I am also a little concerned that the revised text will be interpreted as prohibiting the following use cases:

> - One thread does all the CONNECT and DISCONNECT commands. Other threads just use the default connection, with mutual
exclusion.



Regards,
Yusuke Egashira

Attachment
On Fri, Feb 25, 2022 at 11:29:55AM +0000, egashira.yusuke@fujitsu.com wrote:
> > Would you like to propose a patch?
> 
> Sure. I will work on creating a document patch.
> 
> However, I found another problem while testing to make sure that the use cases were valid.
> 
> >      - Each thread does its own CONNECT, DISCONNECT, and other commands. 
> >        Each thread has its own default connection. No sharing at all."
> 
> In that case, if I execute CONNECT with same connection-string on each threads without connection-name, the later
CONNECTwas skipped.
 
> Then, ECPGdebug shows following message.
> 
>   ECPGconnect: connection identifier (null) is already in use
> 
> This seems to indicate that CONNECT need to specify connection-name even if we use the default connection in each
thread.
> This also seems to me to be an undesirable behavior in multiple thread.

Agreed.  It might have been a nice feature for every thread to have its own
nameless connection.

> Or do we need to name all connections if we use multiple connections, even in multiple thread?

Yes.

On Fri, Mar 04, 2022 at 11:49:39AM +0000, egashira.yusuke@fujitsu.com wrote:
> > It looks like too verbose.  Couldn't it be summarized as something
> > like this?
> > 
> > > Current connection is managed in a per-thread manner. For the first
> > > use on a thread, it is the most recently used one in the process.

There's no behavior specific to "the first use on a thread", and "most
recently used" is not a factor.  Avoid those phrases.  (In a thread that has
never opened a connection, each command that retrieves the current connection
will retrieve the non-thread-specific current connection.)

> Yes.
> I thought that I needed a detailed explanation of the current connection for the note below, but this certainly seems
tobe summable.
 

For this particular paragraph, I liked your v1's level of verbosity more than
this compact version.

> > This doesn't look like a API documentation but a procedual
> > documentation.  IMO there's no particular reason that we should
> > officially concretely suggest how mulitple connectinos are managed on
> > somebody's application.  And part of the section is already in the
> > documentation.

I'm okay with your removal of that paragraph.  As a general matter of
documentation strategy, it's valid for our manual to contain both API docs and
higher-level advice about known strategies for combining the APIs into an
effective program.

> > |  <para>
> > |   If your application uses multiple threads of execution, they cannot share a
> > |   connection concurrently. You must either explicitly control access to the connection
> > |   (using mutexes) or use a connection for each thread.
> > |  </para>
> > 
> > So, it would boil down to the caution about disconnection.
> > 
> > > Note that connections are shared among all threads.  Be careful not
> > > to DISCONNECT a connection that is to be used in another thread.
> 
> I wanted to make a document that could avoid the use cases that I reported, so I tried to write a note according to
theoccurrence conditions.
 
> However, this was also too verbose, so I rewrote the note as you suggested.
> And since there is no mention of the current connection, I changed the section to add it to "36.2.3. Closing a
Connection".
> 
> I think I might be better to add "In the worst case, this may cause an application crash." to this note so that users
canavoid the problem more strongly.
 
> However, this sentence can also be read as claiming that there is an implementation problem.
> Should not I add this sentence?

It's okay to admit implementation problems.  I might write "Behavior is
undefined if the current thread is not the thread that opened the connection."
Referring to undefined behavior is a customary way for API docs to imply that
a crash is possible.  If one wanted to be even more precise, "Behavior is
undefined if the connection is the thread-specific current connection of any
other thread."  However, the more-precise wording is harder to understand.
Also, being more precise now could limit our freedom to change behavior later.

> --- a/doc/src/sgml/ecpg.sgml
> +++ b/doc/src/sgml/ecpg.sgml
> @@ -222,10 +222,15 @@ EXEC SQL CONNECT TO <replaceable>target</replaceable> <optional>AS <replaceable>
>    <para>
>     The <replaceable>connection-name</replaceable> is used to handle
>     multiple connections in one program.  It can be omitted if a
> -   program uses only one connection.  The most recently opened
> -   connection becomes the current connection, which is used by default
> -   when an SQL statement is to be executed (see later in this
> -   chapter).
> +   program uses only one connection.
> +  </para>
> +
> +  <para>
> +   The most recently opened connection becomes the current connection.

This is true.  Disconnecting also changes the current connection, to another
one of the open connections.  Please document that somewhere appropriate.

> +   The current connection is managed in a per-thread manner.  For the
> +   first use on a thread, it is the most recently opened one in the
> +   process.  The current connection is used by default when an SQL
> +   statement is to be executed (see later in this chapter).
>    </para>
>  
>    <para>
> @@ -276,8 +281,8 @@ EXEC SQL CONNECT TO :target USER :user USING :passwd;
>  
>    <para>
>     SQL statements in embedded SQL programs are by default executed on
> -   the current connection, that is, the most recently opened one.  If
> -   an application needs to manage multiple connections, then there are
> +   the current connection, that is, the most recently opened one in each thread.

Likewise, this is not accurate if a disconnect of the current connection
happened after the most recent open.  You could simply delete this claim,
which would avoid defining the current connection in two places.

> +   If an application needs to manage multiple connections, then there are
>     three ways to handle this.
>    </para>



Hi.

Thank you for the comment.
I attached the revised patch v3.

> On Fri, Mar 04, 2022 at 11:49:39AM +0000, egashira.yusuke@fujitsu.com wrote:
> > > It looks like too verbose.  Couldn't it be summarized as something
> > > like this?
> > >
> > > > Current connection is managed in a per-thread manner. For the first
> > > > use on a thread, it is the most recently used one in the process.
>
> There's no behavior specific to "the first use on a thread", and "most
> recently used" is not a factor.  Avoid those phrases.  (In a thread that has
> never opened a connection, each command that retrieves the current connection
> will retrieve the non-thread-specific current connection.)
>
> > Yes.
> > I thought that I needed a detailed explanation of the current connection for the note below, but this certainly
seemsto be summable. 
>
> For this particular paragraph, I liked your v1's level of verbosity more than
> this compact version.

I have reinstated v1's explanation of this paragraph because the current connection is relevant to the subsequent note.


> > I wanted to make a document that could avoid the use cases that I reported, so I tried to write a note according to
theoccurrence conditions. 
> > However, this was also too verbose, so I rewrote the note as you suggested.
> > And since there is no mention of the current connection, I changed the section to add it to "36.2.3. Closing a
Connection".
> >
> > I think I might be better to add "In the worst case, this may cause an application crash." to this note so that
userscan avoid the problem more strongly. 
> > However, this sentence can also be read as claiming that there is an implementation problem.
> > Should not I add this sentence?
>
> It's okay to admit implementation problems.  I might write "Behavior is
> undefined if the current thread is not the thread that opened the connection."
> Referring to undefined behavior is a customary way for API docs to imply that
> a crash is possible.  If one wanted to be even more precise, "Behavior is
> undefined if the connection is the thread-specific current connection of any
> other thread."  However, the more-precise wording is harder to understand.
> Also, being more precise now could limit our freedom to change behavior later.

The most important thing I want to explain in the documentation is that the current connection should not be closed
fromanother thread. 
So I just added that to the DISCONNECT Note.


> > --- a/doc/src/sgml/ecpg.sgml
> > +++ b/doc/src/sgml/ecpg.sgml
> > @@ -222,10 +222,15 @@ EXEC SQL CONNECT TO <replaceable>target</replaceable> <optional>AS <replaceable>
> >    <para>
> >     The <replaceable>connection-name</replaceable> is used to handle
> >     multiple connections in one program.  It can be omitted if a
> > -   program uses only one connection.  The most recently opened
> > -   connection becomes the current connection, which is used by default
> > -   when an SQL statement is to be executed (see later in this
> > -   chapter).
> > +   program uses only one connection.
> > +  </para>
> > +
> > +  <para>
> > +   The most recently opened connection becomes the current connection.
>
> This is true.  Disconnecting also changes the current connection, to another
> one of the open connections.  Please document that somewhere appropriate.

I have added that behavior to the "Closing a Connection" section.


> > @@ -276,8 +281,8 @@ EXEC SQL CONNECT TO :target USER :user USING :passwd;
> >
> >    <para>
> >     SQL statements in embedded SQL programs are by default executed on
> > -   the current connection, that is, the most recently opened one.  If
> > -   an application needs to manage multiple connections, then there are
> > +   the current connection, that is, the most recently opened one in each thread.
>
> Likewise, this is not accurate if a disconnect of the current connection
> happened after the most recent open.  You could simply delete this claim,
> which would avoid defining the current connection in two places.

As I explained the current connection in detail before this section, I think that there is no need to supplement the
definitionof the current connection. 
Therefore, I removed the second half of this statement.


Regards,
Yusuke Egashira

Attachment
On Fri, Mar 25, 2022 at 01:34:16PM +0000, egashira.yusuke@fujitsu.com wrote:
> > On Fri, Mar 04, 2022 at 11:49:39AM +0000, egashira.yusuke@fujitsu.com wrote:
> > > @@ -222,10 +222,15 @@ EXEC SQL CONNECT TO <replaceable>target</replaceable> <optional>AS <replaceable>
> > >    <para>
> > >     The <replaceable>connection-name</replaceable> is used to handle
> > >     multiple connections in one program.  It can be omitted if a
> > > -   program uses only one connection.  The most recently opened
> > > -   connection becomes the current connection, which is used by default
> > > -   when an SQL statement is to be executed (see later in this
> > > -   chapter).
> > > +   program uses only one connection.
> > > +  </para>
> > > +
> > > +  <para>
> > > +   The most recently opened connection becomes the current connection.
> > 
> > This is true.  Disconnecting also changes the current connection, to another
> > one of the open connections.  Please document that somewhere appropriate.
> 
> I have added that behavior to the "Closing a Connection" section.

I intended to ask you to document it such that the reader can predict which
connection becomes the current connection.  Just saying "another" doesn't give
the reader enough information to rely on any particular behavior.  Having said
that, I've changed my mind about documenting this.  The current behavior is
hard to use, particularly in a multithreaded program.  Also, we might want to
change it later.  Moreover, I think it's intuitive that use of current
connection is undefined if one closes current connection and does not later
initialize some connection.  Let's try to leave this disconnect matter
undocumented after all.

> --- a/doc/src/sgml/ecpg.sgml
> +++ b/doc/src/sgml/ecpg.sgml
> @@ -222,10 +222,18 @@ EXEC SQL CONNECT TO <replaceable>target</replaceable> <optional>AS <replaceable>
>    <para>
>     The <replaceable>connection-name</replaceable> is used to handle
>     multiple connections in one program.  It can be omitted if a
> -   program uses only one connection.  The most recently opened
> -   connection becomes the current connection, which is used by default
> -   when an SQL statement is to be executed (see later in this
> -   chapter).
> +   program uses only one connection.
> +  </para>
> +
> +  <para>
> +   The most recently opened connection becomes the current connection.
> +   If your application uses multiple threads, each thread has its own
> +   current connection, and it is the most recently opened connection
> +   in each thread.  Threads that have not yet opened the current
> +   connection treat the most recently opened connection in the
> +   application as the current connection.  The current connection is
> +   used by default when an SQL statement is to be executed (see later
> +   in this chapter).
>    </para>

Disconnects can make this false.

> @@ -435,10 +442,24 @@ EXEC SQL DISCONNECT <optional><replaceable>connection</replaceable></optional>;
>     closed.
>    </para>
>  
> +  <para>
> +   Disconnecting changes the current connection to another opened
> +   connection if the closed connection was the current connection.
> +  </para>
> +
>    <para>
>     It is good style that an application always explicitly disconnect
>     from every connection it opened.
>    </para>
> +
> +  <note>
> +    <para>
> +     Connections are shared among all threads.  Be careful not to
> +     disconnect a connection that is to be used in another thread.

In your original example, I suspect the program author would not think of the
closed connection as "to be used in" Thread#1.  I'm not sure what to write
here, but I feel this isn't it.  We could probably omit the two sentences
without replacing them.

> +     In particular, the behavior is undefined if a connection treated as
> +     the current connection in one thread is closed by another thread.

Contrary to this text, it's okay to close the global current connection, so
long as that connection is not the thread-specific current connection of any
thread other than the current thread.  (This is so awkward.)  It is safe even
though the connection in question would be the current connection of every
thread not having initiated a connection.  The "undefined if the current
thread is not the thread that opened" wording I suggested in my last message
was also flawed, because SET CONNECTION or a disconnect can cause the
thread-specific current connection to become a connection opened in a
different thread.

I'm surprised to find how hard it is to document this without introducing a
false statement.