Re: pgbnech: allow to cancel queries during benchmark - Mailing list pgsql-hackers
From | Yugo NAGATA |
---|---|
Subject | Re: pgbnech: allow to cancel queries during benchmark |
Date | |
Msg-id | 20230810122311.1ff225f932fc032c3af95475@sraoss.co.jp Whole thread Raw |
In response to | Re: pgbnech: allow to cancel queries during benchmark (Fabien COELHO <coelho@cri.ensmp.fr>) |
List | pgsql-hackers |
Hello Fabien, On Wed, 9 Aug 2023 11:06:24 +0200 (CEST) Fabien COELHO <coelho@cri.ensmp.fr> wrote: > > Hello Yugo-san, > > Some feedback about v2. > > There is some dead code (&& false) which should be removed. I forgot to remove the debug code. I'll remove it. > >>> In multi-threads, only one thread can catches the signal and other threads > >>> continue to run. > > Yep. This why I see plenty uncontrolled race conditions if thread 0 > cancels clients which are managed in parallel by other threads and may be > active. I'm not really motivated to delve into libpq internals to check > whether there are possibly bad issues or not, but if two threads write > message at the same time in the same socket, I assume that this can be > bad if you are unlucky. > > ISTM that the rational convention should be that each thread cancels its > own clients, which ensures that there is no bad interaction between > threads. Actually, thread #0 and other threads never write message at the same time in the same socket. When thread #0 sends cancel requests, they are not sent to sockets that other threads are reading or writing. Rather, new another socket for cancel is created for each client, and the backend PID and cancel request are sent to the socket. PostgreSQL establishes a new connection for the cancel request, and sent a cancel signal to the specified backend. Therefore, thread #0 and other threads don't access any resources in the same time except to CancelRequested. Is still there any concern about race condition? > >>> Therefore, if Ctrl+C is pressed while threads are waiting > >>> responses from the backend in wait_on_socket_set, only one thread can be > >>> interrupted and return, but other threads will continue to wait and cannot > >>> check CancelRequested. > > >>> So, for implementing your suggestion, we need any hack > >>> to make all threads return from wait_on_socket_set when the event occurs, but > >>> I don't have idea to do this in simpler way. > > >>> In my patch, all threads can return from wait_on_socket_set at Ctrl+C > >>> because when thread #0 cancels all connections, the following error is > >>> sent to all sessions: > >>> > >>> ERROR: canceling statement due to user request > >>> > >>> and all threads will receive the response from the backend. > > Hmmm. > > I understand that the underlying issue you are raising is that other > threads may be stuck while waiting on socket events and that with your > approach they will be cleared somehow by socket 0. > > I'll say that (1) this point does not address potential race condition > issues with several thread interacting directly on the same client ; > (2) thread 0 may also be stuck waiting for events so the cancel is only > taken into account when it is woken up. I answered to (1) the consern about race condition above. And, as to (2), the SIGINT signal is handle only in thread #0 because it is blocked in other threads. So, when SIGINT is delivered, thread #0 will be interrupted and woken up immediately from waiting on socket, returning EINTR. Therefore, thread #0 would not be stuck. Regards, Yugo Nagata > If we accept that each thread cancels its clients when it is woken up, > which may imply some (usually small) delay, then it is not so different > from the current version because the code must wait for 0 to wake up > anyway, and it solves (1). The current version does not address potential > thread interactions. > > -- > Fabien. -- Yugo NAGATA <nagata@sraoss.co.jp>
pgsql-hackers by date: