Thread: psql's \watch is broken

psql's \watch is broken

From
Jeff Janes
Date:
If I do something like this:

explain (analyze) select * from pgbench_accounts \watch 1

It behaves as expected.  But once I break out of the loop with ctrl-C, then if I execute the same thing again it executes the command once, but shows no output and doesn't loop.  It seems like some flag is getting set with ctrl-C, but then never gets reset.

It was broken in this commit:

commit a4fd3aa719e8f97299dfcf1a8f79b3017e2b8d8b
Author: Michael Paquier <michael@paquier.xyz>
Date:   Mon Dec 2 11:18:56 2019 +0900

    Refactor query cancellation code into src/fe_utils/
 

I've not dug into code itself, I just bisected it.

Cheers,

Jeff

Re: psql's \watch is broken

From
Fabien COELHO
Date:
> explain (analyze) select * from pgbench_accounts \watch 1
>
> It behaves as expected.  But once I break out of the loop with ctrl-C, then
> if I execute the same thing again it executes the command once, but shows
> no output and doesn't loop.  It seems like some flag is getting set with
> ctrl-C, but then never gets reset.
>
> It was broken in this commit:
>
> commit a4fd3aa719e8f97299dfcf1a8f79b3017e2b8d8b
> Author: Michael Paquier <michael@paquier.xyz>
> Date:   Mon Dec 2 11:18:56 2019 +0900
>
>    Refactor query cancellation code into src/fe_utils/
>
> I've not dug into code itself, I just bisected it.

Thanks for the report. I'll look into it.

-- 
Fabien.



Re: psql's \watch is broken

From
Michael Paquier
Date:
On Sat, Dec 14, 2019 at 12:09:51AM +0100, Fabien COELHO wrote:
>
>> explain (analyze) select * from pgbench_accounts \watch 1
>>
>> It behaves as expected.  But once I break out of the loop with ctrl-C, then
>> if I execute the same thing again it executes the command once, but shows
>> no output and doesn't loop.  It seems like some flag is getting set with
>> ctrl-C, but then never gets reset.
>>
>>
>> I've not dug into code itself, I just bisected it.
>
> Thanks for the report. I'll look into it.

Looked at it already.   And yes, I can see the difference.  This comes
from the switch from cancel_pressed to CancelRequested in psql,
especially PSQLexecWatch() in this case.  And actually, now that I
look at it, I think that we should simply get rid of cancel_pressed in
psql completely and replace it with CancelRequested.  This also
removes the need of having cancel_pressed defined in print.c, which
was not really wanted originally.  Attached is a patch which addresses
the issue for me, and cleans up the code while on it.  Fabien, Jeff,
can you confirm please?
--
Michael

Attachment

Re: psql's \watch is broken

From
Jeff Janes
Date:


On Fri, Dec 13, 2019 at 9:45 PM Michael Paquier <michael@paquier.xyz> wrote:
On Sat, Dec 14, 2019 at 12:09:51AM +0100, Fabien COELHO wrote:
>
>> explain (analyze) select * from pgbench_accounts \watch 1
>>
>> It behaves as expected.  But once I break out of the loop with ctrl-C, then
>> if I execute the same thing again it executes the command once, but shows
>> no output and doesn't loop.  It seems like some flag is getting set with
>> ctrl-C, but then never gets reset.
>>
>>
>> I've not dug into code itself, I just bisected it.
>
> Thanks for the report. I'll look into it.

Looked at it already.   And yes, I can see the difference.  This comes
from the switch from cancel_pressed to CancelRequested in psql,
especially PSQLexecWatch() in this case.  And actually, now that I
look at it, I think that we should simply get rid of cancel_pressed in
psql completely and replace it with CancelRequested.  This also
removes the need of having cancel_pressed defined in print.c, which
was not really wanted originally.  Attached is a patch which addresses
the issue for me, and cleans up the code while on it.  Fabien, Jeff,
can you confirm please?
--
Michael

This works for me.
 
Thanks,

Jeff

Re: psql's \watch is broken

From
Fabien COELHO
Date:
>>> I've not dug into code itself, I just bisected it.
>>
>> Thanks for the report. I'll look into it.
>
> Looked at it already.

Ah, the magic of timezones!

> And yes, I can see the difference.  This comes from the switch from 
> cancel_pressed to CancelRequested in psql, especially PSQLexecWatch() in 
> this case.  And actually, now that I look at it, I think that we should 
> simply get rid of cancel_pressed in psql completely and replace it with 
> CancelRequested.  This also removes the need of having cancel_pressed 
> defined in print.c, which was not really wanted originally.  Attached is 
> a patch which addresses the issue for me, and cleans up the code while
> on it.  Fabien, Jeff, can you confirm please?

Yep. Patch applies cleanly, compiles, works for me as well.

-- 
Fabien.



Re: psql's \watch is broken

From
Tom Lane
Date:
Michael Paquier <michael@paquier.xyz> writes:
> Looked at it already.   And yes, I can see the difference.  This comes
> from the switch from cancel_pressed to CancelRequested in psql,
> especially PSQLexecWatch() in this case.  And actually, now that I
> look at it, I think that we should simply get rid of cancel_pressed in
> psql completely and replace it with CancelRequested.  This also
> removes the need of having cancel_pressed defined in print.c, which
> was not really wanted originally.  Attached is a patch which addresses
> the issue for me, and cleans up the code while on it.  Fabien, Jeff,
> can you confirm please?

Given the rather small number of existing uses of CancelRequested,
I wonder if it wouldn't be a better idea to rename it to cancel_pressed?

Also, perhaps I am missing something, but I do not see anyplace in the
current code base that ever *clears* CancelRequested.  How much has
this code been tested?  Is it really sane to remove the setting of that
flag from psql_cancel_callback, as this patch does?  Is it sane that
CancelRequested isn't declared volatile?

            regards, tom lane



Re: psql's \watch is broken

From
Fabien COELHO
Date:
Hello Tom,

My 0.02 €:

> Given the rather small number of existing uses of CancelRequested,
> I wonder if it wouldn't be a better idea to rename it to cancel_pressed?

I prefer the former because it is more functional (a cancellation has been 
requested, however the mean to do so) while "pressed" rather suggest a 
particular operation.

> Also, perhaps I am missing something, but I do not see anyplace in the
> current code base that ever *clears* CancelRequested.

This was already like that in the initial version before the refactoring.

  ./src/bin/scripts/common.h:extern bool CancelRequested;
  ./src/bin/scripts/common.c:bool         CancelRequested = false;
  ./src/bin/scripts/common.c:                     CancelRequested = true;
  ./src/bin/scripts/common.c:             CancelRequested = true;
  ./src/bin/scripts/common.c:                             CancelRequested = true;
  ./src/bin/scripts/common.c:                     CancelRequested = true;
  ./src/bin/scripts/vacuumdb.c:           if (CancelRequested)
  ./src/bin/scripts/vacuumdb.c:   if (CancelRequested)
  ./src/bin/scripts/vacuumdb.c:           if (i < 0 || CancelRequested)

However "cancel_request" resets are in "psql/mainloop.c", and are changed 
to "CancelRequest = false" resets by Michaël patch, so all seems fine.

> How much has this code been tested?  Is it really sane to remove the 
> setting of that flag from psql_cancel_callback, as this patch does?

ISTM that the callback is called from a function which sets CancelRequest?

>  Is it sane that CancelRequested isn't declared volatile?

I agree that it would seem appropriate, but the initial version I 
refactored was not declaring CancelRequested as volatile, so I did not 
change that. However, "cancel_pressed" is volatile, so merging the two
would indeed suggest to declare it as volatile.

-- 
Fabien.

Re: psql's \watch is broken

From
Michael Paquier
Date:
On Sun, Dec 15, 2019 at 10:35:54PM +0100, Fabien COELHO wrote:
>> Also, perhaps I am missing something, but I do not see anyplace in the
>> current code base that ever *clears* CancelRequested.

For bin/scripts/, that's not really a problem, because all code paths
triggering a cancellation, aka in vacuumdb and reindexdb, exit
immediately.

>> How much has this code been tested?

Sorry about that, not enough visibly :(

>> Is it really sane to remove the setting of that flag from
>> psql_cancel_callback, as this patch does?
>
> ISTM that the callback is called from a function which sets CancelRequest?

Hmm, that's not right.  Note that there is a subtle difference between
psql and bin/scripts/.  In the case of the scripts, originally
CancelRequested tracks if a cancellation request has been sent to the
backend or not.  Hence, if the client has not called SetCancelConn()
to set up cancelConn, then CancelRequested is switched to true all the
time.  Now, if cancelConn is set, but a cancellation request has not
correctly completed, then CancelRequested never set to true.

In the case of psql, the original code sets cancel_pressed all the
time, even if a cancellation request has been done and that it failed,
and did not care if cancelConn was set or not.  So, the intention of
psql is to always track when a cancellation attempt is done, even if
it has failed to issue it, while for all our other frontends we want
to make sure that a cancellation attempt is done, and that the
cancellation has succeeded before looping out and exit.

>>  Is it sane that CancelRequested isn't declared volatile?
>
> I agree that it would seem appropriate, but the initial version I refactored
> was not declaring CancelRequested as volatile, so I did not change that.
> However, "cancel_pressed" is volatile, so merging the two
> would indeed suggest to declare it as volatile.

Actually, it seems to me that both suggestions are not completely
right either about that stuff since the flag has been introduced in
bin/scripts/ in a1792320, no?  The way to handle such variables safely
in a signal handler it to mark them as volatile and sig_atomic_t.  The
same can be said about the older cancel_pressed as of 718bb2c in psql.
So fixed all that while on it.

As the concepts behind cancel_pressed and CancelRequested are
different, we need to keep cancel_pressed and make psql use it.  And
the callback used for WIN32 also needs to set the flag. I also think
that we should do a better effort in documenting CancelRequested
properly in cancel.c.  All that should be fixed as of the attached,
tested on Linux and from a Windows console.  From a point of view of
consistency, this actually brings back the code of psql to the same
state as it was before a4fd3aa, except that we still have the
refactored pieces.

Thoughts?
--
Michael

Attachment

Re: psql's \watch is broken

From
Michael Paquier
Date:
On Mon, Dec 16, 2019 at 11:40:07AM +0900, Michael Paquier wrote:
> As the concepts behind cancel_pressed and CancelRequested are
> different, we need to keep cancel_pressed and make psql use it.  And
> the callback used for WIN32 also needs to set the flag. I also think
> that we should do a better effort in documenting CancelRequested
> properly in cancel.c.  All that should be fixed as of the attached,
> tested on Linux and from a Windows console.  From a point of view of
> consistency, this actually brings back the code of psql to the same
> state as it was before a4fd3aa, except that we still have the
> refactored pieces.

Merging both flags can actually prove to be tricky, as we have some
code paths involving --single-step where psql visibly assumes that a
cancellation pressed does not necessarily imply one that succeeds is
there is a cancellation object around (ExecQueryTuples, tuple printing
and \copy).  So I have fixed the issue by making the code of psql
consistent with what we had before a4fd3aa.  I think that it should be
actually possible to merge CancelRequested and cancel_pressed while
keeping the user-visible changes acceptable, and this requires a very
careful lookup.
--
Michael

Attachment