Thread: Dereference before NULL check (src/backend/storage/ipc/latch.c)

Dereference before NULL check (src/backend/storage/ipc/latch.c)

From
Ranier Vilela
Date:
Hi,

Per Coverity.

If test set->latch against NULL, is why it can be NULL.
ResetEvent can dereference NULL.

regards,
Ranier Vilela
Attachment

Re: Dereference before NULL check (src/backend/storage/ipc/latch.c)

From
Kyotaro Horiguchi
Date:
At Sat, 31 Oct 2020 11:40:53 -0300, Ranier Vilela <ranier.vf@gmail.com> wrote in 
> Hi,
> 
> Per Coverity.
> 
> If test set->latch against NULL, is why it can be NULL.
> ResetEvent can dereference NULL.

If the returned event is WL_LATCH_SET, set->latch cannot be NULL. We
shouldn't inadvertently ignore the unexpected or broken situation.We
could put Assert instead, but I think that we don't need do something
here at all since SIGSEGV would be raised at the right location.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Dereference before NULL check (src/backend/storage/ipc/latch.c)

From
Thomas Munro
Date:
On Mon, Nov 2, 2020 at 1:49 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
> At Sat, 31 Oct 2020 11:40:53 -0300, Ranier Vilela <ranier.vf@gmail.com> wrote in
> > Per Coverity.
> >
> > If test set->latch against NULL, is why it can be NULL.
> > ResetEvent can dereference NULL.
>
> If the returned event is WL_LATCH_SET, set->latch cannot be NULL. We
> shouldn't inadvertently ignore the unexpected or broken situation.We
> could put Assert instead, but I think that we don't need do something
> here at all since SIGSEGV would be raised at the right location.

Hmm.  I changed that to support set->latch == NULL, so that you can
use the long lived WES in the rare code paths that call WaitLatch()
without a latch (for example the code I proposed at [1]).  The Windows
version leaves the event handle of the most recently used latch in
set->handles[n] (because AFAICS there is no way to have a "hole" in
the handles array).  The event can fire while you are waiting on "no
latch".  Perhaps it should be changed to
ResetEvent(set->handles[cur_event->pos + 1])?

[1]
https://www.postgresql.org/message-id/flat/CA%2BhUKGK1607VmtrDUHQXrsooU%3Dap4g4R2yaoByWOOA3m8xevUQ%40mail.gmail.com



Re: Dereference before NULL check (src/backend/storage/ipc/latch.c)

From
Kyotaro Horiguchi
Date:
At Mon, 2 Nov 2020 16:22:09 +1300, Thomas Munro <thomas.munro@gmail.com> wrote in 
> On Mon, Nov 2, 2020 at 1:49 PM Kyotaro Horiguchi
> <horikyota.ntt@gmail.com> wrote:
> > At Sat, 31 Oct 2020 11:40:53 -0300, Ranier Vilela <ranier.vf@gmail.com> wrote in
> > > Per Coverity.
> > >
> > > If test set->latch against NULL, is why it can be NULL.
> > > ResetEvent can dereference NULL.
> >
> > If the returned event is WL_LATCH_SET, set->latch cannot be NULL. We
> > shouldn't inadvertently ignore the unexpected or broken situation.We
> > could put Assert instead, but I think that we don't need do something
> > here at all since SIGSEGV would be raised at the right location.
> 
> Hmm.  I changed that to support set->latch == NULL, so that you can
> use the long lived WES in the rare code paths that call WaitLatch()
> without a latch (for example the code I proposed at [1]).  The Windows

Ooo.  We don't update epoll events in that case. Ok, I understand
WL_LATCH_SET can fire while set->latch == NULL.

(I was confused by WaitEventAdjust* asserts set->latch != NULL for
 WL_LATCH_SET. Shouldn't we move the check from ModifyWaitEvent() to
 WaitEventAdjust*()?))

> version leaves the event handle of the most recently used latch in
> set->handles[n] (because AFAICS there is no way to have a "hole" in
> the handles array).  The event can fire while you are waiting on "no
> latch".  Perhaps it should be changed to
> ResetEvent(set->handles[cur_event->pos + 1])?
> 
> [1]
https://www.postgresql.org/message-id/flat/CA%2BhUKGK1607VmtrDUHQXrsooU%3Dap4g4R2yaoByWOOA3m8xevUQ%40mail.gmail.com

Seems right. Just doing that *seems* to work fine, but somehow I
cannot build on Windows for now...

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c
index 24d44c982d..318261962e 100644
--- a/src/backend/storage/ipc/latch.c
+++ b/src/backend/storage/ipc/latch.c
@@ -1835,7 +1835,11 @@ WaitEventSetWaitBlock(WaitEventSet *set, int cur_timeout,
 
     if (cur_event->events == WL_LATCH_SET)
     {
-        if (!ResetEvent(set->latch->event))
+        /*
+         * We cannot use set->latch->event to reset the fired event if we
+         * actually aren't waiting on this latch now.
+         */
+        if (!ResetEvent(set->handles[cur_event->pos + 1]))
             elog(ERROR, "ResetEvent failed: error code %lu", GetLastError());
 
         if (set->latch && set->latch->is_set)

Re: Dereference before NULL check (src/backend/storage/ipc/latch.c)

From
Kyotaro Horiguchi
Date:
At Mon, 02 Nov 2020 14:33:40 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> At Mon, 2 Nov 2020 16:22:09 +1300, Thomas Munro <thomas.munro@gmail.com> wrote in 
> > On Mon, Nov 2, 2020 at 1:49 PM Kyotaro Horiguchi
> > <horikyota.ntt@gmail.com> wrote:
> > > At Sat, 31 Oct 2020 11:40:53 -0300, Ranier Vilela <ranier.vf@gmail.com> wrote in
> > > > Per Coverity.
> > > >
> > > > If test set->latch against NULL, is why it can be NULL.
> > > > ResetEvent can dereference NULL.
> > >
> > > If the returned event is WL_LATCH_SET, set->latch cannot be NULL. We
> > > shouldn't inadvertently ignore the unexpected or broken situation.We
> > > could put Assert instead, but I think that we don't need do something
> > > here at all since SIGSEGV would be raised at the right location.
> > 
> > Hmm.  I changed that to support set->latch == NULL, so that you can
> > use the long lived WES in the rare code paths that call WaitLatch()
> > without a latch (for example the code I proposed at [1]).  The Windows
> 
> Ooo.  We don't update epoll events in that case. Ok, I understand
> WL_LATCH_SET can fire while set->latch == NULL.
> 
> (I was confused by WaitEventAdjust* asserts set->latch != NULL for
>  WL_LATCH_SET. Isn't it better if we moved the check on latch from
>  ModifyWaitEvent() to WaitEventAdjust*()?))
> 
> > version leaves the event handle of the most recently used latch in
> > set->handles[n] (because AFAICS there is no way to have a "hole" in
> > the handles array).  The event can fire while you are waiting on "no
> > latch".  Perhaps it should be changed to
> > ResetEvent(set->handles[cur_event->pos + 1])?
> > 
> > [1]
https://www.postgresql.org/message-id/flat/CA%2BhUKGK1607VmtrDUHQXrsooU%3Dap4g4R2yaoByWOOA3m8xevUQ%40mail.gmail.com
> 
> Seems right. Just doing that *seems* to work fine, but somehow I
> cannot build on Windows for now...

That was caused by a leftover change on config_default.pl I made when
I tried to enable NLS.

I called SetLatch() during WaitLatch(NULL, ) but that doesn't fire
WL_LATCH_SET event for me on Windows. (I got it fired on Linux..)  On
Windows, the latch is detected after exiting the WaitLatch()
call. Seems like MyLatch of waiter is different from
peerPGPROC->procLatch.  And... an update for Visual Studio broke my
environment... I will investigate this further but everything feel
cumbersome on Windows...

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Dereference before NULL check (src/backend/storage/ipc/latch.c)

From
Ranier Vilela
Date:
Em seg., 2 de nov. de 2020 às 05:25, Kyotaro Horiguchi <horikyota.ntt@gmail.com> escreveu:
At Mon, 02 Nov 2020 14:33:40 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
> At Mon, 2 Nov 2020 16:22:09 +1300, Thomas Munro <thomas.munro@gmail.com> wrote in
> > On Mon, Nov 2, 2020 at 1:49 PM Kyotaro Horiguchi
> > <horikyota.ntt@gmail.com> wrote:
> > > At Sat, 31 Oct 2020 11:40:53 -0300, Ranier Vilela <ranier.vf@gmail.com> wrote in
> > > > Per Coverity.
> > > >
> > > > If test set->latch against NULL, is why it can be NULL.
> > > > ResetEvent can dereference NULL.
> > >
> > > If the returned event is WL_LATCH_SET, set->latch cannot be NULL. We
> > > shouldn't inadvertently ignore the unexpected or broken situation.We
> > > could put Assert instead, but I think that we don't need do something
> > > here at all since SIGSEGV would be raised at the right location.
> >
> > Hmm.  I changed that to support set->latch == NULL, so that you can
> > use the long lived WES in the rare code paths that call WaitLatch()
> > without a latch (for example the code I proposed at [1]).  The Windows
>
> Ooo.  We don't update epoll events in that case. Ok, I understand
> WL_LATCH_SET can fire while set->latch == NULL.
>
> (I was confused by WaitEventAdjust* asserts set->latch != NULL for
>  WL_LATCH_SET. Isn't it better if we moved the check on latch from
>  ModifyWaitEvent() to WaitEventAdjust*()?))
>
> > version leaves the event handle of the most recently used latch in
> > set->handles[n] (because AFAICS there is no way to have a "hole" in
> > the handles array).  The event can fire while you are waiting on "no
> > latch".  Perhaps it should be changed to
> > ResetEvent(set->handles[cur_event->pos + 1])?
> >
> > [1] https://www.postgresql.org/message-id/flat/CA%2BhUKGK1607VmtrDUHQXrsooU%3Dap4g4R2yaoByWOOA3m8xevUQ%40mail.gmail.com
>
> Seems right. Just doing that *seems* to work fine, but somehow I
> cannot build on Windows for now...

That was caused by a leftover change on config_default.pl I made when
I tried to enable NLS.

I called SetLatch() during WaitLatch(NULL, ) but that doesn't fire
WL_LATCH_SET event for me on Windows. (I got it fired on Linux..)  On
Windows, the latch is detected after exiting the WaitLatch()
call. Seems like MyLatch of waiter is different from
peerPGPROC->procLatch.  And... an update for Visual Studio broke my
environment... I will investigate this further but everything feel
cumbersome on Windows...
I can build.
vc_regress is it enough to test it?

regards,
Ranier Vilela

Re: Dereference before NULL check (src/backend/storage/ipc/latch.c)

From
Kyotaro Horiguchi
Date:
At Mon, 02 Nov 2020 17:25:04 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> I called SetLatch() during WaitLatch(NULL, ) but that doesn't fire
> WL_LATCH_SET event for me on Windows. (I got it fired on Linux..)  On
> Windows, the latch is detected after exiting the WaitLatch()
> call. Seems like MyLatch of waiter is different from
> peerPGPROC->procLatch.  And... an update for Visual Studio broke my
> environment... I will investigate this further but everything feel
> cumbersome on Windows...

I managed to reproduce the issue. FWIW the attached modifies
pg_backend_pid() to call "WaitLatch(NULL," and
pg_terminate_backend(pid) to SetLatch() to the process latch of the
pid.  (It's minunderstanding that I could reproduce this on Linux.)

Session A:
  =# select pg_backend_pid();  -- sleeps for 10 seconds.

Session B:
  =# select pg_terminate_backend(A-pid);

[11628] LOG:  server process (PID 14568) was terminated by exception 0xC0000005
[11628] DETAIL:  Failed process was running: select pg_backend_pid();
[11628] HINT:  See C include file "ntstatus.h" for a description of the hexadecimal value.
[11628] LOG:  terminating any other active server processes
[2948] WARNING:  terminating connection because of crash of another server process
2

With the fix patch, it changes to:

[16632] LOG:  FALSE LATCH: 0000000000000000

rebards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c
index 24d44c98..e900bd3c 100644
--- a/src/backend/storage/ipc/latch.c
+++ b/src/backend/storage/ipc/latch.c
@@ -1845,6 +1845,8 @@ WaitEventSetWaitBlock(WaitEventSet *set, int cur_timeout,
             occurred_events++;
             returned_events++;
         }
+        else if (!set->latch)
+          elog(LOG, "FALSE LATCH: %p", set->latch);
     }
     else if (cur_event->events == WL_POSTMASTER_DEATH)
     {
diff --git a/src/backend/storage/ipc/signalfuncs.c b/src/backend/storage/ipc/signalfuncs.c
index d822e82c..b7340bf4 100644
--- a/src/backend/storage/ipc/signalfuncs.c
+++ b/src/backend/storage/ipc/signalfuncs.c
@@ -134,7 +134,13 @@ pg_cancel_backend(PG_FUNCTION_ARGS)
 Datum
 pg_terminate_backend(PG_FUNCTION_ARGS)
 {
-    int            r = pg_signal_backend(PG_GETARG_INT32(0), SIGTERM);
+    int r = 0;
+    PGPROC *p = BackendPidGetProc(PG_GETARG_INT32(0));
+    
+    if (p)
+        SetLatch(&p->procLatch);
+
+    PG_RETURN_BOOL(false);
 
     if (r == SIGNAL_BACKEND_NOSUPERUSER)
         ereport(ERROR,
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index a210fc93..1e2b97a9 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -960,6 +960,9 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)
 Datum
 pg_backend_pid(PG_FUNCTION_ARGS)
 {
+    WaitLatch(MyLatch, WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH, 1, WAIT_EVENT_RECOVERY_PAUSE);
+    WaitLatch(NULL, WL_TIMEOUT | WL_EXIT_ON_PM_DEATH, 10000, WAIT_EVENT_RECOVERY_PAUSE);
+
     PG_RETURN_INT32(MyProcPid);
 }


Re: Dereference before NULL check (src/backend/storage/ipc/latch.c)

From
Thomas Munro
Date:
On Tue, Nov 3, 2020 at 12:50 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
> With the fix patch, it changes to:
>
> [16632] LOG:  FALSE LATCH: 0000000000000000

Nice repo.  But is it OK to not reset the Win32 event in this case?
Does it still work correctly if you wait on the latch after that
happened, and perhaps after the PG latch is reset?



Re: Dereference before NULL check (src/backend/storage/ipc/latch.c)

From
Kyotaro Horiguchi
Date:
At Tue, 3 Nov 2020 20:44:23 +1300, Thomas Munro <thomas.munro@gmail.com> wrote in 
> On Tue, Nov 3, 2020 at 12:50 AM Kyotaro Horiguchi
> <horikyota.ntt@gmail.com> wrote:
> > With the fix patch, it changes to:
> >
> > [16632] LOG:  FALSE LATCH: 0000000000000000
> 
> Nice repo.  But is it OK to not reset the Win32 event in this case?
> Does it still work correctly if you wait on the latch after that
> happened, and perhaps after the PG latch is reset?

I'm not sure what is the point of the question, but the previous patch
doesn't omit resetting the Win32-event in that case.  In the same way
with other implements of the same function, it resets the trigger that
woke up the process since the trigger is no longer needed even if we
are not waiting on it.

If we call WaitLatch(OrSocket) that waits on the latch, it immediately
returns because the latch is set. If we called ResetLatch before the
next call to WaitLatch(), it correctly waits on a trigger to be
pulled.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Dereference before NULL check (src/backend/storage/ipc/latch.c)

From
Ranier Vilela
Date:
Em ter., 3 de nov. de 2020 às 22:09, Kyotaro Horiguchi <horikyota.ntt@gmail.com> escreveu:
At Tue, 3 Nov 2020 20:44:23 +1300, Thomas Munro <thomas.munro@gmail.com> wrote in
> On Tue, Nov 3, 2020 at 12:50 AM Kyotaro Horiguchi
> <horikyota.ntt@gmail.com> wrote:
> > With the fix patch, it changes to:
> >
> > [16632] LOG:  FALSE LATCH: 0000000000000000
>
> Nice repo.  But is it OK to not reset the Win32 event in this case?
> Does it still work correctly if you wait on the latch after that
> happened, and perhaps after the PG latch is reset?

I'm not sure what is the point of the question, but the previous patch
doesn't omit resetting the Win32-event in that case.  In the same way
with other implements of the same function, it resets the trigger that
woke up the process since the trigger is no longer needed even if we
are not waiting on it.

If we call WaitLatch(OrSocket) that waits on the latch, it immediately
returns because the latch is set. If we called ResetLatch before the
next call to WaitLatch(), it correctly waits on a trigger to be
pulled.
+1
The patch for me is syntactically equal to the code changed and
avoids the dereference.

regards,
Ranier Vilela

Re: Dereference before NULL check (src/backend/storage/ipc/latch.c)

From
Thomas Munro
Date:
On Thu, Nov 5, 2020 at 10:47 AM Ranier Vilela <ranier.vf@gmail.com> wrote:
> Em ter., 3 de nov. de 2020 às 22:09, Kyotaro Horiguchi <horikyota.ntt@gmail.com> escreveu:
>> If we call WaitLatch(OrSocket) that waits on the latch, it immediately
>> returns because the latch is set. If we called ResetLatch before the
>> next call to WaitLatch(), it correctly waits on a trigger to be
>> pulled.
>
> +1
> The patch for me is syntactically equal to the code changed and
> avoids the dereference.

Thanks!  Pushed.