Thread: Re: Add “FOR UPDATE NOWAIT” lock details to the log.

Re: Add “FOR UPDATE NOWAIT” lock details to the log.

From
Fujii Masao
Date:

On 2024/09/13 20:49, Seino Yuki wrote:
> Hello,
> 
> I would like to add the information of the PID that caused the failure
> when acquiring a lock with "FOR UPDATE NOWAIT".
> 
> When "FOR UPDATE" is executed and interrupted by lock_timeout,
> xid and PID are output in the logs,

Could you explain how to reproduce this? In my quick test,
lock waits canceled by lock_timeout didn’t report either xid or PID,
so I might be missing something.


> but in the case of "FOR UPDATE NOWAIT",
> no information is output, making it impossible to identify the cause of the lock failure.
> Therefore, I would like to output information in the logs in the same way as
> when "FOR UPDATE" is executed and interrupted by lock_timeout.

I think NOWAIT is often used for lock waits that can fail frequently.
Logging detailed information for each failed NOWAIT lock wait could
lead to excessive and potentially noisy log messages for some users.

On the other hand, I understand that even with NOWAIT, we might want
to investigate why a lock wait failed. In such cases,
detailed information about the locking process would be useful.

Considering both points, I’m leaning toward adding a new GUC parameter
to control whether detailed logs should be generated for failed
NOWAIT locks (and possibly for those canceled by lock_timeout).
Alternatively, if adding a new GUC is not ideal, we could extend
log_lock_waits to accept a new value like 'error,' which would trigger
detailed logging when a lock wait fails due to NOWAIT, lock_timeout,
or cancellation.


> The patch is attached as well.

I got the following compiler errors;

proc.c:1240:3: error: call to undeclared function 'CollectLockHoldersAndWaiters'; ISO C99 and later do not support
implicitfunction declarations [-Wimplicit-function-declaration]
 
  1240 |                 CollectLockHoldersAndWaiters(proclock, lock, &lock_holders_sbuf, &lock_waiters_sbuf,
&lockHoldersNum);
       |                 ^
proc.c:1549:4: error: call to undeclared function 'CollectLockHoldersAndWaiters'; ISO C99 and later do not support
implicitfunction declarations [-Wimplicit-function-declaration]
 
  1549 |                         CollectLockHoldersAndWaiters(NULL, lock, &lock_holders_sbuf, &lock_waiters_sbuf,
&lockHoldersNum);
       |                         ^
proc.c:1126:8: warning: unused variable 'first_holder' [-Wunused-variable]
  1126 |         bool            first_holder = true,
       |                         ^~~~~~~~~~~~
proc.c:1127:5: warning: unused variable 'first_waiter' [-Wunused-variable]
  1127 |                                 first_waiter = true;
       |                                 ^~~~~~~~~~~~
proc.c:1982:1: error: conflicting types for 'CollectLockHoldersAndWaiters'
  1982 | CollectLockHoldersAndWaiters(PROCLOCK *waitProcLock, LOCK *lock, StringInfo lock_holders_sbuf, StringInfo
lock_waiters_sbuf,int *lockHoldersNum)
 
       | ^
proc.c:1240:3: note: previous implicit declaration is here
  1240 |                 CollectLockHoldersAndWaiters(proclock, lock, &lock_holders_sbuf, &lock_waiters_sbuf,
&lockHoldersNum);
       |                 ^
2 warnings and 3 errors generated.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Add “FOR UPDATE NOWAIT” lock details to the log.

From
Seino Yuki
Date:
Thank you for Review.

> Could you explain how to reproduce this? In my quick test,
> lock waits canceled by lock_timeout didn’t report either xid or PID,
> so I might be missing something.
It seems to be outputting on my end, how about you?
===== Console =====
postgres=# SHOW log_lock_waits;
  log_lock_waits
----------------
  on
(1 row)

postgres=# SHOW lock_timeout;
  lock_timeout
--------------
  2s
(1 row)

(tx1) postgres=# BEGIN;
       BEGIN
       postgres=*# SELECT aid FROM pgbench_accounts WHERE aid = 1 FOR 
UPDATE;
        aid
       -----
          1
       (1 row)
(tx2) postgres=# SELECT aid FROM pgbench_accounts WHERE aid = 1 FOR 
UPDATE;
       ERROR:  canceling statement due to lock timeout
       CONTEXT:  while locking tuple (0,1) in relation "pgbench_accounts"

===== Log Output =====
# lock start
2024-10-17 17:49:58.157 JST [1443346] LOG:  process 1443346 still 
waiting for ShareLock on transaction 770 after 1000.096 ms
2024-10-17 17:49:58.157 JST [1443346] DETAIL:  Process holding the lock: 
1443241. Wait queue: 1443346.
2024-10-17 17:49:58.157 JST [1443346] CONTEXT:  while locking tuple 
(0,1) in relation "pgbench_accounts"
2024-10-17 17:49:58.157 JST [1443346] STATEMENT:  SELECT * FROM 
pgbench_accounts WHERE aid = 1 FOR UPDATE;
# lock timeout
2024-10-17 17:49:59.157 JST [1443346] ERROR:  canceling statement due to 
lock timeout
2024-10-17 17:49:59.157 JST [1443346] CONTEXT:  while locking tuple 
(0,1) in relation "pgbench_accounts"
2024-10-17 17:49:59.157 JST [1443346] STATEMENT:  SELECT * FROM 
pgbench_accounts WHERE aid = 1 FOR UPDATE;


> I think NOWAIT is often used for lock waits that can fail frequently.
> Logging detailed information for each failed NOWAIT lock wait could
> lead to excessive and potentially noisy log messages for some users.
> 
> On the other hand, I understand that even with NOWAIT, we might want
> to investigate why a lock wait failed. In such cases,
> detailed information about the locking process would be useful.
> 
> Considering both points, I’m leaning toward adding a new GUC parameter
> to control whether detailed logs should be generated for failed
> NOWAIT locks (and possibly for those canceled by lock_timeout).
> Alternatively, if adding a new GUC is not ideal, we could extend
> log_lock_waits to accept a new value like 'error,' which would trigger
> detailed logging when a lock wait fails due to NOWAIT, lock_timeout,
> or cancellation.
That's a good idea. I'll try that.

> I got the following compiler errors;
thank you. I missed it.

Regards,
--
Yuki Seino
NTT DATA CORPORATION



Re: Add “FOR UPDATE NOWAIT” lock details to the log.

From
Seino Yuki
Date:
>> Considering both points, I’m leaning toward adding a new GUC parameter
>> to control whether detailed logs should be generated for failed
>> NOWAIT locks (and possibly for those canceled by lock_timeout).
>> Alternatively, if adding a new GUC is not ideal, we could extend
>> log_lock_waits to accept a new value like 'error,' which would trigger
>> detailed logging when a lock wait fails due to NOWAIT, lock_timeout,
>> or cancellation.
> That's a good idea. I'll try that.

Sorry, I misunderstood.
The pid and xid are output during deadlock_timeout.

> LOG:  process 1443346 still waiting for ShareLock on transaction 770 
> after 1000.096 ms
> DETAIL:  Process holding the lock: 1443241. Wait queue: 1443346.
This log output is not triggered by lock_timeout or cancellation.
Therefore, it is difficult to support lock_timeout and cancellation.

I am thinking of supporting only NOWAIT. What do you think?

Regards,
--
Yuki Seino
NTT DATA CORPORATION



Re: Add “FOR UPDATE NOWAIT” lock details to the log.

From
Fujii Masao
Date:

On 2024/10/17 22:15, Seino Yuki wrote:
> 
>>> Considering both points, I’m leaning toward adding a new GUC parameter
>>> to control whether detailed logs should be generated for failed
>>> NOWAIT locks (and possibly for those canceled by lock_timeout).
>>> Alternatively, if adding a new GUC is not ideal, we could extend
>>> log_lock_waits to accept a new value like 'error,' which would trigger
>>> detailed logging when a lock wait fails due to NOWAIT, lock_timeout,
>>> or cancellation.
>> That's a good idea. I'll try that.
> 
> Sorry, I misunderstood.
> The pid and xid are output during deadlock_timeout.
> 
>> LOG:  process 1443346 still waiting for ShareLock on transaction 770 after 1000.096 ms
>> DETAIL:  Process holding the lock: 1443241. Wait queue: 1443346.
> This log output is not triggered by lock_timeout or cancellation.

Yes, these log messages with PID or XID are generated by log_lock_waits, not lock_timeout.


> Therefore, it is difficult to support lock_timeout and cancellation.
> 
> I am thinking of supporting only NOWAIT. What do you think?

I'm not sure why it's challenging to provide detailed log messages for lock waits canceled
by lock_timeout or user cancellation, while it's considered feasible for the NOWAIT case.
However, I think it's reasonable to implement this feature step by step. We can start
by adding support for the NOWAIT case and consider extending it to handle lock_timeout and
cancellation scenarios later if possible.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Add “FOR UPDATE NOWAIT” lock details to the log.

From
Fujii Masao
Date:

On 2024/10/18 9:37, Seino Yuki wrote:
> Currently, we are discussing two improvements:
> 
> 1. Log output when NOWAIT fails.
> 2. Adding control via GUC parameters (NOWAIT, lock_timeout, cancellation).
> 
>> I'm not sure why it's challenging to provide detailed log messages for lock waits canceled
>> by lock_timeout or user cancellation, while it's considered feasible for the NOWAIT case.
> 
> Does this statement mean that for 2, why can NOWAIT but lock_timeout,cancellation cannot?
> 
> For item 2, the lock_timeout and cancellation will log outputs after the deadlock_timeout(e.g. 1s) has elapsed (by
log_lock_waits).

This log message isn't directly related to lock_timeout or cancellations. It appears
if log_lock_waits is enabled and the lock wait time exceeds deadlock_timeout,
regardless of whether lock_timeout is set or if users cancel the operation.

I understood your original proposal as suggesting detailed log output whenever
a NOWAIT lock attempt fails. However, as I mentioned earlier, NOWAIT failures can
happen frequently, so logging every failure in detail could be too noisy for some users.
That’s why I proposed adding a new GUC parameter (or extending log_lock_waits)
to control whether such detailed logs should be generated for NOWAIT failures.

During our discussion, I also thought it would be useful to provide detailed logs
for lock waits canceled by user actions or lock_timeout. This behavior could be
controlled by a similar GUC parameter. However, this is an additional feature
beyond your original proposal, so I think it's fine to leave it out of the initial patch.


> At the time this log is output, it is unclear whether the lock will be cancellation or lock_timeout.
> This means that the timing at "error is determined" and "output logged" do not match.
> 
>> However, I think it's reasonable to implement this feature step by step. We can start
>> by adding support for the NOWAIT case and consider extending it to handle lock_timeout and
>> cancellation scenarios later if possible.
> 
> +1.
> I will send the version with the GUC parameter added from the previous patch.

The choice between adding a new GUC or extending the existing one (e.g., log_lock_waits)
is debatable, but I prefer the latter. I'm considering extending log_lock_waits
to accept a value like "fail". If set to "on" (the current behavior),
detailed logs are generated when the lock wait time exceeds deadlock_timeout.
If set to "fail", logs are generated whenever a lock wait fails. If both are
specified, logs would be triggered when the wait time exceeds deadlock_timeout or
when a lock wait fails.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Add “FOR UPDATE NOWAIT” lock details to the log.

From
Seino Yuki
Date:
> During our discussion, I also thought it would be useful to provide detailed logs 
> > for lock waits canceled by user actions or lock_timeout. 

Thank you. I understand now.
You want to output the logs based on a different trigger than 
log_lock_waits.

At first, I found that strange, but I see now that there are cases where 
it's necessary to output logs for errors even when log_lock_waits is 
disabled.

I have also understood the proposal for improving the GUC parameters.
I will revise the patch.

Regards,
--
Yuki Seino
NTT DATA CORPORATION