Thread: Add Information during standby recovery conflicts
Hi hackers,
I've attached a patch to add information during standby recovery conflicts.
The motive behind is to be able to get some information:
- On the apply side
- On the blocker(s) side
When a standby recovery conflict occurs it could be useful to get more information to be able to dive deep on the root cause and find a way to avoid/mitigate new occurrences.
Adding this information would make the investigations easier, it could help answering questions like:
- On which LSN was the WAL apply blocked?
- What was the purpose of the bocked WAL record?
- On which relation (if any) was the blocked WAL record related to?
- What was the blocker(s) doing?
- When did the blocker(s) started their queries (if any)?
- What was the blocker(s) waiting for? on which wait event?
There is 2 points in this patch:
- Add the information about the blocked WAL record. This is done in standby.c (ResolveRecoveryConflictWithVirtualXIDs, ResolveRecoveryConflictWithDatabase, StandbyTimeoutHandler)
- Add the information about the blocker(s). This is done in postgres.c (RecoveryConflictInterrupt)
2020-06-15 06:48:23.774 UTC [6971] LOG: wal record apply is blocked by 2 connection(s), reason: User query might have needed to see row versions that must be removed. 2020-06-15 06:48:23.774 UTC [6971] LOG: blocked wal record rmgr: Heap2, lsn: 0/038E2678, received at: 2020-06-15 06:48:23.774098+00, desc: CLEAN, relation: rel 1663/13586/16652 fork main blk 0 2020-06-15 06:48:54.773 UTC [7088] LOG: about to interrupt pid: 7088, backend_type: client backend, state: active, wait_event_type: Timeout, wait_event: PgSleep, query_start: 2020-06-15 06:48:14.87672+00 2020-06-15 06:48:54.773 UTC [7088] DETAIL: statement: select *, pg_sleep(120) from bdt; 2020-06-15 06:48:54.773 UTC [7088] STATEMENT: select *, pg_sleep(120) from bdt; 2020-06-15 06:48:54.773 UTC [7088] ERROR: canceling statement due to conflict with recovery 2020-06-15 06:48:54.773 UTC [7088] DETAIL: User query might have needed to see row versions that must be removed. 2020-06-15 06:48:54.773 UTC [7088] STATEMENT: select *, pg_sleep(120) from bdt; 2020-06-15 06:48:54.778 UTC [7037] LOG: about to interrupt pid: 7037, backend_type: client backend, state: active, wait_event_type: Timeout, wait_event: PgSleep, query_start: 2020-06-15 06:48:13.008427+00 2020-06-15 06:48:54.778 UTC [7037] DETAIL: statement: select *, pg_sleep(300) from bdt; 2020-06-15 06:48:54.778 UTC [7037] STATEMENT: select *, pg_sleep(300) from bdt; 2020-06-15 06:48:54.778 UTC [7037] ERROR: canceling statement due to conflict with recovery 2020-06-15 06:48:54.778 UTC [7037] DETAIL: User query might have needed to see row versions that must be removed. 2020-06-15 06:48:54.778 UTC [7037] STATEMENT: select *, pg_sleep(300) from bdt;
It takes care of the other conflicts reason too.
So, should a standby recovery conflict occurs, we could see:
- information about the blocked WAL record apply
- information about the blocker(s)
Regards,
Bertrand
Attachment
On Thu, 18 Jun 2020 at 16:28, Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > > Hi hackers, > > I've attached a patch to add information during standby recovery conflicts. > The motive behind is to be able to get some information: > > On the apply side > On the blocker(s) side > > Motivation: > > When a standby recovery conflict occurs it could be useful to get more information to be able to dive deep on the rootcause and find a way to avoid/mitigate new occurrences. I think this is a good feature. Like log_lock_waits, it will help the users to investigate recovery conflict issues. > > Adding this information would make the investigations easier, it could help answering questions like: > > On which LSN was the WAL apply blocked? > What was the purpose of the bocked WAL record? > On which relation (if any) was the blocked WAL record related to? > What was the blocker(s) doing? > When did the blocker(s) started their queries (if any)? > What was the blocker(s) waiting for? on which wait event? > > Technical context and proposal: > > There is 2 points in this patch: > > Add the information about the blocked WAL record. This is done in standby.c (ResolveRecoveryConflictWithVirtualXIDs, ResolveRecoveryConflictWithDatabase,StandbyTimeoutHandler) I think we already have the information about the WAL record being applied in errcontext. I wonder if we can show the recovery conflict information in the main LOG message, the blocker information in errdetail, and use errcontext with regard to WAL record information. For example: LOG: process 500 waiting for recovery conflict on snapshot DETAIL: conflicting transition id: 100, 200, 300 CONTEXT: WAL redo at 0/3001970 for Heap2/CLEAN: remxid 506 > Outcome Example: > > 2020-06-15 06:48:23.774 UTC [6971] LOG: wal record apply is blocked by 2 connection(s), reason: User query might haveneeded to see row versions that must be removed. > 2020-06-15 06:48:23.774 UTC [6971] LOG: blocked wal record rmgr: Heap2, lsn: 0/038E2678, received at: 2020-06-15 06:48:23.774098+00,desc: CLEAN, relation: rel 1663/13586/16652 fork main blk 0 > 2020-06-15 06:48:54.773 UTC [7088] LOG: about to interrupt pid: 7088, backend_type: client backend, state: active, wait_event_type:Timeout, wait_event: PgSleep, query_start: 2020-06-15 06:48:14.87672+00 I'm concerned that the above information on the process who is about to be interrupted is very detailed but I'm not sure it will be helpful for the users. If the blocker is waiting on something lock, the information should be logged by log_lock_waits. Also the canceled backend will emit the ERROR log with the message "canceling statement due to conflict with recovery”, and its pid can be logged at the log prefix. The user can know who has been canceled by recovery conflicts and the query. Probably we need to consider having a time threshold (or boolean to turn on/off) to emit this information to the server logs. Otherwise, we will end up writing every conflict information, making the log dirty needlessly. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hey, On 6/29/20 10:55 AM, Masahiko Sawada wrote: > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you canconfirm the sender and know the content is safe. > > > > On Thu, 18 Jun 2020 at 16:28, Drouvot, Bertrand <bdrouvot@amazon.com> wrote: >> Hi hackers, >> >> I've attached a patch to add information during standby recovery conflicts. >> The motive behind is to be able to get some information: >> >> On the apply side >> On the blocker(s) side >> >> Motivation: >> >> When a standby recovery conflict occurs it could be useful to get more information to be able to dive deep on the rootcause and find a way to avoid/mitigate new occurrences. > I think this is a good feature. Like log_lock_waits, it will help the > users to investigate recovery conflict issues. exactly, thanks for looking at it. > >> Adding this information would make the investigations easier, it could help answering questions like: >> >> On which LSN was the WAL apply blocked? >> What was the purpose of the bocked WAL record? >> On which relation (if any) was the blocked WAL record related to? >> What was the blocker(s) doing? >> When did the blocker(s) started their queries (if any)? >> What was the blocker(s) waiting for? on which wait event? >> >> Technical context and proposal: >> >> There is 2 points in this patch: >> >> Add the information about the blocked WAL record. This is done in standby.c (ResolveRecoveryConflictWithVirtualXIDs, ResolveRecoveryConflictWithDatabase,StandbyTimeoutHandler) > I think we already have the information about the WAL record being > applied in errcontext. right, but it won’t be displayed in case log_error_verbosity is set to terse. Or did you had in mind to try to avoid using the new “current_replayed_xlog” in xlog.c? > > I wonder if we can show the recovery conflict information in the main > LOG message, the blocker information in errdetail, and use errcontext > with regard to WAL record information. For example: > > LOG: process 500 waiting for recovery conflict on snapshot > DETAIL: conflicting transition id: 100, 200, 300 > CONTEXT: WAL redo at 0/3001970 for Heap2/CLEAN: remxid 506 Not sure at all if it would be possible to put all the information at the same time. For example in case of shared buffer pin conflict, the blocker is currently known “only” during the RecoveryConflictInterrupt call (so still not known yet when we can “already” report the blocked LSN information). It might also happen that the blocker(s) will never get interrupted (was not blocking anymore before max_standby_streaming_delay has been reached): then it would not be possible to display all the information here (aka when it is interrupted) while we still want to be warned that the WAL replay has been blocked. > >> Outcome Example: >> >> 2020-06-15 06:48:23.774 UTC [6971] LOG: wal record apply is blocked by 2 connection(s), reason: User query might haveneeded to see row versions that must be removed. >> 2020-06-15 06:48:23.774 UTC [6971] LOG: blocked wal record rmgr: Heap2, lsn: 0/038E2678, received at: 2020-06-15 06:48:23.774098+00,desc: CLEAN, relation: rel 1663/13586/16652 fork main blk 0 >> 2020-06-15 06:48:54.773 UTC [7088] LOG: about to interrupt pid: 7088, backend_type: client backend, state: active, wait_event_type:Timeout, wait_event: PgSleep, query_start: 2020-06-15 06:48:14.87672+00 > I'm concerned that the above information on the process who is about > to be interrupted is very detailed but I'm not sure it will be helpful > for the users. If the blocker is waiting on something lock, the > information should be logged by log_lock_waits. The blocker could also be in “idle in transaction”, say for example, on the standby (with hot_standby_feedback set to off): standby> begin; standby> SET TRANSACTION ISOLATION LEVEL REPEATABLE READ; standby> select * from bdt; on the master: master> update bdt set generate_series = 15; master> vacuum verbose bdt; would produce: 2020-07-01 09:18:55.256 UTC [32751] LOG: about to interrupt pid: 32751, backend_type: client backend, state: idle in transaction, wait_event_type: Client, wait_event: ClientRead, query_start: 2020-07-01 09:18:17.390572+00 2020-07-01 09:18:55.256 UTC [32751] DETAIL: statement: select * from bdt; I think those information are useful to have (might get rid of wait_event_type though: done in the new attached patch). > Also the canceled > backend will emit the ERROR log with the message "canceling statement > due to conflict with recovery”, and its pid can be logged at the log > prefix. The user can know who has been canceled by recovery conflicts > and the query. Right, we can also get rid of the pid and the query in the new log message too (done in the new attached patch). > Probably we need to consider having a time threshold (or boolean to > turn on/off) to emit this information to the server logs. Otherwise, > we will end up writing every conflict information, making the log > dirty needlessly. good point, I do agree (done in the new attached patch). Bertrand > > Regards, > > -- > Masahiko Sawada http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On Wed, 1 Jul 2020 at 21:52, Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > > Hey, > > On 6/29/20 10:55 AM, Masahiko Sawada wrote: > > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you canconfirm the sender and know the content is safe. > > > > > > > > On Thu, 18 Jun 2020 at 16:28, Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > >> Hi hackers, > >> > >> I've attached a patch to add information during standby recovery conflicts. > >> The motive behind is to be able to get some information: > >> > >> On the apply side > >> On the blocker(s) side > >> > >> Motivation: > >> > >> When a standby recovery conflict occurs it could be useful to get more information to be able to dive deep on the rootcause and find a way to avoid/mitigate new occurrences. > > I think this is a good feature. Like log_lock_waits, it will help the > > users to investigate recovery conflict issues. > exactly, thanks for looking at it. > > > >> Adding this information would make the investigations easier, it could help answering questions like: > >> > >> On which LSN was the WAL apply blocked? > >> What was the purpose of the bocked WAL record? > >> On which relation (if any) was the blocked WAL record related to? > >> What was the blocker(s) doing? > >> When did the blocker(s) started their queries (if any)? > >> What was the blocker(s) waiting for? on which wait event? > >> > >> Technical context and proposal: > >> > >> There is 2 points in this patch: > >> > >> Add the information about the blocked WAL record. This is done in standby.c (ResolveRecoveryConflictWithVirtualXIDs,ResolveRecoveryConflictWithDatabase, StandbyTimeoutHandler) > > I think we already have the information about the WAL record being > > applied in errcontext. > right, but it won’t be displayed in case log_error_verbosity is set to > terse. Yes. I think in this case errcontext or errdetail is more appropriate for this information in this case. Otherwise, we will end up emitting same WAL record information twice in log_error_verbosity = verbose. For instance, here is the server logs when I tested a recovery conflict on buffer pin but it looks redundant: 2020-07-03 11:01:15.339 JST [60585] LOG: wal record apply is blocked by 1 connection(s), reason: User query might have needed to see row versions that must be removed. 2020-07-03 11:01:15.339 JST [60585] CONTEXT: WAL redo at 0/30025E0 for Heap2/CLEAN: remxid 505 2020-07-03 11:01:15.339 JST [60585] LOG: blocked wal record rmgr: Heap2, lsn: 0/030025E0, received at: 2020-07-03 11:01:15.338997+09, desc: CLEAN, relation: rel 1663/12930/16384 fork main blk 0 2020-07-03 11:01:15.339 JST [60585] CONTEXT: WAL redo at 0/30025E0 for Heap2/CLEAN: remxid 505 2020-07-03 11:01:15.347 JST [60604] LOG: about to be interrupted: backend_type: client backend, state: active, wait_event: PgSleep, query_start: 2020-07-03 11:01:14.337708+09 2020-07-03 11:01:15.347 JST [60604] STATEMENT: select pg_sleep(30); 2020-07-03 11:01:15.347 JST [60604] ERROR: canceling statement due to conflict with recovery 2020-07-03 11:01:15.347 JST [60604] DETAIL: User query might have needed to see row versions that must be removed. 2020-07-03 11:01:15.347 JST [60604] STATEMENT: select pg_sleep(30); There are the information WAL record three times and the reason for interruption twice. > Or did you had in mind to try to avoid using the new > “current_replayed_xlog” in xlog.c? Regarding LogBlockedWalRecordInfo(), it seems to me that the main message of this logging is to let users know both that the process got stuck due to recovery conflict and its reason (lock, database, bufferpin etc). Other information such as the details of blocked WAL, how many processes are blocking seems like additional information. So I think this information would be better to be shown errcontext or errdetails and we don’t need to create a similar feature as we already show the WAL record being replayed in errcontext. Also, this function emits two LOG messages related to each other but I'm concerned that it can be hard for users to associate these separate log lines especially when server logs are written at a high rate. And I think these two log lines don't follow our error message style[1]. > > > > > I wonder if we can show the recovery conflict information in the main > > LOG message, the blocker information in errdetail, and use errcontext > > with regard to WAL record information. For example: > > > > LOG: process 500 waiting for recovery conflict on snapshot > > DETAIL: conflicting transition id: 100, 200, 300 > > CONTEXT: WAL redo at 0/3001970 for Heap2/CLEAN: remxid 506 > Not sure at all if it would be possible to put all the information at > the same time. > For example in case of shared buffer pin conflict, the blocker is > currently known “only” during the RecoveryConflictInterrupt call (so > still not known yet when we can “already” report the blocked LSN > information). > It might also happen that the blocker(s) will never get interrupted (was > not blocking anymore before max_standby_streaming_delay has been > reached): then it would not be possible to display all the information > here (aka when it is interrupted) while we still want to be warned that > the WAL replay has been blocked. I think it's okay with showing different information for different types of recovery conflict. In buffer pin conflict case, I think we can let the user know the process is waiting for recovery conflict on buffer pin in the main message and the WAL record being replayed in errdetails. > > > >> Outcome Example: > >> > >> 2020-06-15 06:48:23.774 UTC [6971] LOG: wal record apply is blocked by 2 connection(s), reason: User query might haveneeded to see row versions that must be removed. > >> 2020-06-15 06:48:23.774 UTC [6971] LOG: blocked wal record rmgr: Heap2, lsn: 0/038E2678, received at: 2020-06-15 06:48:23.774098+00,desc: CLEAN, relation: rel 1663/13586/16652 fork main blk 0 > >> 2020-06-15 06:48:54.773 UTC [7088] LOG: about to interrupt pid: 7088, backend_type: client backend, state: active,wait_event_type: Timeout, wait_event: PgSleep, query_start: 2020-06-15 06:48:14.87672+00 > > I'm concerned that the above information on the process who is about > > to be interrupted is very detailed but I'm not sure it will be helpful > > for the users. If the blocker is waiting on something lock, the > > information should be logged by log_lock_waits. > > The blocker could also be in “idle in transaction”, say for example, on > the standby (with hot_standby_feedback set to off): > > standby> begin; > standby> SET TRANSACTION ISOLATION LEVEL REPEATABLE READ; > standby> select * from bdt; > > on the master: > > master> update bdt set generate_series = 15; > master> vacuum verbose bdt; > > would produce: > > 2020-07-01 09:18:55.256 UTC [32751] LOG: about to interrupt pid: 32751, > backend_type: client backend, state: idle in transaction, > wait_event_type: Client, wait_event: ClientRead, query_start: 2020-07-01 > 09:18:17.390572+00 > 2020-07-01 09:18:55.256 UTC [32751] DETAIL: statement: select * from bdt; > > I think those information are useful to have (might get rid of > wait_event_type though: done in the new attached patch). Since the backend cancels query at a convenient time (CHECK_FOR_INTERRUPT()), I'm concerned that the wait event information displayed in that log might not be helpful. For example, even if the blocker is getting stuck on disk I/O (WAIT_EVENT_BUFFILE_READ etc) while holding a lock that the recovery process is waiting for, when the blocker is able to cancel the query it's no longer waiting for disk I/O. Also, regarding displaying the backend type, we already show the backend type by setting %d in log_line_prefix. I still think we should show this additional information (wait event, query start etc) in errdetails even if we want to show this information in the server logs. Perhaps we can improve errdetail_recovery_conflict()? Aside from the above comments from the perspective of high-level design, I think we can split this patch into two patches: logging the recovery process is waiting (adding log_recovery_conficts) and logging the details of blockers who is about to be interrupted. I think it'll make review easy. Regards, [1] https://www.postgresql.org/docs/devel/error-style-guide.html -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 7/3/20 4:20 AM, Masahiko Sawada wrote: > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you canconfirm the sender and know the content is safe. > > > > On Wed, 1 Jul 2020 at 21:52, Drouvot, Bertrand <bdrouvot@amazon.com> wrote: >> Hey, >> >> On 6/29/20 10:55 AM, Masahiko Sawada wrote: >>> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you canconfirm the sender and know the content is safe. >>> >>> >>> >>> On Thu, 18 Jun 2020 at 16:28, Drouvot, Bertrand <bdrouvot@amazon.com> wrote: >>>> Hi hackers, >>>> >>>> I've attached a patch to add information during standby recovery conflicts. >>>> The motive behind is to be able to get some information: >>>> >>>> On the apply side >>>> On the blocker(s) side >>>> >>>> Motivation: >>>> >>>> When a standby recovery conflict occurs it could be useful to get more information to be able to dive deep on the rootcause and find a way to avoid/mitigate new occurrences. >>> I think this is a good feature. Like log_lock_waits, it will help the >>> users to investigate recovery conflict issues. >> exactly, thanks for looking at it. >>>> Adding this information would make the investigations easier, it could help answering questions like: >>>> >>>> On which LSN was the WAL apply blocked? >>>> What was the purpose of the bocked WAL record? >>>> On which relation (if any) was the blocked WAL record related to? >>>> What was the blocker(s) doing? >>>> When did the blocker(s) started their queries (if any)? >>>> What was the blocker(s) waiting for? on which wait event? >>>> >>>> Technical context and proposal: >>>> >>>> There is 2 points in this patch: >>>> >>>> Add the information about the blocked WAL record. This is done in standby.c (ResolveRecoveryConflictWithVirtualXIDs,ResolveRecoveryConflictWithDatabase, StandbyTimeoutHandler) >>> I think we already have the information about the WAL record being >>> applied in errcontext. >> right, but it won’t be displayed in case log_error_verbosity is set to >> terse. > Yes. I think in this case errcontext or errdetail is more appropriate > for this information in this case. Otherwise, we will end up emitting > same WAL record information twice in log_error_verbosity = verbose. > > For instance, here is the server logs when I tested a recovery > conflict on buffer pin but it looks redundant: > > 2020-07-03 11:01:15.339 JST [60585] LOG: wal record apply is blocked > by 1 connection(s), reason: User query might have needed to see row > versions that must be removed. > 2020-07-03 11:01:15.339 JST [60585] CONTEXT: WAL redo at 0/30025E0 > for Heap2/CLEAN: remxid 505 > 2020-07-03 11:01:15.339 JST [60585] LOG: blocked wal record rmgr: > Heap2, lsn: 0/030025E0, received at: 2020-07-03 11:01:15.338997+09, > desc: CLEAN, relation: rel 1663/12930/16384 fork main blk 0 > 2020-07-03 11:01:15.339 JST [60585] CONTEXT: WAL redo at 0/30025E0 > for Heap2/CLEAN: remxid 505 > 2020-07-03 11:01:15.347 JST [60604] LOG: about to be interrupted: > backend_type: client backend, state: active, wait_event: PgSleep, > query_start: 2020-07-03 11:01:14.337708+09 > 2020-07-03 11:01:15.347 JST [60604] STATEMENT: select pg_sleep(30); > 2020-07-03 11:01:15.347 JST [60604] ERROR: canceling statement due to > conflict with recovery > 2020-07-03 11:01:15.347 JST [60604] DETAIL: User query might have > needed to see row versions that must be removed. > 2020-07-03 11:01:15.347 JST [60604] STATEMENT: select pg_sleep(30); > > There are the information WAL record three times and the reason for > interruption twice. Fully makes sense, the new patch version attached is now producing: 2020-07-06 06:10:36.022 UTC [14035] LOG: waiting for recovery conflict on snapshot 2020-07-06 06:10:36.022 UTC [14035] DETAIL: WAL record received at 2020-07-06 06:10:36.021963+00. Tablespace/database/relation are 1663/13586/16672, fork is main and block is 0. There is 1 blocking connection(s). 2020-07-06 06:10:36.022 UTC [14035] CONTEXT: WAL redo at 0/3A05708 for Heap2/CLEAN: remxid 972 It does not provide the pid(s) of the blocking processes as they'll appear during the interruption(s). >> Or did you had in mind to try to avoid using the new >> “current_replayed_xlog” in xlog.c? > Regarding LogBlockedWalRecordInfo(), it seems to me that the main > message of this logging is to let users know both that the process got > stuck due to recovery conflict and its reason (lock, database, > bufferpin etc). Other information such as the details of blocked WAL, > how many processes are blocking seems like additional information. So > I think this information would be better to be shown errcontext or > errdetails and we don’t need to create a similar feature as we already > show the WAL record being replayed in errcontext. I got rid of current_replayed_xlog in the new patch attached and make use of the errcontext argument instead. > > Also, this function emits two LOG messages related to each other but > I'm concerned that it can be hard for users to associate these > separate log lines especially when server logs are written at a high > rate. And I think these two log lines don't follow our error message > style[1]. > >>> I wonder if we can show the recovery conflict information in the main >>> LOG message, the blocker information in errdetail, and use errcontext >>> with regard to WAL record information. For example: >>> >>> LOG: process 500 waiting for recovery conflict on snapshot >>> DETAIL: conflicting transition id: 100, 200, 300 >>> CONTEXT: WAL redo at 0/3001970 for Heap2/CLEAN: remxid 506 >> Not sure at all if it would be possible to put all the information at >> the same time. >> For example in case of shared buffer pin conflict, the blocker is >> currently known “only” during the RecoveryConflictInterrupt call (so >> still not known yet when we can “already” report the blocked LSN >> information). >> It might also happen that the blocker(s) will never get interrupted (was >> not blocking anymore before max_standby_streaming_delay has been >> reached): then it would not be possible to display all the information >> here (aka when it is interrupted) while we still want to be warned that >> the WAL replay has been blocked. > I think it's okay with showing different information for different > types of recovery conflict. In buffer pin conflict case, I think we > can let the user know the process is waiting for recovery conflict on > buffer pin in the main message and the WAL record being replayed in > errdetails. > >>>> Outcome Example: >>>> >>>> 2020-06-15 06:48:23.774 UTC [6971] LOG: wal record apply is blocked by 2 connection(s), reason: User query might haveneeded to see row versions that must be removed. >>>> 2020-06-15 06:48:23.774 UTC [6971] LOG: blocked wal record rmgr: Heap2, lsn: 0/038E2678, received at: 2020-06-15 06:48:23.774098+00,desc: CLEAN, relation: rel 1663/13586/16652 fork main blk 0 >>>> 2020-06-15 06:48:54.773 UTC [7088] LOG: about to interrupt pid: 7088, backend_type: client backend, state: active,wait_event_type: Timeout, wait_event: PgSleep, query_start: 2020-06-15 06:48:14.87672+00 >>> I'm concerned that the above information on the process who is about >>> to be interrupted is very detailed but I'm not sure it will be helpful >>> for the users. If the blocker is waiting on something lock, the >>> information should be logged by log_lock_waits. >> The blocker could also be in “idle in transaction”, say for example, on >> the standby (with hot_standby_feedback set to off): >> >> standby> begin; >> standby> SET TRANSACTION ISOLATION LEVEL REPEATABLE READ; >> standby> select * from bdt; >> >> on the master: >> >> master> update bdt set generate_series = 15; >> master> vacuum verbose bdt; >> >> would produce: >> >> 2020-07-01 09:18:55.256 UTC [32751] LOG: about to interrupt pid: 32751, >> backend_type: client backend, state: idle in transaction, >> wait_event_type: Client, wait_event: ClientRead, query_start: 2020-07-01 >> 09:18:17.390572+00 >> 2020-07-01 09:18:55.256 UTC [32751] DETAIL: statement: select * from bdt; >> >> I think those information are useful to have (might get rid of >> wait_event_type though: done in the new attached patch). > Since the backend cancels query at a convenient time > (CHECK_FOR_INTERRUPT()), I'm concerned that the wait event information > displayed in that log might not be helpful. For example, even if the > blocker is getting stuck on disk I/O (WAIT_EVENT_BUFFILE_READ etc) > while holding a lock that the recovery process is waiting for, when > the blocker is able to cancel the query it's no longer waiting for > disk I/O. Also, regarding displaying the backend type, we already show > the backend type by setting %d in log_line_prefix. > > I still think we should show this additional information (wait event, > query start etc) in errdetails even if we want to show this > information in the server logs. Perhaps we can improve > errdetail_recovery_conflict()? > > Aside from the above comments from the perspective of high-level > design, I think we can split this patch into two patches: logging the > recovery process is waiting (adding log_recovery_conficts) and logging > the details of blockers who is about to be interrupted. I think it'll > make review easy. Ok. Let's keep this thread for the new attached patch that focus on the recovery process waiting. I'll create a new one with a dedicated thread for the blockers information once done. Thanks Bertrand > Regards, > > [1] https://www.postgresql.org/docs/devel/error-style-guide.html > > -- > Masahiko Sawada http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On Mon, 6 Jul 2020 at 15:42, Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > > > On 7/3/20 4:20 AM, Masahiko Sawada wrote: > > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you canconfirm the sender and know the content is safe. > > > > > > > > On Wed, 1 Jul 2020 at 21:52, Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > >> Hey, > >> > >> On 6/29/20 10:55 AM, Masahiko Sawada wrote: > >>> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless youcan confirm the sender and know the content is safe. > >>> > >>> > >>> > >>> On Thu, 18 Jun 2020 at 16:28, Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > >>>> Hi hackers, > >>>> > >>>> I've attached a patch to add information during standby recovery conflicts. > >>>> The motive behind is to be able to get some information: > >>>> > >>>> On the apply side > >>>> On the blocker(s) side > >>>> > >>>> Motivation: > >>>> > >>>> When a standby recovery conflict occurs it could be useful to get more information to be able to dive deep on theroot cause and find a way to avoid/mitigate new occurrences. > >>> I think this is a good feature. Like log_lock_waits, it will help the > >>> users to investigate recovery conflict issues. > >> exactly, thanks for looking at it. > >>>> Adding this information would make the investigations easier, it could help answering questions like: > >>>> > >>>> On which LSN was the WAL apply blocked? > >>>> What was the purpose of the bocked WAL record? > >>>> On which relation (if any) was the blocked WAL record related to? > >>>> What was the blocker(s) doing? > >>>> When did the blocker(s) started their queries (if any)? > >>>> What was the blocker(s) waiting for? on which wait event? > >>>> > >>>> Technical context and proposal: > >>>> > >>>> There is 2 points in this patch: > >>>> > >>>> Add the information about the blocked WAL record. This is done in standby.c (ResolveRecoveryConflictWithVirtualXIDs,ResolveRecoveryConflictWithDatabase, StandbyTimeoutHandler) > >>> I think we already have the information about the WAL record being > >>> applied in errcontext. > >> right, but it won’t be displayed in case log_error_verbosity is set to > >> terse. > > Yes. I think in this case errcontext or errdetail is more appropriate > > for this information in this case. Otherwise, we will end up emitting > > same WAL record information twice in log_error_verbosity = verbose. > > > > For instance, here is the server logs when I tested a recovery > > conflict on buffer pin but it looks redundant: > > > > 2020-07-03 11:01:15.339 JST [60585] LOG: wal record apply is blocked > > by 1 connection(s), reason: User query might have needed to see row > > versions that must be removed. > > 2020-07-03 11:01:15.339 JST [60585] CONTEXT: WAL redo at 0/30025E0 > > for Heap2/CLEAN: remxid 505 > > 2020-07-03 11:01:15.339 JST [60585] LOG: blocked wal record rmgr: > > Heap2, lsn: 0/030025E0, received at: 2020-07-03 11:01:15.338997+09, > > desc: CLEAN, relation: rel 1663/12930/16384 fork main blk 0 > > 2020-07-03 11:01:15.339 JST [60585] CONTEXT: WAL redo at 0/30025E0 > > for Heap2/CLEAN: remxid 505 > > 2020-07-03 11:01:15.347 JST [60604] LOG: about to be interrupted: > > backend_type: client backend, state: active, wait_event: PgSleep, > > query_start: 2020-07-03 11:01:14.337708+09 > > 2020-07-03 11:01:15.347 JST [60604] STATEMENT: select pg_sleep(30); > > 2020-07-03 11:01:15.347 JST [60604] ERROR: canceling statement due to > > conflict with recovery > > 2020-07-03 11:01:15.347 JST [60604] DETAIL: User query might have > > needed to see row versions that must be removed. > > 2020-07-03 11:01:15.347 JST [60604] STATEMENT: select pg_sleep(30); > > > > There are the information WAL record three times and the reason for > > interruption twice. > > > Fully makes sense, the new patch version attached is now producing: > > 2020-07-06 06:10:36.022 UTC [14035] LOG: waiting for recovery conflict > on snapshot How about adding the subject? that is, "recovery is waiting for recovery conflict on %s" or "recovery process <pid> is waiting for conflict on %s". > 2020-07-06 06:10:36.022 UTC [14035] DETAIL: WAL record received at > 2020-07-06 06:10:36.021963+00. > Tablespace/database/relation are 1663/13586/16672, fork is main > and block is 0. > There is 1 blocking connection(s). To follow the existing log message, perhaps this can be something like "WAL record received at %s, rel %u/%u/%u, fork %s, blkno %u. %d processes". But I'm not sure the errdetail is the best place to display the WAL information as I mentioned in the latter part of this email. > 2020-07-06 06:10:36.022 UTC [14035] CONTEXT: WAL redo at 0/3A05708 for > Heap2/CLEAN: remxid 972 > > It does not provide the pid(s) of the blocking processes as they'll > appear during the interruption(s). > > >> Or did you had in mind to try to avoid using the new > >> “current_replayed_xlog” in xlog.c? > > Regarding LogBlockedWalRecordInfo(), it seems to me that the main > > message of this logging is to let users know both that the process got > > stuck due to recovery conflict and its reason (lock, database, > > bufferpin etc). Other information such as the details of blocked WAL, > > how many processes are blocking seems like additional information. So > > I think this information would be better to be shown errcontext or > > errdetails and we don’t need to create a similar feature as we already > > show the WAL record being replayed in errcontext. > > > I got rid of current_replayed_xlog in the new patch attached and make > use of the errcontext argument instead. > > > > > Also, this function emits two LOG messages related to each other but > > I'm concerned that it can be hard for users to associate these > > separate log lines especially when server logs are written at a high > > rate. And I think these two log lines don't follow our error message > > style[1]. > > > >>> I wonder if we can show the recovery conflict information in the main > >>> LOG message, the blocker information in errdetail, and use errcontext > >>> with regard to WAL record information. For example: > >>> > >>> LOG: process 500 waiting for recovery conflict on snapshot > >>> DETAIL: conflicting transition id: 100, 200, 300 > >>> CONTEXT: WAL redo at 0/3001970 for Heap2/CLEAN: remxid 506 > >> Not sure at all if it would be possible to put all the information at > >> the same time. > >> For example in case of shared buffer pin conflict, the blocker is > >> currently known “only” during the RecoveryConflictInterrupt call (so > >> still not known yet when we can “already” report the blocked LSN > >> information). > >> It might also happen that the blocker(s) will never get interrupted (was > >> not blocking anymore before max_standby_streaming_delay has been > >> reached): then it would not be possible to display all the information > >> here (aka when it is interrupted) while we still want to be warned that > >> the WAL replay has been blocked. > > I think it's okay with showing different information for different > > types of recovery conflict. In buffer pin conflict case, I think we > > can let the user know the process is waiting for recovery conflict on > > buffer pin in the main message and the WAL record being replayed in > > errdetails. > > > >>>> Outcome Example: > >>>> > >>>> 2020-06-15 06:48:23.774 UTC [6971] LOG: wal record apply is blocked by 2 connection(s), reason: User query mighthave needed to see row versions that must be removed. > >>>> 2020-06-15 06:48:23.774 UTC [6971] LOG: blocked wal record rmgr: Heap2, lsn: 0/038E2678, received at: 2020-06-1506:48:23.774098+00, desc: CLEAN, relation: rel 1663/13586/16652 fork main blk 0 > >>>> 2020-06-15 06:48:54.773 UTC [7088] LOG: about to interrupt pid: 7088, backend_type: client backend, state: active,wait_event_type: Timeout, wait_event: PgSleep, query_start: 2020-06-15 06:48:14.87672+00 > >>> I'm concerned that the above information on the process who is about > >>> to be interrupted is very detailed but I'm not sure it will be helpful > >>> for the users. If the blocker is waiting on something lock, the > >>> information should be logged by log_lock_waits. > >> The blocker could also be in “idle in transaction”, say for example, on > >> the standby (with hot_standby_feedback set to off): > >> > >> standby> begin; > >> standby> SET TRANSACTION ISOLATION LEVEL REPEATABLE READ; > >> standby> select * from bdt; > >> > >> on the master: > >> > >> master> update bdt set generate_series = 15; > >> master> vacuum verbose bdt; > >> > >> would produce: > >> > >> 2020-07-01 09:18:55.256 UTC [32751] LOG: about to interrupt pid: 32751, > >> backend_type: client backend, state: idle in transaction, > >> wait_event_type: Client, wait_event: ClientRead, query_start: 2020-07-01 > >> 09:18:17.390572+00 > >> 2020-07-01 09:18:55.256 UTC [32751] DETAIL: statement: select * from bdt; > >> > >> I think those information are useful to have (might get rid of > >> wait_event_type though: done in the new attached patch). > > Since the backend cancels query at a convenient time > > (CHECK_FOR_INTERRUPT()), I'm concerned that the wait event information > > displayed in that log might not be helpful. For example, even if the > > blocker is getting stuck on disk I/O (WAIT_EVENT_BUFFILE_READ etc) > > while holding a lock that the recovery process is waiting for, when > > the blocker is able to cancel the query it's no longer waiting for > > disk I/O. Also, regarding displaying the backend type, we already show > > the backend type by setting %d in log_line_prefix. > > > > I still think we should show this additional information (wait event, > > query start etc) in errdetails even if we want to show this > > information in the server logs. Perhaps we can improve > > errdetail_recovery_conflict()? > > > > Aside from the above comments from the perspective of high-level > > design, I think we can split this patch into two patches: logging the > > recovery process is waiting (adding log_recovery_conficts) and logging > > the details of blockers who is about to be interrupted. I think it'll > > make review easy. > > > Ok. Let's keep this thread for the new attached patch that focus on the > recovery process waiting. Thank you for updating the patch! I've tested the latest patch. On recovery conflict on lock and on bufferpin, if max_standby_streaming_delay is disabled (set to -1), the logs don't appear even if log_recovery_conflicts is true. Here is random comments on the code: + recovery_conflict_main_message = psprintf("waiting for recovery conflict on %s", + get_procsignal_reason_desc(reason)); : + ereport(LOG, + (errmsg("%s", recovery_conflict_main_message), + errdetail("%s\n" "There is %d blocking connection(s).", wal_record_detail_str, num_waitlist_entries))); It's not translation-support-friendly. I think the message "waiting for recovery conflict on %s" should be surrounded by _(). Or we can just put it to ereport as follows: ereport(LOG, (errmsg("waiting for recovery conflicts on %s", get_procsignal_reason_desc(reason)) ... --- + oldcontext = MemoryContextSwitchTo(ErrorContext); + econtext = error_context_stack; + + if (XLogRecGetBlockTag(econtext->arg, 0, &rnode, &forknum, &blknum)) I don't think it's a good idea to rely on error_context_stack because other codes might set another error context before reaching here in the future. --- + if (XLogRecGetBlockTag(econtext->arg, 0, &rnode, &forknum, &blknum)) + wal_record_detail_str = psprintf("WAL record received at %s.\nTablespace/database/relation are %u/%u/%u, fork is %s and block is %u.", + receipt_time_str, rnode.spcNode, rnode.dbNode, rnode.relNode, + forkNames[forknum], + blknum); There might be a block tag in block ids other than 0. I'm not sure the errdetail is the best place where we display WAL information. For instance, we display both the relation oid and block number depending on RM as follows: 2020-07-07 15:50:30.733 JST [13344] LOG: waiting for recovery conflict on lock 2020-07-07 15:50:30.733 JST [13344] DETAIL: WAL record received at 2020-07-07 15:50:27.73378+09. There is 1 blocking connection(s). 2020-07-07 15:50:30.733 JST [13344] CONTEXT: WAL redo at 0/3000028 for Standby/LOCK: xid 506 db 13586 rel 16384 I wonder if we can display the details of redo WAL information by improving xlog_outdesc() or rm_redo_error_callback() so that it displays relfilenode, forknum, and block number. What do you think? --- + /* display wal record information */ + if (log_recovery_conflicts) + LogBlockedWalRecordInfo(num_waitlist_entries, reason); I'm concerned that we should log the recovery conflict information every time when a recovery conflict happens from the perspective of overheads and the amount of the logs. Can we logs that information after waiting for deadlock_timeouts secs like log_lock_waits or waiting for the fixed duration? --- @@ -609,6 +682,10 @@ StandbyTimeoutHandler(void) /* forget any pending STANDBY_DEADLOCK_TIMEOUT request */ disable_timeout(STANDBY_DEADLOCK_TIMEOUT, false); + /* display wal record information */ + if (log_recovery_conflicts) + LogBlockedWalRecordInfo(-1, PROCSIG_RECOVERY_CONFLICT_BUFFERPIN); + SendRecoveryConflictWithBufferPin(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN); } ResolveRecoveryConflictWithBufferPin() which sets a timer to call StandbyTimeoutHandler() can be called multiple times even if the recovery is waiting for one buffer pin. I think we should avoid logging the same contents multiple times. --- - + { + {"log_recovery_conflicts", PGC_SUSET, LOGGING_WHAT, + gettext_noop("Logs standby recovery conflicts."), + NULL + }, + &log_recovery_conflicts, + true, + NULL, NULL, NULL + }, Other logging parameters such as log_lock_waits is false by default. I think this parameter can also be false by default but is there any reason to enable it by default? Since this parameter applies only to the startup process, I think it should be PGC_SIGHUP. --- + /* display wal record information */ + if (log_recovery_conflicts) + LogBlockedWalRecordInfo(CountDBBackends(dbid), PROCSIG_RECOVERY_CONFLICT_DATABASE); We log the recovery conflict into the server log but we don't update the process title to append "waiting". While discussing the process title update on recovery conflict, I got the review comment[1] that we don't need to update the process title because no wait occurs when recovery conflict with database happens. As the comment says, recovery is canceling the existing processes on the database being removed, but not waiting for something. Regards, [1] https://www.postgresql.org/message-id/84e4ea5f-80ec-9862-d419-c4433b3c2965%40oss.nttdata.com -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 7/7/20 9:43 AM, Masahiko Sawada wrote: > Fully makes sense, the new patch version attached is now producing: >> 2020-07-06 06:10:36.022 UTC [14035] LOG: waiting for recovery conflict >> on snapshot > How about adding the subject? that is, "recovery is waiting for > recovery conflict on %s" or "recovery process <pid> is waiting for > conflict on %s". The subject is now added in the new attached patch (I did not include the pid as it is part of the log prefix). It now looks like: 2020-07-11 12:00:41.092 UTC [23217] LOG: recovery is waiting for recovery conflict on snapshot 2020-07-11 12:00:41.092 UTC [23217] DETAIL: There is 1 blocking connection(s). 2020-07-11 12:00:41.092 UTC [23217] CONTEXT: WAL redo at 0/4A0A6BF0 for Heap2/CLEAN: remxid 1128 WAL record received at 2020-07-11 12:00:41.092231+00 tbs 1663 db 13586 rel 16805, fork main, blkno 0 > >> 2020-07-06 06:10:36.022 UTC [14035] DETAIL: WAL record received at >> 2020-07-06 06:10:36.021963+00. >> Tablespace/database/relation are 1663/13586/16672, fork is main >> and block is 0. >> There is 1 blocking connection(s). > To follow the existing log message, perhaps this can be something like > "WAL record received at %s, rel %u/%u/%u, fork %s, blkno %u. %d > processes". But I'm not sure the errdetail is the best place to > display the WAL information as I mentioned in the latter part of this > email. moved to the context and formatted the same way as the current Standby/LOCK context. > Ok. Let's keep this thread for the new attached patch that focus on the >> recovery process waiting. > Thank you for updating the patch! > > I've tested the latest patch. Thank you for testing and reviewing! > On recovery conflict on lock and on > bufferpin, if max_standby_streaming_delay is disabled (set to -1), the > logs don't appear even if log_recovery_conflicts is true. Nice catch! it is fixed in the new attached patch (the log reporting has been moved out of StandbyTimeoutHandler()). > > Here is random comments on the code: > > + recovery_conflict_main_message = psprintf("waiting for > recovery conflict on %s", > + > get_procsignal_reason_desc(reason)); > : > + ereport(LOG, > + (errmsg("%s", recovery_conflict_main_message), > + errdetail("%s\n" "There is %d blocking > connection(s).", wal_record_detail_str, num_waitlist_entries))); > > It's not translation-support-friendly. I think the message "waiting > for recovery conflict on %s" should be surrounded by _(). Or we can > just put it to ereport as follows: > > ereport(LOG, > (errmsg("waiting for recovery conflicts on %s", > get_procsignal_reason_desc(reason)) > ... changed in the new attached patch. > > --- > + oldcontext = MemoryContextSwitchTo(ErrorContext); > + econtext = error_context_stack; > + > + if (XLogRecGetBlockTag(econtext->arg, 0, &rnode, &forknum, &blknum)) > > I don't think it's a good idea to rely on error_context_stack because > other codes might set another error context before reaching here in > the future. right, changed in the new attached patch: this is now done in rm_redo_error_callback() and using the XLogReaderState passed as argument. > > --- > + if (XLogRecGetBlockTag(econtext->arg, 0, &rnode, &forknum, &blknum)) > + wal_record_detail_str = psprintf("WAL record received > at %s.\nTablespace/database/relation are %u/%u/%u, fork is %s and > block is %u.", > + receipt_time_str, > rnode.spcNode, rnode.dbNode, rnode.relNode, > + forkNames[forknum], > + blknum); > > There might be a block tag in block ids other than 0. right, fixed in the new attached patch. > I'm not sure the > errdetail is the best place where we display WAL information. moved to context in the new attached patch. > For > instance, we display both the relation oid and block number depending > on RM as follows: > > 2020-07-07 15:50:30.733 JST [13344] LOG: waiting for recovery conflict on lock > 2020-07-07 15:50:30.733 JST [13344] DETAIL: WAL record received at > 2020-07-07 15:50:27.73378+09. > There is 1 blocking connection(s). > 2020-07-07 15:50:30.733 JST [13344] CONTEXT: WAL redo at 0/3000028 > for Standby/LOCK: xid 506 db 13586 rel 16384 > > I wonder if we can display the details of redo WAL information by > improving xlog_outdesc() or rm_redo_error_callback() so that it > displays relfilenode, forknum, and block number. What do you think? I think that fully make sense to move this to rm_redo_error_callback(). This is where the information is now been displayed in the new attached patch. > > --- > + /* display wal record information */ > + if (log_recovery_conflicts) > + LogBlockedWalRecordInfo(num_waitlist_entries, reason); > > I'm concerned that we should log the recovery conflict information > every time when a recovery conflict happens from the perspective of > overheads and the amount of the logs. Can we logs that information > after waiting for deadlock_timeouts secs like log_lock_waits or > waiting for the fixed duration? The new attached patch is now waiting for deadlock_timeout duration. > > --- > @@ -609,6 +682,10 @@ StandbyTimeoutHandler(void) > /* forget any pending STANDBY_DEADLOCK_TIMEOUT request */ > disable_timeout(STANDBY_DEADLOCK_TIMEOUT, false); > > + /* display wal record information */ > + if (log_recovery_conflicts) > + LogBlockedWalRecordInfo(-1, > PROCSIG_RECOVERY_CONFLICT_BUFFERPIN); > + > SendRecoveryConflictWithBufferPin(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN); > } > > ResolveRecoveryConflictWithBufferPin() which sets a timer to call > StandbyTimeoutHandler() can be called multiple times even if the > recovery is waiting for one buffer pin. I think we should avoid > logging the same contents multiple times. I do agree, only the first pass is now been logged. > > --- > - > + { > + {"log_recovery_conflicts", PGC_SUSET, LOGGING_WHAT, > + gettext_noop("Logs standby recovery conflicts."), > + NULL > + }, > + &log_recovery_conflicts, > + true, > + NULL, NULL, NULL > + }, > > Other logging parameters such as log_lock_waits is false by default. I > think this parameter can also be false by default but is there any > reason to enable it by default? now set to false by default. > > Since this parameter applies only to the startup process, I think it > should be PGC_SIGHUP. changed that way. > > --- > + /* display wal record information */ > + if (log_recovery_conflicts) > + LogBlockedWalRecordInfo(CountDBBackends(dbid), > PROCSIG_RECOVERY_CONFLICT_DATABASE); > > We log the recovery conflict into the server log but we don't update > the process title to append "waiting". While discussing the process > title update on recovery conflict, I got the review comment[1] that we > don't need to update the process title because no wait occurs when > recovery conflict with database happens. As the comment says, recovery > is canceling the existing processes on the database being removed, but > not waiting for something. Ok, I keep reporting the conflict but changed the wording for this particular case. Thanks Bertrand
Attachment
On Sat, 11 Jul 2020 at 21:56, Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > > > On 7/7/20 9:43 AM, Masahiko Sawada wrote: > > Fully makes sense, the new patch version attached is now producing: > >> 2020-07-06 06:10:36.022 UTC [14035] LOG: waiting for recovery conflict > >> on snapshot > > How about adding the subject? that is, "recovery is waiting for > > recovery conflict on %s" or "recovery process <pid> is waiting for > > conflict on %s". > > > The subject is now added in the new attached patch (I did not include > the pid as it is part of the log prefix). > > It now looks like: > > 2020-07-11 12:00:41.092 UTC [23217] LOG: recovery is waiting for > recovery conflict on snapshot > 2020-07-11 12:00:41.092 UTC [23217] DETAIL: There is 1 blocking > connection(s). > 2020-07-11 12:00:41.092 UTC [23217] CONTEXT: WAL redo at 0/4A0A6BF0 for > Heap2/CLEAN: remxid 1128 > WAL record received at 2020-07-11 12:00:41.092231+00 > tbs 1663 db 13586 rel 16805, fork main, blkno 0 > > > >> 2020-07-06 06:10:36.022 UTC [14035] DETAIL: WAL record received at > >> 2020-07-06 06:10:36.021963+00. > >> Tablespace/database/relation are 1663/13586/16672, fork is main > >> and block is 0. > >> There is 1 blocking connection(s). > > To follow the existing log message, perhaps this can be something like > > "WAL record received at %s, rel %u/%u/%u, fork %s, blkno %u. %d > > processes". But I'm not sure the errdetail is the best place to > > display the WAL information as I mentioned in the latter part of this > > email. > > moved to the context and formatted the same way as the current > Standby/LOCK context. > > > > Ok. Let's keep this thread for the new attached patch that focus on the > >> recovery process waiting. > > Thank you for updating the patch! > > > > I've tested the latest patch. > > > Thank you for testing and reviewing! > > > > On recovery conflict on lock and on > > bufferpin, if max_standby_streaming_delay is disabled (set to -1), the > > logs don't appear even if log_recovery_conflicts is true. > > > Nice catch! it is fixed in the new attached patch (the log reporting has > been moved out of StandbyTimeoutHandler()). > > > > > Here is random comments on the code: > > > > + recovery_conflict_main_message = psprintf("waiting for > > recovery conflict on %s", > > + > > get_procsignal_reason_desc(reason)); > > : > > + ereport(LOG, > > + (errmsg("%s", recovery_conflict_main_message), > > + errdetail("%s\n" "There is %d blocking > > connection(s).", wal_record_detail_str, num_waitlist_entries))); > > > > It's not translation-support-friendly. I think the message "waiting > > for recovery conflict on %s" should be surrounded by _(). Or we can > > just put it to ereport as follows: > > > > ereport(LOG, > > (errmsg("waiting for recovery conflicts on %s", > > get_procsignal_reason_desc(reason)) > > ... > > > changed in the new attached patch. > > > > > > --- > > + oldcontext = MemoryContextSwitchTo(ErrorContext); > > + econtext = error_context_stack; > > + > > + if (XLogRecGetBlockTag(econtext->arg, 0, &rnode, &forknum, &blknum)) > > > > I don't think it's a good idea to rely on error_context_stack because > > other codes might set another error context before reaching here in > > the future. > > > right, changed in the new attached patch: this is now done in > rm_redo_error_callback() and using the XLogReaderState passed as argument. > > > > > > --- > > + if (XLogRecGetBlockTag(econtext->arg, 0, &rnode, &forknum, &blknum)) > > + wal_record_detail_str = psprintf("WAL record received > > at %s.\nTablespace/database/relation are %u/%u/%u, fork is %s and > > block is %u.", > > + receipt_time_str, > > rnode.spcNode, rnode.dbNode, rnode.relNode, > > + forkNames[forknum], > > + blknum); > > > > There might be a block tag in block ids other than 0. > > > right, fixed in the new attached patch. > > > > I'm not sure the > > errdetail is the best place where we display WAL information. > > > moved to context in the new attached patch. > > > > For > > instance, we display both the relation oid and block number depending > > on RM as follows: > > > > 2020-07-07 15:50:30.733 JST [13344] LOG: waiting for recovery conflict on lock > > 2020-07-07 15:50:30.733 JST [13344] DETAIL: WAL record received at > > 2020-07-07 15:50:27.73378+09. > > There is 1 blocking connection(s). > > 2020-07-07 15:50:30.733 JST [13344] CONTEXT: WAL redo at 0/3000028 > > for Standby/LOCK: xid 506 db 13586 rel 16384 > > > > I wonder if we can display the details of redo WAL information by > > improving xlog_outdesc() or rm_redo_error_callback() so that it > > displays relfilenode, forknum, and block number. What do you think? > > > I think that fully make sense to move this to rm_redo_error_callback(). > > This is where the information is now been displayed in the new attached > patch. > > > > > > --- > > + /* display wal record information */ > > + if (log_recovery_conflicts) > > + LogBlockedWalRecordInfo(num_waitlist_entries, reason); > > > > I'm concerned that we should log the recovery conflict information > > every time when a recovery conflict happens from the perspective of > > overheads and the amount of the logs. Can we logs that information > > after waiting for deadlock_timeouts secs like log_lock_waits or > > waiting for the fixed duration? > > > The new attached patch is now waiting for deadlock_timeout duration. > > > > > --- > > @@ -609,6 +682,10 @@ StandbyTimeoutHandler(void) > > /* forget any pending STANDBY_DEADLOCK_TIMEOUT request */ > > disable_timeout(STANDBY_DEADLOCK_TIMEOUT, false); > > > > + /* display wal record information */ > > + if (log_recovery_conflicts) > > + LogBlockedWalRecordInfo(-1, > > PROCSIG_RECOVERY_CONFLICT_BUFFERPIN); > > + > > SendRecoveryConflictWithBufferPin(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN); > > } > > > > ResolveRecoveryConflictWithBufferPin() which sets a timer to call > > StandbyTimeoutHandler() can be called multiple times even if the > > recovery is waiting for one buffer pin. I think we should avoid > > logging the same contents multiple times. > > > I do agree, only the first pass is now been logged. > > > > > > --- > > - > > + { > > + {"log_recovery_conflicts", PGC_SUSET, LOGGING_WHAT, > > + gettext_noop("Logs standby recovery conflicts."), > > + NULL > > + }, > > + &log_recovery_conflicts, > > + true, > > + NULL, NULL, NULL > > + }, > > > > Other logging parameters such as log_lock_waits is false by default. I > > think this parameter can also be false by default but is there any > > reason to enable it by default? > > > now set to false by default. > > > > > > Since this parameter applies only to the startup process, I think it > > should be PGC_SIGHUP. > > > changed that way. > > > > > > --- > > + /* display wal record information */ > > + if (log_recovery_conflicts) > > + LogBlockedWalRecordInfo(CountDBBackends(dbid), > > PROCSIG_RECOVERY_CONFLICT_DATABASE); > > > > We log the recovery conflict into the server log but we don't update > > the process title to append "waiting". While discussing the process > > title update on recovery conflict, I got the review comment[1] that we > > don't need to update the process title because no wait occurs when > > recovery conflict with database happens. As the comment says, recovery > > is canceling the existing processes on the database being removed, but > > not waiting for something. > > > Ok, I keep reporting the conflict but changed the wording for this > particular case. Thank you for updating the patch! Here is my comments on the latest version patch: + /* + * Ensure we are in the startup process + * if we want to log standby recovery conflicts + */ + if (AmStartupProcess() && log_recovery_conflicts) + { This function must be used by only the startup process. Not sure but if we need this check I think we should use an assertion rather than a condition in if statement. I think the information about relfilenode and forknumber is useful even in a normal recovery case. I think we can always add this information to errcontext. What do you think? + GetXLogReceiptTime(&rtime, &fromStream); + if (fromStream) + { + receipt_time_str = pstrdup(timestamptz_to_str(rtime)); + appendStringInfo(&buf,"\nWAL record received at %s", receipt_time_str); Not sure showing the receipt time of WAL record is useful for users in this case. IIUC that the receipt time doesn't correspond to individual WAL records whereas the errcontext information is about the particular WAL record. + for (block_id = 0; block_id <= record->max_block_id; block_id++) + { + if (XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blknum)) + appendStringInfo(&buf,"\ntbs %u db %u rel %u, fork %s, blkno %u", + rnode.spcNode, rnode.dbNode, rnode.relNode, + forkNames[forknum], + blknum); How about showing something like pg_waldump? CONTEXT: WAL redo at 0/3059100 for Heap2/CLEAN: remxid 506, blkref #0: rel 1000/20000/1234 fork main blk 10, blkref #1: rel 1000/20000/1234 fork main blk 11 or CONTEXT: WAL redo at 0/3059100 for Heap2/CLEAN: remxid 506 blkref #0: rel 1000/20000/1234 fork main blk 10 blkref #1: rel 1000/20000/1234 fork main blk 11 But the latter format makes grepping difficult. Also, I guess the changes in rm_redo_error_callback can also be a separate patch. +/* + * Display information about the wal record + * apply being blocked + */ +static void +LogBlockedWalRecordInfo(int num_waitlist_entries, ProcSignalReason reason) I think the function name needs to be updated. The function no longer logs WAL record information. +{ + if (num_waitlist_entries > 0) + if (reason == PROCSIG_RECOVERY_CONFLICT_DATABASE) + ereport(LOG, + (errmsg("recovery is experiencing recovery conflict on %s", get_procsignal_reason_desc(reason)), + errdetail("There is %d conflicting connection(s).", num_waitlist_entries))); + else + ereport(LOG, + (errmsg("recovery is waiting for recovery conflict on %s", get_procsignal_reason_desc(reason)), + errdetail("There is %d blocking connection(s).", num_waitlist_entries))); + else + ereport(LOG, + (errmsg("recovery is waiting for recovery conflict on %s", get_procsignal_reason_desc(reason)))); +} How about displaying actual virtual transaction ids or process ids the startup process is waiting for? For example, the log is going to be: LOG: recovery is waiting for recovery conflict on snapshot DETAIL: Conflicting virtual transaction ids: 100/101, 200/202, 300/303 or LOG: recovery is waiting for recovery conflict on snapshot DETAIL: Conflicting processes: 123, 456, 789 FYI, errdetail_plural() or errdetail_log_plural() can be used for pluralization. + tmpWaitlist = waitlist; + while (VirtualTransactionIdIsValid(*tmpWaitlist)) + { + tmpWaitlist++; + } + + num_waitlist_entries = (tmpWaitlist - waitlist); + + /* display wal record information */ + if (log_recovery_conflicts && (TimestampDifferenceExceeds(recovery_conflicts_log_time, GetCurrentTimestamp(), + DeadlockTimeout))) { + LogBlockedWalRecordInfo(num_waitlist_entries, reason); + recovery_conflicts_log_time = GetCurrentTimestamp(); + } recovery_conflicts_log_time is not initialized. And shouldn't we compare the current timestamp to the timestamp when the startup process started waiting? I think we should call LogBlockedWalRecordInfo() inside of the inner while loop rather than at the beginning of ResolveRecoveryConflictWithVirtualXIDs(). In lock conflict cases, the startup process waits until 'ltime', then enters ResolveRecoveryConflictWithVirtualXIDs() after reaching 'ltime'. Therefore, it makes sense to call LogBlockedWalRecordInfo() at the beginning of ResolveRecoveryConflictWithVirtualXIDs(). However, in snapshot and tablespace conflict cases (i.g. ResolveRecoveryConflictWithSnapshot() and ResolveRecoveryConflictWithTablespace()), it enters ResolveRecoveryConflictWithVirtualXIDs() without waits and waits for reaching ‘ltime’ inside of the inner while look. So the above condition could always be false. I wonder if we can have something like the following function: bool LogRecoveryConflict(VirtualTransactionId *waitlist, ProcSignalReason reason, TimestampTz wait_start) { if (!TimestampDifferenceExceeds(wait_start, GetCurrentTimestamp(), max_standby_streaming_delay)) return false; if (waitlist) { char *buf; buf = construct a string containing all process ids (or virtual transaction ids) to resolve from waitlist; ereport(LOG, (errmsg("recovery is waiting recovery conflict on %s", get_procsignal_reason_desc(reason)), (errdetail_log_plural("Conflicting process : %s.", "Conflicting processes : %s.", the number of processes in *waitlist, buf)))); } else ereport(LOG, (errmsg("recovery is resolving recovery conflict on %s", get_procsignal_reason_desc(reason)))); return true; } 'wait_start' is the timestamp when the caller started waiting for the recovery conflict. This function logs resolving recovery conflict with detailed information if needed. The caller call this function when log_recovery_conflicts is enabled as follows: bool logged = false; : if (log_recovery_conflicts && !logged) logged = LogRecoveryConflict(waitlist, reason, waitStart); + <varlistentry id="guc-log-recovery-conflicts" xreflabel="log_recovery_conflicts"> + <term><varname>log_recovery_conflicts</varname> (<type>boolean</type>) + <indexterm> + <primary><varname>log_recovery_conflicts</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Controls whether detailed information is produced when a conflict + occurs during standby recovery. The default is <literal>on</literal>. + Only superusers can change this setting. + </para> + </listitem> + </varlistentry> I think the documentation needs to be updated. IIUC this feature is to logging the startup process is waiting due to recovery conflict or waiting for recovery conflict resolution. But it logs only when the startup process waits for longer than a certain time. I think we can improve the documentation in terms of that point. Also, the last sentence is no longer true; log_recovery_conflicts is now PGC_SIGHUP. BTW 'log_recovery_conflict_waits' might be better name for consistency with log_lock_waits? Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, On 7/31/20 7:12 AM, Masahiko Sawada wrote: > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you canconfirm the sender and know the content is safe. > > > > On Sat, 11 Jul 2020 at 21:56, Drouvot, Bertrand <bdrouvot@amazon.com> wrote: >> >> On 7/7/20 9:43 AM, Masahiko Sawada wrote: >>> Fully makes sense, the new patch version attached is now producing: >>>> 2020-07-06 06:10:36.022 UTC [14035] LOG: waiting for recovery conflict >>>> on snapshot >>> How about adding the subject? that is, "recovery is waiting for >>> recovery conflict on %s" or "recovery process <pid> is waiting for >>> conflict on %s". >> >> The subject is now added in the new attached patch (I did not include >> the pid as it is part of the log prefix). >> >> It now looks like: >> >> 2020-07-11 12:00:41.092 UTC [23217] LOG: recovery is waiting for >> recovery conflict on snapshot >> 2020-07-11 12:00:41.092 UTC [23217] DETAIL: There is 1 blocking >> connection(s). >> 2020-07-11 12:00:41.092 UTC [23217] CONTEXT: WAL redo at 0/4A0A6BF0 for >> Heap2/CLEAN: remxid 1128 >> WAL record received at 2020-07-11 12:00:41.092231+00 >> tbs 1663 db 13586 rel 16805, fork main, blkno 0 >>>> 2020-07-06 06:10:36.022 UTC [14035] DETAIL: WAL record received at >>>> 2020-07-06 06:10:36.021963+00. >>>> Tablespace/database/relation are 1663/13586/16672, fork is main >>>> and block is 0. >>>> There is 1 blocking connection(s). >>> To follow the existing log message, perhaps this can be something like >>> "WAL record received at %s, rel %u/%u/%u, fork %s, blkno %u. %d >>> processes". But I'm not sure the errdetail is the best place to >>> display the WAL information as I mentioned in the latter part of this >>> email. >> moved to the context and formatted the same way as the current >> Standby/LOCK context. >> >> >>> Ok. Let's keep this thread for the new attached patch that focus on the >>>> recovery process waiting. >>> Thank you for updating the patch! >>> >>> I've tested the latest patch. >> >> Thank you for testing and reviewing! >> >> >>> On recovery conflict on lock and on >>> bufferpin, if max_standby_streaming_delay is disabled (set to -1), the >>> logs don't appear even if log_recovery_conflicts is true. >> >> Nice catch! it is fixed in the new attached patch (the log reporting has >> been moved out of StandbyTimeoutHandler()). >> >>> Here is random comments on the code: >>> >>> + recovery_conflict_main_message = psprintf("waiting for >>> recovery conflict on %s", >>> + >>> get_procsignal_reason_desc(reason)); >>> : >>> + ereport(LOG, >>> + (errmsg("%s", recovery_conflict_main_message), >>> + errdetail("%s\n" "There is %d blocking >>> connection(s).", wal_record_detail_str, num_waitlist_entries))); >>> >>> It's not translation-support-friendly. I think the message "waiting >>> for recovery conflict on %s" should be surrounded by _(). Or we can >>> just put it to ereport as follows: >>> >>> ereport(LOG, >>> (errmsg("waiting for recovery conflicts on %s", >>> get_procsignal_reason_desc(reason)) >>> ... >> >> changed in the new attached patch. >> >> >>> --- >>> + oldcontext = MemoryContextSwitchTo(ErrorContext); >>> + econtext = error_context_stack; >>> + >>> + if (XLogRecGetBlockTag(econtext->arg, 0, &rnode, &forknum, &blknum)) >>> >>> I don't think it's a good idea to rely on error_context_stack because >>> other codes might set another error context before reaching here in >>> the future. >> >> right, changed in the new attached patch: this is now done in >> rm_redo_error_callback() and using the XLogReaderState passed as argument. >> >> >>> --- >>> + if (XLogRecGetBlockTag(econtext->arg, 0, &rnode, &forknum, &blknum)) >>> + wal_record_detail_str = psprintf("WAL record received >>> at %s.\nTablespace/database/relation are %u/%u/%u, fork is %s and >>> block is %u.", >>> + receipt_time_str, >>> rnode.spcNode, rnode.dbNode, rnode.relNode, >>> + forkNames[forknum], >>> + blknum); >>> >>> There might be a block tag in block ids other than 0. >> >> right, fixed in the new attached patch. >> >> >>> I'm not sure the >>> errdetail is the best place where we display WAL information. >> >> moved to context in the new attached patch. >> >> >>> For >>> instance, we display both the relation oid and block number depending >>> on RM as follows: >>> >>> 2020-07-07 15:50:30.733 JST [13344] LOG: waiting for recovery conflict on lock >>> 2020-07-07 15:50:30.733 JST [13344] DETAIL: WAL record received at >>> 2020-07-07 15:50:27.73378+09. >>> There is 1 blocking connection(s). >>> 2020-07-07 15:50:30.733 JST [13344] CONTEXT: WAL redo at 0/3000028 >>> for Standby/LOCK: xid 506 db 13586 rel 16384 >>> >>> I wonder if we can display the details of redo WAL information by >>> improving xlog_outdesc() or rm_redo_error_callback() so that it >>> displays relfilenode, forknum, and block number. What do you think? >> >> I think that fully make sense to move this to rm_redo_error_callback(). >> >> This is where the information is now been displayed in the new attached >> patch. >> >> >>> --- >>> + /* display wal record information */ >>> + if (log_recovery_conflicts) >>> + LogBlockedWalRecordInfo(num_waitlist_entries, reason); >>> >>> I'm concerned that we should log the recovery conflict information >>> every time when a recovery conflict happens from the perspective of >>> overheads and the amount of the logs. Can we logs that information >>> after waiting for deadlock_timeouts secs like log_lock_waits or >>> waiting for the fixed duration? >> >> The new attached patch is now waiting for deadlock_timeout duration. >> >>> --- >>> @@ -609,6 +682,10 @@ StandbyTimeoutHandler(void) >>> /* forget any pending STANDBY_DEADLOCK_TIMEOUT request */ >>> disable_timeout(STANDBY_DEADLOCK_TIMEOUT, false); >>> >>> + /* display wal record information */ >>> + if (log_recovery_conflicts) >>> + LogBlockedWalRecordInfo(-1, >>> PROCSIG_RECOVERY_CONFLICT_BUFFERPIN); >>> + >>> SendRecoveryConflictWithBufferPin(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN); >>> } >>> >>> ResolveRecoveryConflictWithBufferPin() which sets a timer to call >>> StandbyTimeoutHandler() can be called multiple times even if the >>> recovery is waiting for one buffer pin. I think we should avoid >>> logging the same contents multiple times. >> >> I do agree, only the first pass is now been logged. >> >> >>> --- >>> - >>> + { >>> + {"log_recovery_conflicts", PGC_SUSET, LOGGING_WHAT, >>> + gettext_noop("Logs standby recovery conflicts."), >>> + NULL >>> + }, >>> + &log_recovery_conflicts, >>> + true, >>> + NULL, NULL, NULL >>> + }, >>> >>> Other logging parameters such as log_lock_waits is false by default. I >>> think this parameter can also be false by default but is there any >>> reason to enable it by default? >> >> now set to false by default. >> >> >>> Since this parameter applies only to the startup process, I think it >>> should be PGC_SIGHUP. >> >> changed that way. >> >> >>> --- >>> + /* display wal record information */ >>> + if (log_recovery_conflicts) >>> + LogBlockedWalRecordInfo(CountDBBackends(dbid), >>> PROCSIG_RECOVERY_CONFLICT_DATABASE); >>> >>> We log the recovery conflict into the server log but we don't update >>> the process title to append "waiting". While discussing the process >>> title update on recovery conflict, I got the review comment[1] that we >>> don't need to update the process title because no wait occurs when >>> recovery conflict with database happens. As the comment says, recovery >>> is canceling the existing processes on the database being removed, but >>> not waiting for something. >> >> Ok, I keep reporting the conflict but changed the wording for this >> particular case. > Thank you for updating the patch! Here is my comments on the latest > version patch: > > + /* > + * Ensure we are in the startup process > + * if we want to log standby recovery conflicts > + */ > + if (AmStartupProcess() && log_recovery_conflicts) > + { > > This function must be used by only the startup process. Not sure but > if we need this check I think we should use an assertion rather than a > condition in if statement. > > I think the information about relfilenode and forknumber is useful > even in a normal recovery case. I think we can always add this > information to errcontext. What do you think? Fully agree, that it could be useful outside the context of this patch. > > + GetXLogReceiptTime(&rtime, &fromStream); > + if (fromStream) > + { > + receipt_time_str = pstrdup(timestamptz_to_str(rtime)); > + appendStringInfo(&buf,"\nWAL record received at %s", > receipt_time_str); > > Not sure showing the receipt time of WAL record is useful for users in > this case. IIUC that the receipt time doesn't correspond to individual > WAL records whereas the errcontext information is about the particular > WAL record. > > + for (block_id = 0; block_id <= record->max_block_id; block_id++) > + { > + if (XLogRecGetBlockTag(record, block_id, &rnode, > &forknum, &blknum)) > + appendStringInfo(&buf,"\ntbs %u db %u rel %u, fork > %s, blkno %u", > + rnode.spcNode, rnode.dbNode, rnode.relNode, > + forkNames[forknum], > + blknum); > > How about showing something like pg_waldump? > > CONTEXT: WAL redo at 0/3059100 for Heap2/CLEAN: remxid 506, blkref > #0: rel 1000/20000/1234 fork main blk 10, blkref #1: rel > 1000/20000/1234 fork main blk 11 > > or > > CONTEXT: WAL redo at 0/3059100 for Heap2/CLEAN: remxid 506 > blkref #0: rel 1000/20000/1234 fork main blk 10 > blkref #1: rel 1000/20000/1234 fork main blk 11 > > But the latter format makes grepping difficult. > > Also, I guess the changes in rm_redo_error_callback can also be a > separate patch. I fully agree with your comments. As you have already seen I created a new patch dedicated to the rm_redo_error_callback() changes: https://commitfest.postgresql.org/29/2668/. Then those changes are not part of the new attached patch related to this thread anymore. > > +/* > + * Display information about the wal record > + * apply being blocked > + */ > +static void > +LogBlockedWalRecordInfo(int num_waitlist_entries, ProcSignalReason reason) > > I think the function name needs to be updated. The function no longer > logs WAL record information. right. > > +{ > + if (num_waitlist_entries > 0) > + if (reason == PROCSIG_RECOVERY_CONFLICT_DATABASE) > + ereport(LOG, > + (errmsg("recovery is experiencing recovery conflict on > %s", get_procsignal_reason_desc(reason)), > + errdetail("There is %d conflicting connection(s).", > num_waitlist_entries))); > + else > + ereport(LOG, > + (errmsg("recovery is waiting for recovery conflict on > %s", get_procsignal_reason_desc(reason)), > + errdetail("There is %d blocking connection(s).", > num_waitlist_entries))); > + else > + ereport(LOG, > + (errmsg("recovery is waiting for recovery conflict on %s", > get_procsignal_reason_desc(reason)))); > +} > > How about displaying actual virtual transaction ids or process ids the > startup process is waiting for? For example, the log is going to be: > > LOG: recovery is waiting for recovery conflict on snapshot > DETAIL: Conflicting virtual transaction ids: 100/101, 200/202, 300/303 The new attached patch adds the information related to the virtual transaction ids, so that the outcome is now: 2020-08-10 13:31:00.485 UTC [1760] LOG: recovery is resolving recovery conflict on snapshot 2020-08-10 13:31:00.485 UTC [1760] DETAIL: Conflicting virtual transaction ids: 4/2, 3/2. > > or > > LOG: recovery is waiting for recovery conflict on snapshot > DETAIL: Conflicting processes: 123, 456, 789 I think it makes more sense to display the conflicting virtual transaction id(s), as the blocking process(es) will be known during the statement cancellation. > > FYI, errdetail_plural() or errdetail_log_plural() can be used for pluralization. > > + tmpWaitlist = waitlist; > + while (VirtualTransactionIdIsValid(*tmpWaitlist)) > + { > + tmpWaitlist++; > + } > + > + num_waitlist_entries = (tmpWaitlist - waitlist); > + > + /* display wal record information */ > + if (log_recovery_conflicts && > (TimestampDifferenceExceeds(recovery_conflicts_log_time, > GetCurrentTimestamp(), > + DeadlockTimeout))) { > + LogBlockedWalRecordInfo(num_waitlist_entries, reason); > + recovery_conflicts_log_time = GetCurrentTimestamp(); > + } > > recovery_conflicts_log_time is not initialized. And shouldn't we > compare the current timestamp to the timestamp when the startup > process started waiting? > > I think we should call LogBlockedWalRecordInfo() inside of the inner > while loop rather than at the beginning of > ResolveRecoveryConflictWithVirtualXIDs(). In lock conflict cases, the > startup process waits until 'ltime', then enters > ResolveRecoveryConflictWithVirtualXIDs() after reaching 'ltime'. > Therefore, it makes sense to call LogBlockedWalRecordInfo() at the > beginning of ResolveRecoveryConflictWithVirtualXIDs(). However, in > snapshot and tablespace conflict cases (i.g. > ResolveRecoveryConflictWithSnapshot() and > ResolveRecoveryConflictWithTablespace()), it enters > ResolveRecoveryConflictWithVirtualXIDs() without waits and waits for > reaching ‘ltime’ inside of the inner while look. So the above > condition could always be false. That would make the information being displayed after max_standby_streaming_delay is reached for the multiple cases you just described. It indeed makes more sense to call the function in the inside of the inner loop: this is done in the new attached patch. > > I wonder if we can have something like the following function: > > bool > LogRecoveryConflict(VirtualTransactionId *waitlist, ProcSignalReason reason, > TimestampTz wait_start) > { > if (!TimestampDifferenceExceeds(wait_start, GetCurrentTimestamp(), > > max_standby_streaming_delay)) > return false; > > if (waitlist) > { > char *buf; > > buf = construct a string containing all process ids (or > virtual transaction ids) to resolve from waitlist; > > ereport(LOG, > (errmsg("recovery is waiting recovery conflict on %s", > get_procsignal_reason_desc(reason)), > (errdetail_log_plural("Conflicting process : %s.", > "Conflicting processes : %s.", > the number of processes in *waitlist, buf)))); > } > else > ereport(LOG, > (errmsg("recovery is resolving recovery conflict on %s", > get_procsignal_reason_desc(reason)))); > > return true; > } > > 'wait_start' is the timestamp when the caller started waiting for the > recovery conflict. This function logs resolving recovery conflict with > detailed information if needed. The caller call this function when > log_recovery_conflicts is enabled as follows: > > bool logged = false; > : > if (log_recovery_conflicts && !logged) > logged = LogRecoveryConflict(waitlist, reason, waitStart); It's more or less what the new attached patch is doing, except that it deals with the waitStart comparison in the while loop (to make the changes to the LogRecoveryConflict() calls outside ResolveRecoveryConflictWithVirtualXIDs() simpler). For the buffer pin case, I moved the call to LogRecoveryConflict to StandbyTimeoutHandler (instead of the beginning of ResolveRecoveryConflictWithBufferPin). That way we are consistent across all the cases: we are logging once the conflict resolution is actually happening (means we are ready/about to cancel what is needed). Then, the LogRecoveryConflict() is now not reporting that "recovery is waiting recovery conflict" anymore but that "recovery is resolving recovery conflict" in all the cases. > > + <varlistentry id="guc-log-recovery-conflicts" > xreflabel="log_recovery_conflicts"> > + <term><varname>log_recovery_conflicts</varname> (<type>boolean</type>) > + <indexterm> > + <primary><varname>log_recovery_conflicts</varname> > configuration parameter</primary> > + </indexterm> > + </term> > + <listitem> > + <para> > + Controls whether detailed information is produced when a conflict > + occurs during standby recovery. The default is <literal>on</literal>. > + Only superusers can change this setting. > + </para> > + </listitem> > + </varlistentry> > > I think the documentation needs to be updated. Right, updated that way in the new attached patch. > > IIUC this feature is to logging the startup process is waiting due to > recovery conflict or waiting for recovery conflict resolution. But it > logs only when the startup process waits for longer than a certain > time. I think we can improve the documentation in terms of that point. > Also, the last sentence is no longer true; log_recovery_conflicts is > now PGC_SIGHUP. > > BTW 'log_recovery_conflict_waits' might be better name for consistency > with log_lock_waits? As per the changes in this new patch, we are not logging when the startup process starts waiting anymore, but only during the conflict resolution. I changed the GUC to "log_recovery_conflicts_resolution" instead (based on my comment above). Thanks, Bertrand
Attachment
On Mon, 10 Aug 2020 at 23:43, Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > > Hi, > > On 7/31/20 7:12 AM, Masahiko Sawada wrote: > > + tmpWaitlist = waitlist; > > + while (VirtualTransactionIdIsValid(*tmpWaitlist)) > > + { > > + tmpWaitlist++; > > + } > > + > > + num_waitlist_entries = (tmpWaitlist - waitlist); > > + > > + /* display wal record information */ > > + if (log_recovery_conflicts && > > (TimestampDifferenceExceeds(recovery_conflicts_log_time, > > GetCurrentTimestamp(), > > + DeadlockTimeout))) { > > + LogBlockedWalRecordInfo(num_waitlist_entries, reason); > > + recovery_conflicts_log_time = GetCurrentTimestamp(); > > + } > > > > recovery_conflicts_log_time is not initialized. And shouldn't we > > compare the current timestamp to the timestamp when the startup > > process started waiting? > > > > I think we should call LogBlockedWalRecordInfo() inside of the inner > > while loop rather than at the beginning of > > ResolveRecoveryConflictWithVirtualXIDs(). In lock conflict cases, the > > startup process waits until 'ltime', then enters > > ResolveRecoveryConflictWithVirtualXIDs() after reaching 'ltime'. > > Therefore, it makes sense to call LogBlockedWalRecordInfo() at the > > beginning of ResolveRecoveryConflictWithVirtualXIDs(). However, in > > snapshot and tablespace conflict cases (i.g. > > ResolveRecoveryConflictWithSnapshot() and > > ResolveRecoveryConflictWithTablespace()), it enters > > ResolveRecoveryConflictWithVirtualXIDs() without waits and waits for > > reaching ‘ltime’ inside of the inner while look. So the above > > condition could always be false. > > That would make the information being displayed after > max_standby_streaming_delay is reached for the multiple cases you just > described. Sorry, it should be deadlock_timeout, not max_standby_streaming_delay. Otherwise, the recovery conflict log message is printed when resolution, which seems not to achieve the original purpose. Am I missing something? Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 8/27/20 10:16 AM, Masahiko Sawada wrote > > On Mon, 10 Aug 2020 at 23:43, Drouvot, Bertrand <bdrouvot@amazon.com> wrote: >> Hi, >> >> On 7/31/20 7:12 AM, Masahiko Sawada wrote: >>> + tmpWaitlist = waitlist; >>> + while (VirtualTransactionIdIsValid(*tmpWaitlist)) >>> + { >>> + tmpWaitlist++; >>> + } >>> + >>> + num_waitlist_entries = (tmpWaitlist - waitlist); >>> + >>> + /* display wal record information */ >>> + if (log_recovery_conflicts && >>> (TimestampDifferenceExceeds(recovery_conflicts_log_time, >>> GetCurrentTimestamp(), >>> + DeadlockTimeout))) { >>> + LogBlockedWalRecordInfo(num_waitlist_entries, reason); >>> + recovery_conflicts_log_time = GetCurrentTimestamp(); >>> + } >>> >>> recovery_conflicts_log_time is not initialized. And shouldn't we >>> compare the current timestamp to the timestamp when the startup >>> process started waiting? >>> >>> I think we should call LogBlockedWalRecordInfo() inside of the inner >>> while loop rather than at the beginning of >>> ResolveRecoveryConflictWithVirtualXIDs(). In lock conflict cases, the >>> startup process waits until 'ltime', then enters >>> ResolveRecoveryConflictWithVirtualXIDs() after reaching 'ltime'. >>> Therefore, it makes sense to call LogBlockedWalRecordInfo() at the >>> beginning of ResolveRecoveryConflictWithVirtualXIDs(). However, in >>> snapshot and tablespace conflict cases (i.g. >>> ResolveRecoveryConflictWithSnapshot() and >>> ResolveRecoveryConflictWithTablespace()), it enters >>> ResolveRecoveryConflictWithVirtualXIDs() without waits and waits for >>> reaching ‘ltime’ inside of the inner while look. So the above >>> condition could always be false. >> That would make the information being displayed after >> max_standby_streaming_delay is reached for the multiple cases you just >> described. > Sorry, it should be deadlock_timeout, not max_standby_streaming_delay. > Otherwise, the recovery conflict log message is printed when > resolution, which seems not to achieve the original purpose. Am I > missing something? Ok, I understand where the confusion is coming from. Indeed the new version is now printing the recovery conflict log message during the conflict resolution (while the initial intention was to be warned as soon as the replay had to wait). The advantage of the new version is that it would be consistent across all the conflicts scenarios (if not, we would get messages during the resolution or when the replay started waiting, depending of the conflict scenario). On the other hand, the cons of the new version is that we would miss messages when no resolution is needed (replay wait duration < max_standby_streaming_delay), but is that really annoying? (As no cancellation would occur) Thinking about it, i like the new version (being warned during the resolution) as we would get messages only when cancelation will occur (which is what the user might want to avoid, so the extra info would be useful). What do you think? Bertrand
On Thu, 27 Aug 2020 at 20:58, Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > > > On 8/27/20 10:16 AM, Masahiko Sawada wrote > > > > On Mon, 10 Aug 2020 at 23:43, Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > >> Hi, > >> > >> On 7/31/20 7:12 AM, Masahiko Sawada wrote: > >>> + tmpWaitlist = waitlist; > >>> + while (VirtualTransactionIdIsValid(*tmpWaitlist)) > >>> + { > >>> + tmpWaitlist++; > >>> + } > >>> + > >>> + num_waitlist_entries = (tmpWaitlist - waitlist); > >>> + > >>> + /* display wal record information */ > >>> + if (log_recovery_conflicts && > >>> (TimestampDifferenceExceeds(recovery_conflicts_log_time, > >>> GetCurrentTimestamp(), > >>> + DeadlockTimeout))) { > >>> + LogBlockedWalRecordInfo(num_waitlist_entries, reason); > >>> + recovery_conflicts_log_time = GetCurrentTimestamp(); > >>> + } > >>> > >>> recovery_conflicts_log_time is not initialized. And shouldn't we > >>> compare the current timestamp to the timestamp when the startup > >>> process started waiting? > >>> > >>> I think we should call LogBlockedWalRecordInfo() inside of the inner > >>> while loop rather than at the beginning of > >>> ResolveRecoveryConflictWithVirtualXIDs(). In lock conflict cases, the > >>> startup process waits until 'ltime', then enters > >>> ResolveRecoveryConflictWithVirtualXIDs() after reaching 'ltime'. > >>> Therefore, it makes sense to call LogBlockedWalRecordInfo() at the > >>> beginning of ResolveRecoveryConflictWithVirtualXIDs(). However, in > >>> snapshot and tablespace conflict cases (i.g. > >>> ResolveRecoveryConflictWithSnapshot() and > >>> ResolveRecoveryConflictWithTablespace()), it enters > >>> ResolveRecoveryConflictWithVirtualXIDs() without waits and waits for > >>> reaching ‘ltime’ inside of the inner while look. So the above > >>> condition could always be false. > >> That would make the information being displayed after > >> max_standby_streaming_delay is reached for the multiple cases you just > >> described. > > Sorry, it should be deadlock_timeout, not max_standby_streaming_delay. > > Otherwise, the recovery conflict log message is printed when > > resolution, which seems not to achieve the original purpose. Am I > > missing something? > > Ok, I understand where the confusion is coming from. > > Indeed the new version is now printing the recovery conflict log message > during the conflict resolution (while the initial intention was to be > warned as soon as the replay had to wait). > > The advantage of the new version is that it would be consistent across > all the conflicts scenarios (if not, we would get messages during the > resolution or when the replay started waiting, depending of the conflict > scenario). > > On the other hand, the cons of the new version is that we would miss > messages when no resolution is needed (replay wait duration < > max_standby_streaming_delay), but is that really annoying? (As no > cancellation would occur) > > Thinking about it, i like the new version (being warned during the > resolution) as we would get messages only when cancelation will occur > (which is what the user might want to avoid, so the extra info would be > useful). > > What do you think? Hmm, I think we print the reason why backends are canceled even of as now by ProcessInterrupts(). With this patch and related patches you proposed on another thread, the startup process reports virtual xids being interrupted, the reason, and LSN of blocked WAL, then processes will also report its virtual xid and reason. Therefore, the new information added by these patches is only the LSN of blocked WAL. Also, the blocked WAL would be unblocked just after the startup process reports the resolution message. What use cases are you assuming? Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 8/28/20 7:03 AM, Masahiko Sawada wrote: > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you canconfirm the sender and know the content is safe. > > > > On Thu, 27 Aug 2020 at 20:58, Drouvot, Bertrand <bdrouvot@amazon.com> wrote: >> >> On 8/27/20 10:16 AM, Masahiko Sawada wrote >>> On Mon, 10 Aug 2020 at 23:43, Drouvot, Bertrand <bdrouvot@amazon.com> wrote: >>>> Hi, >>>> >>>> On 7/31/20 7:12 AM, Masahiko Sawada wrote: >>>>> + tmpWaitlist = waitlist; >>>>> + while (VirtualTransactionIdIsValid(*tmpWaitlist)) >>>>> + { >>>>> + tmpWaitlist++; >>>>> + } >>>>> + >>>>> + num_waitlist_entries = (tmpWaitlist - waitlist); >>>>> + >>>>> + /* display wal record information */ >>>>> + if (log_recovery_conflicts && >>>>> (TimestampDifferenceExceeds(recovery_conflicts_log_time, >>>>> GetCurrentTimestamp(), >>>>> + DeadlockTimeout))) { >>>>> + LogBlockedWalRecordInfo(num_waitlist_entries, reason); >>>>> + recovery_conflicts_log_time = GetCurrentTimestamp(); >>>>> + } >>>>> >>>>> recovery_conflicts_log_time is not initialized. And shouldn't we >>>>> compare the current timestamp to the timestamp when the startup >>>>> process started waiting? >>>>> >>>>> I think we should call LogBlockedWalRecordInfo() inside of the inner >>>>> while loop rather than at the beginning of >>>>> ResolveRecoveryConflictWithVirtualXIDs(). In lock conflict cases, the >>>>> startup process waits until 'ltime', then enters >>>>> ResolveRecoveryConflictWithVirtualXIDs() after reaching 'ltime'. >>>>> Therefore, it makes sense to call LogBlockedWalRecordInfo() at the >>>>> beginning of ResolveRecoveryConflictWithVirtualXIDs(). However, in >>>>> snapshot and tablespace conflict cases (i.g. >>>>> ResolveRecoveryConflictWithSnapshot() and >>>>> ResolveRecoveryConflictWithTablespace()), it enters >>>>> ResolveRecoveryConflictWithVirtualXIDs() without waits and waits for >>>>> reaching ‘ltime’ inside of the inner while look. So the above >>>>> condition could always be false. >>>> That would make the information being displayed after >>>> max_standby_streaming_delay is reached for the multiple cases you just >>>> described. >>> Sorry, it should be deadlock_timeout, not max_standby_streaming_delay. >>> Otherwise, the recovery conflict log message is printed when >>> resolution, which seems not to achieve the original purpose. Am I >>> missing something? >> Ok, I understand where the confusion is coming from. >> >> Indeed the new version is now printing the recovery conflict log message >> during the conflict resolution (while the initial intention was to be >> warned as soon as the replay had to wait). >> >> The advantage of the new version is that it would be consistent across >> all the conflicts scenarios (if not, we would get messages during the >> resolution or when the replay started waiting, depending of the conflict >> scenario). >> >> On the other hand, the cons of the new version is that we would miss >> messages when no resolution is needed (replay wait duration < >> max_standby_streaming_delay), but is that really annoying? (As no >> cancellation would occur) >> >> Thinking about it, i like the new version (being warned during the >> resolution) as we would get messages only when cancelation will occur >> (which is what the user might want to avoid, so the extra info would be >> useful). >> >> What do you think? > Hmm, I think we print the reason why backends are canceled even of as > now by ProcessInterrupts(). With this patch and related patches you > proposed on another thread, the startup process reports virtual xids > being interrupted, the reason, and LSN of blocked WAL, then processes > will also report its virtual xid and reason. Therefore, the new > information added by these patches is only the LSN of blocked WAL. That's completely right, let's come back to the original intention of this patch (means, don't wait for the conflict resolution to log messages). I'll submit a new version once updated. Thanks! Bertrand
On 8/28/20 4:14 PM, Drouvot, Bertrand wrote: > > On 8/28/20 7:03 AM, Masahiko Sawada wrote: >> CAUTION: This email originated from outside of the organization. Do >> not click links or open attachments unless you can confirm the sender >> and know the content is safe. >> >> >> >> On Thu, 27 Aug 2020 at 20:58, Drouvot, Bertrand <bdrouvot@amazon.com> >> wrote: >>> >>> On 8/27/20 10:16 AM, Masahiko Sawada wrote >>>> On Mon, 10 Aug 2020 at 23:43, Drouvot, Bertrand >>>> <bdrouvot@amazon.com> wrote: >>>>> Hi, >>>>> >>>>> On 7/31/20 7:12 AM, Masahiko Sawada wrote: >>>>>> + tmpWaitlist = waitlist; >>>>>> + while (VirtualTransactionIdIsValid(*tmpWaitlist)) >>>>>> + { >>>>>> + tmpWaitlist++; >>>>>> + } >>>>>> + >>>>>> + num_waitlist_entries = (tmpWaitlist - waitlist); >>>>>> + >>>>>> + /* display wal record information */ >>>>>> + if (log_recovery_conflicts && >>>>>> (TimestampDifferenceExceeds(recovery_conflicts_log_time, >>>>>> GetCurrentTimestamp(), >>>>>> + DeadlockTimeout))) { >>>>>> + LogBlockedWalRecordInfo(num_waitlist_entries, reason); >>>>>> + recovery_conflicts_log_time = GetCurrentTimestamp(); >>>>>> + } >>>>>> >>>>>> recovery_conflicts_log_time is not initialized. And shouldn't we >>>>>> compare the current timestamp to the timestamp when the startup >>>>>> process started waiting? >>>>>> >>>>>> I think we should call LogBlockedWalRecordInfo() inside of the inner >>>>>> while loop rather than at the beginning of >>>>>> ResolveRecoveryConflictWithVirtualXIDs(). In lock conflict cases, >>>>>> the >>>>>> startup process waits until 'ltime', then enters >>>>>> ResolveRecoveryConflictWithVirtualXIDs() after reaching 'ltime'. >>>>>> Therefore, it makes sense to call LogBlockedWalRecordInfo() at the >>>>>> beginning of ResolveRecoveryConflictWithVirtualXIDs(). However, in >>>>>> snapshot and tablespace conflict cases (i.g. >>>>>> ResolveRecoveryConflictWithSnapshot() and >>>>>> ResolveRecoveryConflictWithTablespace()), it enters >>>>>> ResolveRecoveryConflictWithVirtualXIDs() without waits and waits for >>>>>> reaching ‘ltime’ inside of the inner while look. So the above >>>>>> condition could always be false. >>>>> That would make the information being displayed after >>>>> max_standby_streaming_delay is reached for the multiple cases you >>>>> just >>>>> described. >>>> Sorry, it should be deadlock_timeout, not max_standby_streaming_delay. >>>> Otherwise, the recovery conflict log message is printed when >>>> resolution, which seems not to achieve the original purpose. Am I >>>> missing something? >>> Ok, I understand where the confusion is coming from. >>> >>> Indeed the new version is now printing the recovery conflict log >>> message >>> during the conflict resolution (while the initial intention was to be >>> warned as soon as the replay had to wait). >>> >>> The advantage of the new version is that it would be consistent across >>> all the conflicts scenarios (if not, we would get messages during the >>> resolution or when the replay started waiting, depending of the >>> conflict >>> scenario). >>> >>> On the other hand, the cons of the new version is that we would miss >>> messages when no resolution is needed (replay wait duration < >>> max_standby_streaming_delay), but is that really annoying? (As no >>> cancellation would occur) >>> >>> Thinking about it, i like the new version (being warned during the >>> resolution) as we would get messages only when cancelation will occur >>> (which is what the user might want to avoid, so the extra info would be >>> useful). >>> >>> What do you think? >> Hmm, I think we print the reason why backends are canceled even of as >> now by ProcessInterrupts(). With this patch and related patches you >> proposed on another thread, the startup process reports virtual xids >> being interrupted, the reason, and LSN of blocked WAL, then processes >> will also report its virtual xid and reason. Therefore, the new >> information added by these patches is only the LSN of blocked WAL. > > That's completely right, let's come back to the original intention of > this patch (means, don't wait for the conflict resolution to log > messages). > > I'll submit a new version once updated. Please find attached the new patch. It provides the following outcomes depending on the conflict: 2020-10-04 09:08:51.923 UTC [30788] LOG: recovery is waiting recovery conflict on buffer pin OR 2020-10-04 09:52:25.832 UTC [1249] LOG: recovery is waiting recovery conflict on snapshot 2020-10-04 09:52:25.832 UTC [1249] DETAIL: Conflicting virtual transaction ids: 3/2, 2/4. OR 2020-10-04 09:11:51.717 UTC [30788] LOG: recovery is waiting recovery conflict on lock 2020-10-04 09:11:51.717 UTC [30788] DETAIL: Conflicting virtual transaction id: 2/5. OR 2020-10-04 09:13:04.104 UTC [30788] LOG: recovery is resolving recovery conflict on database Thanks Bertrand
Attachment
> +extern const char *get_procsignal_reason_desc(ProcSignalReason reason) > + { > + const char *reasonDesc = "unknown reason"; > + > + switch (reason) > + { > + case PROCSIG_RECOVERY_CONFLICT_BUFFERPIN: > + reasonDesc = "buffer pin"; > + break; It doesn't work to construct sentences from pieces, for translatability reasons. Maybe you can return the whole errmsg sentence from this routine instead.
Hi, On 10/4/20 4:10 PM, Alvaro Herrera wrote >> +extern const char *get_procsignal_reason_desc(ProcSignalReason reason) >> + { >> + const char *reasonDesc = "unknown reason"; >> + >> + switch (reason) >> + { >> + case PROCSIG_RECOVERY_CONFLICT_BUFFERPIN: >> + reasonDesc = "buffer pin"; >> + break; > It doesn't work to construct sentences from pieces, for translatability > reasons. Maybe you can return the whole errmsg sentence from this > routine instead. > Thanks for the feedback! Enclosed a new version that take care of it. Bertrand
Attachment
On Sun, 4 Oct 2020 at 20:48, Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > > > On 8/28/20 4:14 PM, Drouvot, Bertrand wrote: > > > > On 8/28/20 7:03 AM, Masahiko Sawada wrote: > >> CAUTION: This email originated from outside of the organization. Do > >> not click links or open attachments unless you can confirm the sender > >> and know the content is safe. > >> > >> > >> > >> On Thu, 27 Aug 2020 at 20:58, Drouvot, Bertrand <bdrouvot@amazon.com> > >> wrote: > >>> > >>> On 8/27/20 10:16 AM, Masahiko Sawada wrote > >>>> On Mon, 10 Aug 2020 at 23:43, Drouvot, Bertrand > >>>> <bdrouvot@amazon.com> wrote: > >>>>> Hi, > >>>>> > >>>>> On 7/31/20 7:12 AM, Masahiko Sawada wrote: > >>>>>> + tmpWaitlist = waitlist; > >>>>>> + while (VirtualTransactionIdIsValid(*tmpWaitlist)) > >>>>>> + { > >>>>>> + tmpWaitlist++; > >>>>>> + } > >>>>>> + > >>>>>> + num_waitlist_entries = (tmpWaitlist - waitlist); > >>>>>> + > >>>>>> + /* display wal record information */ > >>>>>> + if (log_recovery_conflicts && > >>>>>> (TimestampDifferenceExceeds(recovery_conflicts_log_time, > >>>>>> GetCurrentTimestamp(), > >>>>>> + DeadlockTimeout))) { > >>>>>> + LogBlockedWalRecordInfo(num_waitlist_entries, reason); > >>>>>> + recovery_conflicts_log_time = GetCurrentTimestamp(); > >>>>>> + } > >>>>>> > >>>>>> recovery_conflicts_log_time is not initialized. And shouldn't we > >>>>>> compare the current timestamp to the timestamp when the startup > >>>>>> process started waiting? > >>>>>> > >>>>>> I think we should call LogBlockedWalRecordInfo() inside of the inner > >>>>>> while loop rather than at the beginning of > >>>>>> ResolveRecoveryConflictWithVirtualXIDs(). In lock conflict cases, > >>>>>> the > >>>>>> startup process waits until 'ltime', then enters > >>>>>> ResolveRecoveryConflictWithVirtualXIDs() after reaching 'ltime'. > >>>>>> Therefore, it makes sense to call LogBlockedWalRecordInfo() at the > >>>>>> beginning of ResolveRecoveryConflictWithVirtualXIDs(). However, in > >>>>>> snapshot and tablespace conflict cases (i.g. > >>>>>> ResolveRecoveryConflictWithSnapshot() and > >>>>>> ResolveRecoveryConflictWithTablespace()), it enters > >>>>>> ResolveRecoveryConflictWithVirtualXIDs() without waits and waits for > >>>>>> reaching ‘ltime’ inside of the inner while look. So the above > >>>>>> condition could always be false. > >>>>> That would make the information being displayed after > >>>>> max_standby_streaming_delay is reached for the multiple cases you > >>>>> just > >>>>> described. > >>>> Sorry, it should be deadlock_timeout, not max_standby_streaming_delay. > >>>> Otherwise, the recovery conflict log message is printed when > >>>> resolution, which seems not to achieve the original purpose. Am I > >>>> missing something? > >>> Ok, I understand where the confusion is coming from. > >>> > >>> Indeed the new version is now printing the recovery conflict log > >>> message > >>> during the conflict resolution (while the initial intention was to be > >>> warned as soon as the replay had to wait). > >>> > >>> The advantage of the new version is that it would be consistent across > >>> all the conflicts scenarios (if not, we would get messages during the > >>> resolution or when the replay started waiting, depending of the > >>> conflict > >>> scenario). > >>> > >>> On the other hand, the cons of the new version is that we would miss > >>> messages when no resolution is needed (replay wait duration < > >>> max_standby_streaming_delay), but is that really annoying? (As no > >>> cancellation would occur) > >>> > >>> Thinking about it, i like the new version (being warned during the > >>> resolution) as we would get messages only when cancelation will occur > >>> (which is what the user might want to avoid, so the extra info would be > >>> useful). > >>> > >>> What do you think? > >> Hmm, I think we print the reason why backends are canceled even of as > >> now by ProcessInterrupts(). With this patch and related patches you > >> proposed on another thread, the startup process reports virtual xids > >> being interrupted, the reason, and LSN of blocked WAL, then processes > >> will also report its virtual xid and reason. Therefore, the new > >> information added by these patches is only the LSN of blocked WAL. > > > > That's completely right, let's come back to the original intention of > > this patch (means, don't wait for the conflict resolution to log > > messages). > > > > I'll submit a new version once updated. > > Please find attached the new patch. > > It provides the following outcomes depending on the conflict: > > 2020-10-04 09:08:51.923 UTC [30788] LOG: recovery is waiting recovery > conflict on buffer pin > > OR > > 2020-10-04 09:52:25.832 UTC [1249] LOG: recovery is waiting recovery > conflict on snapshot > 2020-10-04 09:52:25.832 UTC [1249] DETAIL: Conflicting virtual > transaction ids: 3/2, 2/4. > > OR > > 2020-10-04 09:11:51.717 UTC [30788] LOG: recovery is waiting recovery > conflict on lock > 2020-10-04 09:11:51.717 UTC [30788] DETAIL: Conflicting virtual > transaction id: 2/5. > > OR > > 2020-10-04 09:13:04.104 UTC [30788] LOG: recovery is resolving recovery > conflict on database > Sorry I didn't realize I failed to send review comments. Here is my comments on the previous version (v4) patch (attached to [1]): --- @ -3828,6 +3830,8 @@ LockBufferForCleanup(Buffer buffer) GetPrivateRefCount(buffer)); bufHdr = GetBufferDescriptor(buffer - 1); waitStart = GetCurrentTimestamp(); logged = false; I think it's better to avoid calling GetCurrentTimestamp() when the timestamp is not used. In the above case, we can call it only when log_recovery_conflict_waits is true. --- +void +LogRecoveryConflict(VirtualTransactionId *waitlist, ProcSignalReason reason, bool waiting) It seems to me that having 'reason' at the first argument is natural. --- + /* Log the recovery conflict */ + if (log_recovery_conflict_waits) + LogRecoveryConflict(NULL, PROCSIG_RECOVERY_CONFLICT_DATABASE, false); There seems no need to log recovery conflicts because no wait occurs when recovery conflict with database happens. --- if (InHotStandby) { /* Set a timer and wait for that or for the Lock to be granted */ + VirtualTransactionId *backends; + backends = GetLockConflicts(&locallock->tag.lock, AccessExclusiveLock, NULL); + if (log_recovery_conflict_waits && !logged) { + LogRecoveryConflict(backends, PROCSIG_RECOVERY_CONFLICT_LOCK, true); + logged = true; + } + ResolveRecoveryConflictWithLock(locallock->tag.lock); } We log the recovery conflicts without checking deadlock timeout here. I think we should call LogRecoveryConflict() after ResolveRecoveryConflictWithLock() with the check if we're waiting for longer than deadlock_time sec. That way, we can also remove the following description in the doc, which seems more appropriate behavior: + For the lock conflict case, it does not wait for <varname>deadlock_timeout</varname> to be reached. Also, in ResolveRecoveryConflictWithVirtualXIDs() called from ResolveRecoveryConflictWithLock(), we could log the same again here: + /* Log the recovery conflict */ + if (log_recovery_conflict_waits && !logged + && TimestampDifferenceExceeds(waitStart, + GetCurrentTimestamp(), DeadlockTimeout)) + { + LogRecoveryConflict(waitlist, reason, report_waiting); + logged = true; + } Note that since the 'report_waiting' is false in this case, waitStart is not initialized. I think we can use 'report_waiting' for this purpose too. The 'report_waiting' is false in the lock conflict cases where we also want to skip logging a recovery conflict in ResolveRecoveryConflictWithVirtualXIDs(). --- @@ -1069,6 +1069,7 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable) PGPROC *proc; PGPROC *leader = MyProc->lockGroupLeader; int i; + bool logged; I guess it's better to rename 'logged' to 'logged_recovery_confclit' or other to make the purpose clear. This function logs not only recovery conflicts but also deadlock and waits for lock. As an alternative idea, possibly we can allow it to report the log multiple times with how long waiting for. We pass the time when the wait started to LogRecoveryConflict() and show something like "recovery is still waiting recovery conflict on buffer pin after 100.0 ms". But we can also leave it as the next improvement. --- +extern const char *get_procsignal_reason_desc(ProcSignalReason reason) + { + const char *reasonDesc = "unknown reason"; + I think we can move this function to standby.c. --- The patch needs to run pgindent. For instance, the following change doesn't follow the coding style: + + if (waitlist && waiting) { + vxids = waitlist; + count = 0; + initStringInfo(&buf); --- Currently, we report the list of the conflicted virtual transaction ids but perhaps showing process ids instead seems better. What do you think? I've attached the patch as an idea of fixing the above comments as well as the comment from Alvaro. I can be applied on top of v4 patch. [1] https://www.postgresql.org/message-id/29da248a-e21c-a3eb-a051-f1ec79b13d31%40amazon.com -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On 2020-Oct-14, Masahiko Sawada wrote: > I've attached the patch as an idea of fixing the above comments as > well as the comment from Alvaro. I can be applied on top of v4 patch. One note about the translation stuff. Currently you have _("...") where the string is produced, and then ereport(.., errmsg("%s", str) where it is used. Both those things will attempt to translate the string, which isn't great. It is better if we only translate once. You have two options to fix this: one is to change _() to gettext_noop() (which marks the string for translation so that it appears in the message catalog, but it does not return the translation -- it returns the original, and then errmsg() translates at run time). The other is to change errmsg() to errmsg_internal() .. so the function returns the translated message and errmsg_internal() doesn't apply a translation. I prefer the first option, because if we ever include a server feature to log both the non-translated message alongside the translated one, we will already have both in hand.
On Wed, 14 Oct 2020 at 07:44, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2020-Oct-14, Masahiko Sawada wrote: > > > I've attached the patch as an idea of fixing the above comments as > > well as the comment from Alvaro. I can be applied on top of v4 patch. > > One note about the translation stuff. Currently you have _("...") where > the string is produced, and then ereport(.., errmsg("%s", str) where it > is used. Both those things will attempt to translate the string, which > isn't great. It is better if we only translate once. You have two > options to fix this: one is to change _() to gettext_noop() (which marks > the string for translation so that it appears in the message catalog, > but it does not return the translation -- it returns the original, and > then errmsg() translates at run time). The other is to change errmsg() > to errmsg_internal() .. so the function returns the translated message > and errmsg_internal() doesn't apply a translation. > > I prefer the first option, because if we ever include a server feature > to log both the non-translated message alongside the translated one, we > will already have both in hand. Thanks, I didn't know that. So perhaps ATWrongRelkindError() has the same translation problem? It uses _() when producing the message but also uses errmsg(). I've attached the patch changed accordingly. I also fixed some bugs around recovery conflicts on locks and changed the code so that the log shows pids instead of virtual transaction ids since pids are much easy to use for the users. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
At Wed, 14 Oct 2020 17:39:20 +0900, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote in > On Wed, 14 Oct 2020 at 07:44, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > > > On 2020-Oct-14, Masahiko Sawada wrote: > > > > > I've attached the patch as an idea of fixing the above comments as > > > well as the comment from Alvaro. I can be applied on top of v4 patch. > > > > One note about the translation stuff. Currently you have _("...") where > > the string is produced, and then ereport(.., errmsg("%s", str) where it > > is used. Both those things will attempt to translate the string, which > > isn't great. It is better if we only translate once. You have two > > options to fix this: one is to change _() to gettext_noop() (which marks > > the string for translation so that it appears in the message catalog, > > but it does not return the translation -- it returns the original, and > > then errmsg() translates at run time). The other is to change errmsg() > > to errmsg_internal() .. so the function returns the translated message > > and errmsg_internal() doesn't apply a translation. > > > > I prefer the first option, because if we ever include a server feature > > to log both the non-translated message alongside the translated one, we > > will already have both in hand. > > Thanks, I didn't know that. So perhaps ATWrongRelkindError() has the > same translation problem? It uses _() when producing the message but > also uses errmsg(). > > I've attached the patch changed accordingly. I also fixed some bugs > around recovery conflicts on locks and changed the code so that the > log shows pids instead of virtual transaction ids since pids are much > easy to use for the users. You're misunderstanding. ereport(..(errmsg("%s", _("hogehoge")))) results in fprintf((translated("%s")), translate("hogehoge")). So your change (errmsg("%s", gettext_noop("hogehoge")) results in fprintf((translated("%s")), DONT_translate("hogehoge")). which leads to a translation problem. (errmsg(gettext_noop("hogehoge")) works fine. You can see the instance in aclchk.c. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Thu, 15 Oct 2020 at 12:13, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > > At Wed, 14 Oct 2020 17:39:20 +0900, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote in > > On Wed, 14 Oct 2020 at 07:44, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > > > > > On 2020-Oct-14, Masahiko Sawada wrote: > > > > > > > I've attached the patch as an idea of fixing the above comments as > > > > well as the comment from Alvaro. I can be applied on top of v4 patch. > > > > > > One note about the translation stuff. Currently you have _("...") where > > > the string is produced, and then ereport(.., errmsg("%s", str) where it > > > is used. Both those things will attempt to translate the string, which > > > isn't great. It is better if we only translate once. You have two > > > options to fix this: one is to change _() to gettext_noop() (which marks > > > the string for translation so that it appears in the message catalog, > > > but it does not return the translation -- it returns the original, and > > > then errmsg() translates at run time). The other is to change errmsg() > > > to errmsg_internal() .. so the function returns the translated message > > > and errmsg_internal() doesn't apply a translation. > > > > > > I prefer the first option, because if we ever include a server feature > > > to log both the non-translated message alongside the translated one, we > > > will already have both in hand. > > > > Thanks, I didn't know that. So perhaps ATWrongRelkindError() has the > > same translation problem? It uses _() when producing the message but > > also uses errmsg(). > > > > I've attached the patch changed accordingly. I also fixed some bugs > > around recovery conflicts on locks and changed the code so that the > > log shows pids instead of virtual transaction ids since pids are much > > easy to use for the users. > > You're misunderstanding. Thank you! That's helpful for me. > ereport(..(errmsg("%s", _("hogehoge")))) results in > fprintf((translated("%s")), translate("hogehoge")). > > So your change (errmsg("%s", gettext_noop("hogehoge")) results in > > fprintf((translated("%s")), DONT_translate("hogehoge")). > > which leads to a translation problem. > > (errmsg(gettext_noop("hogehoge")) This seems equivalent to (errmsg("hogehoge")), right? I think I could understand translation stuff. Given we only report the const string returned from get_recovery_conflict_desc() without placeholders, the patch needs to use errmsg_internal() instead while not changing _() part. (errmsg(get_recovery_conflict_desc())) is not good (warned by -Wformat-security). Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
At Thu, 15 Oct 2020 14:28:57 +0900, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote in > > ereport(..(errmsg("%s", _("hogehoge")))) results in > > fprintf((translated("%s")), translate("hogehoge")). > > > > So your change (errmsg("%s", gettext_noop("hogehoge")) results in > > > > fprintf((translated("%s")), DONT_translate("hogehoge")). > > > > which leads to a translation problem. > > > > (errmsg(gettext_noop("hogehoge")) > > This seems equivalent to (errmsg("hogehoge")), right? Yes and no. However eventually the two works the same way, "(errmsg(gettext_noop("hogehoge"))" is a shorthand of 1: char *msg = gettext_noop("hogehoge"); ... 2: .. (errmsg(msg)); That is, the line 1 only registers a message id "hogehoge" and doesn't translate. The line 2 tries to translate the content of msg and it finds the translation for the message id "hogehoge". > I think I could understand translation stuff. Given we only report the > const string returned from get_recovery_conflict_desc() without > placeholders, the patch needs to use errmsg_internal() instead while > not changing _() part. (errmsg(get_recovery_conflict_desc())) is not > good (warned by -Wformat-security). Ah, right. we get a complain if no value parameters added. We can silence it by adding a dummy parameter to errmsg, but I'm not sure which is preferable. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Thu, 15 Oct 2020 at 14:52, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > > At Thu, 15 Oct 2020 14:28:57 +0900, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote in > > > ereport(..(errmsg("%s", _("hogehoge")))) results in > > > fprintf((translated("%s")), translate("hogehoge")). > > > > > > So your change (errmsg("%s", gettext_noop("hogehoge")) results in > > > > > > fprintf((translated("%s")), DONT_translate("hogehoge")). > > > > > > which leads to a translation problem. > > > > > > (errmsg(gettext_noop("hogehoge")) > > > > This seems equivalent to (errmsg("hogehoge")), right? > > Yes and no. However eventually the two works the same way, > "(errmsg(gettext_noop("hogehoge"))" is a shorthand of > > 1: char *msg = gettext_noop("hogehoge"); > ... > 2: .. (errmsg(msg)); > > That is, the line 1 only registers a message id "hogehoge" and doesn't > translate. The line 2 tries to translate the content of msg and it > finds the translation for the message id "hogehoge". Understood. > > > I think I could understand translation stuff. Given we only report the > > const string returned from get_recovery_conflict_desc() without > > placeholders, the patch needs to use errmsg_internal() instead while > > not changing _() part. (errmsg(get_recovery_conflict_desc())) is not > > good (warned by -Wformat-security). > > Ah, right. we get a complain if no value parameters added. We can > silence it by adding a dummy parameter to errmsg, but I'm not sure > which is preferable. Okay, I'm going to use errmsg_internal() for now until a better idea comes. I've attached the updated patch that fixed the translation part. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Hi, On 10/15/20 9:15 AM, Masahiko Sawada wrote: > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you canconfirm the sender and know the content is safe. > > > > On Thu, 15 Oct 2020 at 14:52, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: >> At Thu, 15 Oct 2020 14:28:57 +0900, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote in >>>> ereport(..(errmsg("%s", _("hogehoge")))) results in >>>> fprintf((translated("%s")), translate("hogehoge")). >>>> >>>> So your change (errmsg("%s", gettext_noop("hogehoge")) results in >>>> >>>> fprintf((translated("%s")), DONT_translate("hogehoge")). >>>> >>>> which leads to a translation problem. >>>> >>>> (errmsg(gettext_noop("hogehoge")) >>> This seems equivalent to (errmsg("hogehoge")), right? >> Yes and no. However eventually the two works the same way, >> "(errmsg(gettext_noop("hogehoge"))" is a shorthand of >> >> 1: char *msg = gettext_noop("hogehoge"); >> ... >> 2: .. (errmsg(msg)); >> >> That is, the line 1 only registers a message id "hogehoge" and doesn't >> translate. The line 2 tries to translate the content of msg and it >> finds the translation for the message id "hogehoge". > Understood. > >>> I think I could understand translation stuff. Given we only report the >>> const string returned from get_recovery_conflict_desc() without >>> placeholders, the patch needs to use errmsg_internal() instead while >>> not changing _() part. (errmsg(get_recovery_conflict_desc())) is not >>> good (warned by -Wformat-security). >> Ah, right. we get a complain if no value parameters added. We can >> silence it by adding a dummy parameter to errmsg, but I'm not sure >> which is preferable. > Okay, I'm going to use errmsg_internal() for now until a better idea comes. > > I've attached the updated patch that fixed the translation part. Thanks for reviewing and helping on this patch! The patch tester bot is currently failing due to: "proc.c:1290:5: error: ‘standbyWaitStart’ may be used uninitialized in this function [-Werror=maybe-uninitialized]" I've attached a new version with the minor change to fix it. Bertrand
Attachment
On Tue, 20 Oct 2020 at 22:02, Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > > Hi, > > On 10/15/20 9:15 AM, Masahiko Sawada wrote: > > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you canconfirm the sender and know the content is safe. > > > > > > > > On Thu, 15 Oct 2020 at 14:52, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > >> At Thu, 15 Oct 2020 14:28:57 +0900, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote in > >>>> ereport(..(errmsg("%s", _("hogehoge")))) results in > >>>> fprintf((translated("%s")), translate("hogehoge")). > >>>> > >>>> So your change (errmsg("%s", gettext_noop("hogehoge")) results in > >>>> > >>>> fprintf((translated("%s")), DONT_translate("hogehoge")). > >>>> > >>>> which leads to a translation problem. > >>>> > >>>> (errmsg(gettext_noop("hogehoge")) > >>> This seems equivalent to (errmsg("hogehoge")), right? > >> Yes and no. However eventually the two works the same way, > >> "(errmsg(gettext_noop("hogehoge"))" is a shorthand of > >> > >> 1: char *msg = gettext_noop("hogehoge"); > >> ... > >> 2: .. (errmsg(msg)); > >> > >> That is, the line 1 only registers a message id "hogehoge" and doesn't > >> translate. The line 2 tries to translate the content of msg and it > >> finds the translation for the message id "hogehoge". > > Understood. > > > >>> I think I could understand translation stuff. Given we only report the > >>> const string returned from get_recovery_conflict_desc() without > >>> placeholders, the patch needs to use errmsg_internal() instead while > >>> not changing _() part. (errmsg(get_recovery_conflict_desc())) is not > >>> good (warned by -Wformat-security). > >> Ah, right. we get a complain if no value parameters added. We can > >> silence it by adding a dummy parameter to errmsg, but I'm not sure > >> which is preferable. > > Okay, I'm going to use errmsg_internal() for now until a better idea comes. > > > > I've attached the updated patch that fixed the translation part. > > Thanks for reviewing and helping on this patch! > > The patch tester bot is currently failing due to: > > "proc.c:1290:5: error: ‘standbyWaitStart’ may be used uninitialized in > this function [-Werror=maybe-uninitialized]" > > I've attached a new version with the minor change to fix it. > Thank you for updating the patch! I've looked at the patch and revised a bit the formatting etc. After some thoughts, I think it might be better to report the waiting time as well. it would help users and there is no particular reason for logging the report only once. It also helps make the patch clean by reducing the variables such as recovery_conflict_logged. I’ve implemented it in the v8 patch. The log message is now like: LOG: recovery is still waiting recovery conflict on lock after 1062.601 ms DETAIL: Conflicting processes: 35116, 35115, 35114. CONTEXT: WAL redo at 0/3000028 for Standby/LOCK: xid 510 db 13185 rel 16384 LOG: recovery is still waiting recovery conflict on lock after 2065.682 ms DETAIL: Conflicting processes: 35115, 35114. CONTEXT: WAL redo at 0/3000028 for Standby/LOCK: xid 510 db 13185 rel 16384 LOG: recovery is still waiting recovery conflict on lock after 3087.926 ms DETAIL: Conflicting process: 35114. CONTEXT: WAL redo at 0/3000028 for Standby/LOCK: xid 510 db 13185 rel 16384 What do you think? Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On 2020/10/27 9:41, Masahiko Sawada wrote: > On Tue, 20 Oct 2020 at 22:02, Drouvot, Bertrand <bdrouvot@amazon.com> wrote: >> >> Hi, >> >> On 10/15/20 9:15 AM, Masahiko Sawada wrote: >>> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you canconfirm the sender and know the content is safe. >>> >>> >>> >>> On Thu, 15 Oct 2020 at 14:52, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: >>>> At Thu, 15 Oct 2020 14:28:57 +0900, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote in >>>>>> ereport(..(errmsg("%s", _("hogehoge")))) results in >>>>>> fprintf((translated("%s")), translate("hogehoge")). >>>>>> >>>>>> So your change (errmsg("%s", gettext_noop("hogehoge")) results in >>>>>> >>>>>> fprintf((translated("%s")), DONT_translate("hogehoge")). >>>>>> >>>>>> which leads to a translation problem. >>>>>> >>>>>> (errmsg(gettext_noop("hogehoge")) >>>>> This seems equivalent to (errmsg("hogehoge")), right? >>>> Yes and no. However eventually the two works the same way, >>>> "(errmsg(gettext_noop("hogehoge"))" is a shorthand of >>>> >>>> 1: char *msg = gettext_noop("hogehoge"); >>>> ... >>>> 2: .. (errmsg(msg)); >>>> >>>> That is, the line 1 only registers a message id "hogehoge" and doesn't >>>> translate. The line 2 tries to translate the content of msg and it >>>> finds the translation for the message id "hogehoge". >>> Understood. >>> >>>>> I think I could understand translation stuff. Given we only report the >>>>> const string returned from get_recovery_conflict_desc() without >>>>> placeholders, the patch needs to use errmsg_internal() instead while >>>>> not changing _() part. (errmsg(get_recovery_conflict_desc())) is not >>>>> good (warned by -Wformat-security). >>>> Ah, right. we get a complain if no value parameters added. We can >>>> silence it by adding a dummy parameter to errmsg, but I'm not sure >>>> which is preferable. >>> Okay, I'm going to use errmsg_internal() for now until a better idea comes. >>> >>> I've attached the updated patch that fixed the translation part. >> >> Thanks for reviewing and helping on this patch! >> >> The patch tester bot is currently failing due to: >> >> "proc.c:1290:5: error: ‘standbyWaitStart’ may be used uninitialized in >> this function [-Werror=maybe-uninitialized]" >> >> I've attached a new version with the minor change to fix it. >> > > Thank you for updating the patch! > > I've looked at the patch and revised a bit the formatting etc. > > After some thoughts, I think it might be better to report the waiting > time as well. it would help users and there is no particular reason > for logging the report only once. It also helps make the patch clean > by reducing the variables such as recovery_conflict_logged. I’ve > implemented it in the v8 patch. I read v8 patch. Here are review comments. When recovery conflict with buffer pin happens, log message is output every deadlock_timeout. Is this intentional behavior? If yes, IMO that's not good because lots of messages can be output. + if (log_recovery_conflict_waits) + waitStart = GetCurrentTimestamp(); What happens if log_recovery_conflict_waits is off at the beginning and then it's changed during waiting for the conflict? In this case, waitStart is zero, but log_recovery_conflict_waits is on. This may cause some problems? + if (report_waiting) + ts = GetCurrentTimestamp(); GetCurrentTimestamp() doesn't need to be called every cycle in the loop after "logged" is true and "new_status" is not NULL. +extern const char *get_procsignal_reason_desc(ProcSignalReason reason); Is this garbage? When log_lock_waits is enabled, both "still waiting for ..." and "acquired ..." messages are output. Like this, when log_recovery_conflict_waits is enabled, not only "still waiting ..." but also "resolved ..." message should be output? The latter message might not need to be output if the conflict is canceled due to max_standby_xxx_delay parameter. The latter message would be useful to know how long the recovery was waiting for the conflict. Thought? It's ok to implement this as a separate patch later, though. + Controls whether a log message is produced when the startup process + is waiting longer than longer than <varname>deadlock_timeout</varname>. Typo: "longer than longer than" should be "longer than". Probably the section about hot standby in high-availability.sgml would need to be updated. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
, On Thu, 29 Oct 2020 at 00:16, Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > > > > On 2020/10/27 9:41, Masahiko Sawada wrote: > > On Tue, 20 Oct 2020 at 22:02, Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > >> > >> Hi, > >> > >> On 10/15/20 9:15 AM, Masahiko Sawada wrote: > >>> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless youcan confirm the sender and know the content is safe. > >>> > >>> > >>> > >>> On Thu, 15 Oct 2020 at 14:52, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > >>>> At Thu, 15 Oct 2020 14:28:57 +0900, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote in > >>>>>> ereport(..(errmsg("%s", _("hogehoge")))) results in > >>>>>> fprintf((translated("%s")), translate("hogehoge")). > >>>>>> > >>>>>> So your change (errmsg("%s", gettext_noop("hogehoge")) results in > >>>>>> > >>>>>> fprintf((translated("%s")), DONT_translate("hogehoge")). > >>>>>> > >>>>>> which leads to a translation problem. > >>>>>> > >>>>>> (errmsg(gettext_noop("hogehoge")) > >>>>> This seems equivalent to (errmsg("hogehoge")), right? > >>>> Yes and no. However eventually the two works the same way, > >>>> "(errmsg(gettext_noop("hogehoge"))" is a shorthand of > >>>> > >>>> 1: char *msg = gettext_noop("hogehoge"); > >>>> ... > >>>> 2: .. (errmsg(msg)); > >>>> > >>>> That is, the line 1 only registers a message id "hogehoge" and doesn't > >>>> translate. The line 2 tries to translate the content of msg and it > >>>> finds the translation for the message id "hogehoge". > >>> Understood. > >>> > >>>>> I think I could understand translation stuff. Given we only report the > >>>>> const string returned from get_recovery_conflict_desc() without > >>>>> placeholders, the patch needs to use errmsg_internal() instead while > >>>>> not changing _() part. (errmsg(get_recovery_conflict_desc())) is not > >>>>> good (warned by -Wformat-security). > >>>> Ah, right. we get a complain if no value parameters added. We can > >>>> silence it by adding a dummy parameter to errmsg, but I'm not sure > >>>> which is preferable. > >>> Okay, I'm going to use errmsg_internal() for now until a better idea comes. > >>> > >>> I've attached the updated patch that fixed the translation part. > >> > >> Thanks for reviewing and helping on this patch! > >> > >> The patch tester bot is currently failing due to: > >> > >> "proc.c:1290:5: error: ‘standbyWaitStart’ may be used uninitialized in > >> this function [-Werror=maybe-uninitialized]" > >> > >> I've attached a new version with the minor change to fix it. > >> > > > > Thank you for updating the patch! > > > > I've looked at the patch and revised a bit the formatting etc. > > > > After some thoughts, I think it might be better to report the waiting > > time as well. it would help users and there is no particular reason > > for logging the report only once. It also helps make the patch clean > > by reducing the variables such as recovery_conflict_logged. I’ve > > implemented it in the v8 patch. > > I read v8 patch. Here are review comments. Thank you for your review. > > When recovery conflict with buffer pin happens, log message is output > every deadlock_timeout. Is this intentional behavior? If yes, IMO that's > not good because lots of messages can be output. Agreed. if we were to log the recovery conflict only once in bufferpin conflict case, we would log it multiple times only in lock conflict case. So I guess it's better to do that in all conflict cases. > > + if (log_recovery_conflict_waits) > + waitStart = GetCurrentTimestamp(); > > What happens if log_recovery_conflict_waits is off at the beginning and > then it's changed during waiting for the conflict? In this case, waitStart is > zero, but log_recovery_conflict_waits is on. This may cause some problems? Hmm, I didn't see a path that happens to reload the config file during waiting for buffer cleanup lock. Even if the startup process receives SIGHUP during that, it calls HandleStartupProcInterrupts() at the next convenient time. It could be the beginning of main apply loop or inside WaitForWALToBecomeAvailable() and so on but I didn’t see it in the wait loop for taking a buffer cleanup. However, I think it’s better to use (waitStart > 0) for safety when checking if we log the recovery conflict instead of log_recovery_conflict_waits. > > + if (report_waiting) > + ts = GetCurrentTimestamp(); > > GetCurrentTimestamp() doesn't need to be called every cycle > in the loop after "logged" is true and "new_status" is not NULL. Agreed > > +extern const char *get_procsignal_reason_desc(ProcSignalReason reason); > > Is this garbage? Yes. > > When log_lock_waits is enabled, both "still waiting for ..." and "acquired ..." > messages are output. Like this, when log_recovery_conflict_waits is enabled, > not only "still waiting ..." but also "resolved ..." message should be output? > The latter message might not need to be output if the conflict is canceled > due to max_standby_xxx_delay parameter. The latter message would be > useful to know how long the recovery was waiting for the conflict. Thought? > It's ok to implement this as a separate patch later, though. There was a discussion that the latter message without waiting time is not necessarily needed because the canceled process will log "canceling statement due to conflict with XXX" as you mentioned. I agreed with that. But I agree that the latter message with waiting time would help users, for instance when the startup process is waiting for multiple processes and it takes a time to cancel all of them. > > + Controls whether a log message is produced when the startup process > + is waiting longer than longer than <varname>deadlock_timeout</varname>. > > Typo: "longer than longer than" should be "longer than". > > Probably the section about hot standby in high-availability.sgml would need > to be updated. Agreed. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2020/10/30 10:29, Masahiko Sawada wrote: > , > > On Thu, 29 Oct 2020 at 00:16, Fujii Masao <masao.fujii@oss.nttdata.com> wrote: >> >> >> >> On 2020/10/27 9:41, Masahiko Sawada wrote: >>> On Tue, 20 Oct 2020 at 22:02, Drouvot, Bertrand <bdrouvot@amazon.com> wrote: >>>> >>>> Hi, >>>> >>>> On 10/15/20 9:15 AM, Masahiko Sawada wrote: >>>>> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless youcan confirm the sender and know the content is safe. >>>>> >>>>> >>>>> >>>>> On Thu, 15 Oct 2020 at 14:52, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: >>>>>> At Thu, 15 Oct 2020 14:28:57 +0900, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote in >>>>>>>> ereport(..(errmsg("%s", _("hogehoge")))) results in >>>>>>>> fprintf((translated("%s")), translate("hogehoge")). >>>>>>>> >>>>>>>> So your change (errmsg("%s", gettext_noop("hogehoge")) results in >>>>>>>> >>>>>>>> fprintf((translated("%s")), DONT_translate("hogehoge")). >>>>>>>> >>>>>>>> which leads to a translation problem. >>>>>>>> >>>>>>>> (errmsg(gettext_noop("hogehoge")) >>>>>>> This seems equivalent to (errmsg("hogehoge")), right? >>>>>> Yes and no. However eventually the two works the same way, >>>>>> "(errmsg(gettext_noop("hogehoge"))" is a shorthand of >>>>>> >>>>>> 1: char *msg = gettext_noop("hogehoge"); >>>>>> ... >>>>>> 2: .. (errmsg(msg)); >>>>>> >>>>>> That is, the line 1 only registers a message id "hogehoge" and doesn't >>>>>> translate. The line 2 tries to translate the content of msg and it >>>>>> finds the translation for the message id "hogehoge". >>>>> Understood. >>>>> >>>>>>> I think I could understand translation stuff. Given we only report the >>>>>>> const string returned from get_recovery_conflict_desc() without >>>>>>> placeholders, the patch needs to use errmsg_internal() instead while >>>>>>> not changing _() part. (errmsg(get_recovery_conflict_desc())) is not >>>>>>> good (warned by -Wformat-security). >>>>>> Ah, right. we get a complain if no value parameters added. We can >>>>>> silence it by adding a dummy parameter to errmsg, but I'm not sure >>>>>> which is preferable. >>>>> Okay, I'm going to use errmsg_internal() for now until a better idea comes. >>>>> >>>>> I've attached the updated patch that fixed the translation part. >>>> >>>> Thanks for reviewing and helping on this patch! >>>> >>>> The patch tester bot is currently failing due to: >>>> >>>> "proc.c:1290:5: error: ‘standbyWaitStart’ may be used uninitialized in >>>> this function [-Werror=maybe-uninitialized]" >>>> >>>> I've attached a new version with the minor change to fix it. >>>> >>> >>> Thank you for updating the patch! >>> >>> I've looked at the patch and revised a bit the formatting etc. >>> >>> After some thoughts, I think it might be better to report the waiting >>> time as well. it would help users and there is no particular reason >>> for logging the report only once. It also helps make the patch clean >>> by reducing the variables such as recovery_conflict_logged. I’ve >>> implemented it in the v8 patch. >> >> I read v8 patch. Here are review comments. > > Thank you for your review. > >> >> When recovery conflict with buffer pin happens, log message is output >> every deadlock_timeout. Is this intentional behavior? If yes, IMO that's >> not good because lots of messages can be output. > > Agreed. > > if we were to log the recovery conflict only once in bufferpin > conflict case, we would log it multiple times only in lock conflict > case. So I guess it's better to do that in all conflict cases. Yes, I agree that this behavior basically should be consistent between all cases. > >> >> + if (log_recovery_conflict_waits) >> + waitStart = GetCurrentTimestamp(); >> >> What happens if log_recovery_conflict_waits is off at the beginning and >> then it's changed during waiting for the conflict? In this case, waitStart is >> zero, but log_recovery_conflict_waits is on. This may cause some problems? > > Hmm, I didn't see a path that happens to reload the config file during > waiting for buffer cleanup lock. Even if the startup process receives > SIGHUP during that, it calls HandleStartupProcInterrupts() at the next > convenient time. It could be the beginning of main apply loop or > inside WaitForWALToBecomeAvailable() and so on but I didn’t see it in > the wait loop for taking a buffer cleanup. Yes, you're right. I seem to have read the code wrongly. > However, I think it’s > better to use (waitStart > 0) for safety when checking if we log the > recovery conflict instead of log_recovery_conflict_waits. > >> >> + if (report_waiting) >> + ts = GetCurrentTimestamp(); >> >> GetCurrentTimestamp() doesn't need to be called every cycle >> in the loop after "logged" is true and "new_status" is not NULL. > > Agreed > >> >> +extern const char *get_procsignal_reason_desc(ProcSignalReason reason); >> >> Is this garbage? > > Yes. > >> >> When log_lock_waits is enabled, both "still waiting for ..." and "acquired ..." >> messages are output. Like this, when log_recovery_conflict_waits is enabled, >> not only "still waiting ..." but also "resolved ..." message should be output? >> The latter message might not need to be output if the conflict is canceled >> due to max_standby_xxx_delay parameter. The latter message would be >> useful to know how long the recovery was waiting for the conflict. Thought? >> It's ok to implement this as a separate patch later, though. > > There was a discussion that the latter message without waiting time is > not necessarily needed because the canceled process will log > "canceling statement due to conflict with XXX" as you mentioned. I > agreed with that. But I agree that the latter message with waiting > time would help users, for instance when the startup process is > waiting for multiple processes and it takes a time to cancel all of > them. I agree that it's useful to output the wait time. But as I told in previous email, it's ok to focus on the current patch for now and then implement this as a separate patch later. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Hi, On 10/30/20 4:25 AM, Fujii Masao wrote: > CAUTION: This email originated from outside of the organization. Do > not click links or open attachments unless you can confirm the sender > and know the content is safe. > > > > On 2020/10/30 10:29, Masahiko Sawada wrote: >> , >> >> On Thu, 29 Oct 2020 at 00:16, Fujii Masao >> <masao.fujii@oss.nttdata.com> wrote: >>> >>> >>> >>> On 2020/10/27 9:41, Masahiko Sawada wrote: >>>> On Tue, 20 Oct 2020 at 22:02, Drouvot, Bertrand >>>> <bdrouvot@amazon.com> wrote: >>>>> >>>>> Hi, >>>>> >>>>> On 10/15/20 9:15 AM, Masahiko Sawada wrote: >>>>>> CAUTION: This email originated from outside of the organization. >>>>>> Do not click links or open attachments unless you can confirm the >>>>>> sender and know the content is safe. >>>>>> >>>>>> >>>>>> >>>>>> On Thu, 15 Oct 2020 at 14:52, Kyotaro Horiguchi >>>>>> <horikyota.ntt@gmail.com> wrote: >>>>>>> At Thu, 15 Oct 2020 14:28:57 +0900, Masahiko Sawada >>>>>>> <masahiko.sawada@2ndquadrant.com> wrote in >>>>>>>>> ereport(..(errmsg("%s", _("hogehoge")))) results in >>>>>>>>> fprintf((translated("%s")), translate("hogehoge")). >>>>>>>>> >>>>>>>>> So your change (errmsg("%s", gettext_noop("hogehoge")) results in >>>>>>>>> >>>>>>>>> fprintf((translated("%s")), DONT_translate("hogehoge")). >>>>>>>>> >>>>>>>>> which leads to a translation problem. >>>>>>>>> >>>>>>>>> (errmsg(gettext_noop("hogehoge")) >>>>>>>> This seems equivalent to (errmsg("hogehoge")), right? >>>>>>> Yes and no. However eventually the two works the same way, >>>>>>> "(errmsg(gettext_noop("hogehoge"))" is a shorthand of >>>>>>> >>>>>>> 1: char *msg = gettext_noop("hogehoge"); >>>>>>> ... >>>>>>> 2: .. (errmsg(msg)); >>>>>>> >>>>>>> That is, the line 1 only registers a message id "hogehoge" and >>>>>>> doesn't >>>>>>> translate. The line 2 tries to translate the content of msg and it >>>>>>> finds the translation for the message id "hogehoge". >>>>>> Understood. >>>>>> >>>>>>>> I think I could understand translation stuff. Given we only >>>>>>>> report the >>>>>>>> const string returned from get_recovery_conflict_desc() without >>>>>>>> placeholders, the patch needs to use errmsg_internal() instead >>>>>>>> while >>>>>>>> not changing _() part. (errmsg(get_recovery_conflict_desc())) >>>>>>>> is not >>>>>>>> good (warned by -Wformat-security). >>>>>>> Ah, right. we get a complain if no value parameters added. We can >>>>>>> silence it by adding a dummy parameter to errmsg, but I'm not sure >>>>>>> which is preferable. >>>>>> Okay, I'm going to use errmsg_internal() for now until a better >>>>>> idea comes. >>>>>> >>>>>> I've attached the updated patch that fixed the translation part. >>>>> >>>>> Thanks for reviewing and helping on this patch! >>>>> >>>>> The patch tester bot is currently failing due to: >>>>> >>>>> "proc.c:1290:5: error: ‘standbyWaitStart’ may be used >>>>> uninitialized in >>>>> this function [-Werror=maybe-uninitialized]" >>>>> >>>>> I've attached a new version with the minor change to fix it. >>>>> >>>> >>>> Thank you for updating the patch! >>>> >>>> I've looked at the patch and revised a bit the formatting etc. >>>> >>>> After some thoughts, I think it might be better to report the waiting >>>> time as well. it would help users and there is no particular reason >>>> for logging the report only once. It also helps make the patch clean >>>> by reducing the variables such as recovery_conflict_logged. I’ve >>>> implemented it in the v8 patch. >>> >>> I read v8 patch. Here are review comments. >> >> Thank you for your review. >> >>> >>> When recovery conflict with buffer pin happens, log message is output >>> every deadlock_timeout. Is this intentional behavior? If yes, IMO >>> that's >>> not good because lots of messages can be output. >> >> Agreed. >> >> if we were to log the recovery conflict only once in bufferpin >> conflict case, we would log it multiple times only in lock conflict >> case. So I guess it's better to do that in all conflict cases. > > Yes, I agree that this behavior basically should be consistent between > all cases. > >> >>> >>> + if (log_recovery_conflict_waits) >>> + waitStart = GetCurrentTimestamp(); >>> >>> What happens if log_recovery_conflict_waits is off at the beginning and >>> then it's changed during waiting for the conflict? In this case, >>> waitStart is >>> zero, but log_recovery_conflict_waits is on. This may cause some >>> problems? >> >> Hmm, I didn't see a path that happens to reload the config file during >> waiting for buffer cleanup lock. Even if the startup process receives >> SIGHUP during that, it calls HandleStartupProcInterrupts() at the next >> convenient time. It could be the beginning of main apply loop or >> inside WaitForWALToBecomeAvailable() and so on but I didn’t see it in >> the wait loop for taking a buffer cleanup. > > Yes, you're right. I seem to have read the code wrongly. > >> However, I think it’s >> better to use (waitStart > 0) for safety when checking if we log the >> recovery conflict instead of log_recovery_conflict_waits. >> >>> >>> + if (report_waiting) >>> + ts = GetCurrentTimestamp(); >>> >>> GetCurrentTimestamp() doesn't need to be called every cycle >>> in the loop after "logged" is true and "new_status" is not NULL. >> >> Agreed >> >>> >>> +extern const char *get_procsignal_reason_desc(ProcSignalReason >>> reason); >>> >>> Is this garbage? >> >> Yes. >> >>> >>> When log_lock_waits is enabled, both "still waiting for ..." and >>> "acquired ..." >>> messages are output. Like this, when log_recovery_conflict_waits is >>> enabled, >>> not only "still waiting ..." but also "resolved ..." message should >>> be output? >>> The latter message might not need to be output if the conflict is >>> canceled >>> due to max_standby_xxx_delay parameter. The latter message would be >>> useful to know how long the recovery was waiting for the conflict. >>> Thought? >>> It's ok to implement this as a separate patch later, though. >> >> There was a discussion that the latter message without waiting time is >> not necessarily needed because the canceled process will log >> "canceling statement due to conflict with XXX" as you mentioned. I >> agreed with that. But I agree that the latter message with waiting >> time would help users, for instance when the startup process is >> waiting for multiple processes and it takes a time to cancel all of >> them. > > I agree that it's useful to output the wait time. > > But as I told in previous email, it's ok to focus on the current patch > for now and then implement this as a separate patch later. > Thanks for your work and thoughts on this patch! I've attached a new version that take your remarks (hope i did not miss some of them) into account (but still leave the last one for a separate patch later). Bertrand
Attachment
On Fri, Oct 30, 2020 at 6:02 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > > Hi, > > On 10/30/20 4:25 AM, Fujii Masao wrote: > > CAUTION: This email originated from outside of the organization. Do > > not click links or open attachments unless you can confirm the sender > > and know the content is safe. > > > > > > > > On 2020/10/30 10:29, Masahiko Sawada wrote: > >> , > >> > >> On Thu, 29 Oct 2020 at 00:16, Fujii Masao > >> <masao.fujii@oss.nttdata.com> wrote: > >>> > >>> I read v8 patch. Here are review comments. > >> > >> Thank you for your review. > >> Thank you for updating the patch! Looking at the latest version patch (v8-0002-Log-the-standby-recovery-conflict-waits.patch), I think it doesn't address some comments from Fujii-san. > >>> > >>> When recovery conflict with buffer pin happens, log message is output > >>> every deadlock_timeout. Is this intentional behavior? If yes, IMO > >>> that's > >>> not good because lots of messages can be output. > >> > >> Agreed. I think the latest patch doesn't fix the above comment. Log message for recovery conflict on buffer pin is logged every deadlock_timeout. > >> > >> if we were to log the recovery conflict only once in bufferpin > >> conflict case, we would log it multiple times only in lock conflict > >> case. So I guess it's better to do that in all conflict cases. > > > > Yes, I agree that this behavior basically should be consistent between > > all cases. The latest patch seems not to address this comment as well. > > > >> > >>> > >>> + if (log_recovery_conflict_waits) > >>> + waitStart = GetCurrentTimestamp(); > >>> > >>> What happens if log_recovery_conflict_waits is off at the beginning and > >>> then it's changed during waiting for the conflict? In this case, > >>> waitStart is > >>> zero, but log_recovery_conflict_waits is on. This may cause some > >>> problems? > >> > >> Hmm, I didn't see a path that happens to reload the config file during > >> waiting for buffer cleanup lock. Even if the startup process receives > >> SIGHUP during that, it calls HandleStartupProcInterrupts() at the next > >> convenient time. It could be the beginning of main apply loop or > >> inside WaitForWALToBecomeAvailable() and so on but I didn’t see it in > >> the wait loop for taking a buffer cleanup. > > > > Yes, you're right. I seem to have read the code wrongly. > > > >> However, I think it’s > >> better to use (waitStart > 0) for safety when checking if we log the > >> recovery conflict instead of log_recovery_conflict_waits. > >> > >>> > >>> + if (report_waiting) > >>> + ts = GetCurrentTimestamp(); > >>> > >>> GetCurrentTimestamp() doesn't need to be called every cycle > >>> in the loop after "logged" is true and "new_status" is not NULL. > >> > >> Agreed > >> > >>> > >>> +extern const char *get_procsignal_reason_desc(ProcSignalReason > >>> reason); > >>> > >>> Is this garbage? > >> > >> Yes. > >> > >>> > >>> When log_lock_waits is enabled, both "still waiting for ..." and > >>> "acquired ..." > >>> messages are output. Like this, when log_recovery_conflict_waits is > >>> enabled, > >>> not only "still waiting ..." but also "resolved ..." message should > >>> be output? > >>> The latter message might not need to be output if the conflict is > >>> canceled > >>> due to max_standby_xxx_delay parameter. The latter message would be > >>> useful to know how long the recovery was waiting for the conflict. > >>> Thought? > >>> It's ok to implement this as a separate patch later, though. > >> > >> There was a discussion that the latter message without waiting time is > >> not necessarily needed because the canceled process will log > >> "canceling statement due to conflict with XXX" as you mentioned. I > >> agreed with that. But I agree that the latter message with waiting > >> time would help users, for instance when the startup process is > >> waiting for multiple processes and it takes a time to cancel all of > >> them. > > > > I agree that it's useful to output the wait time. > > > > But as I told in previous email, it's ok to focus on the current patch > > for now and then implement this as a separate patch later. Agreed. Regards, -- Masahiko Sawada EnterpriseDB: https://www.enterprisedb.com/
Hi, On 11/6/20 3:21 AM, Masahiko Sawada wrote: > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you canconfirm the sender and know the content is safe. > > > > On Fri, Oct 30, 2020 at 6:02 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote: >> Hi, >> >> On 10/30/20 4:25 AM, Fujii Masao wrote: >>> CAUTION: This email originated from outside of the organization. Do >>> not click links or open attachments unless you can confirm the sender >>> and know the content is safe. >>> >>> >>> >>> On 2020/10/30 10:29, Masahiko Sawada wrote: >>>> , >>>> >>>> On Thu, 29 Oct 2020 at 00:16, Fujii Masao >>>> <masao.fujii@oss.nttdata.com> wrote: >>>>> I read v8 patch. Here are review comments. >>>> Thank you for your review. >>>> > Thank you for updating the patch! > > Looking at the latest version patch > (v8-0002-Log-the-standby-recovery-conflict-waits.patch), I think it > doesn't address some comments from Fujii-san. > >>>>> When recovery conflict with buffer pin happens, log message is output >>>>> every deadlock_timeout. Is this intentional behavior? If yes, IMO >>>>> that's >>>>> not good because lots of messages can be output. >>>> Agreed. > I think the latest patch doesn't fix the above comment. Log message > for recovery conflict on buffer pin is logged every deadlock_timeout. > >>>> if we were to log the recovery conflict only once in bufferpin >>>> conflict case, we would log it multiple times only in lock conflict >>>> case. So I guess it's better to do that in all conflict cases. >>> Yes, I agree that this behavior basically should be consistent between >>> all cases. > The latest patch seems not to address this comment as well. Oh, I missed those ones, thanks for the feedback. New version attached, so that recovery conflict will be logged only once also for buffer pin and lock cases. Bertrand
Attachment
On Mon, Nov 9, 2020 at 2:49 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > > Hi, > > On 11/6/20 3:21 AM, Masahiko Sawada wrote: > > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you canconfirm the sender and know the content is safe. > > > > > > > > On Fri, Oct 30, 2020 at 6:02 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > >> Hi, > >> > >> On 10/30/20 4:25 AM, Fujii Masao wrote: > >>> CAUTION: This email originated from outside of the organization. Do > >>> not click links or open attachments unless you can confirm the sender > >>> and know the content is safe. > >>> > >>> > >>> > >>> On 2020/10/30 10:29, Masahiko Sawada wrote: > >>>> , > >>>> > >>>> On Thu, 29 Oct 2020 at 00:16, Fujii Masao > >>>> <masao.fujii@oss.nttdata.com> wrote: > >>>>> I read v8 patch. Here are review comments. > >>>> Thank you for your review. > >>>> > > Thank you for updating the patch! > > > > Looking at the latest version patch > > (v8-0002-Log-the-standby-recovery-conflict-waits.patch), I think it > > doesn't address some comments from Fujii-san. > > > >>>>> When recovery conflict with buffer pin happens, log message is output > >>>>> every deadlock_timeout. Is this intentional behavior? If yes, IMO > >>>>> that's > >>>>> not good because lots of messages can be output. > >>>> Agreed. > > I think the latest patch doesn't fix the above comment. Log message > > for recovery conflict on buffer pin is logged every deadlock_timeout. > > > >>>> if we were to log the recovery conflict only once in bufferpin > >>>> conflict case, we would log it multiple times only in lock conflict > >>>> case. So I guess it's better to do that in all conflict cases. > >>> Yes, I agree that this behavior basically should be consistent between > >>> all cases. > > The latest patch seems not to address this comment as well. > > Oh, I missed those ones, thanks for the feedback. > > New version attached, so that recovery conflict will be logged only once > also for buffer pin and lock cases. Thank you for updating the patch. Here are review comments. + if (report_waiting && (!logged_recovery_conflict || new_status == NULL)) + ts = GetCurrentTimestamp(); The condition will always be true if log_recovery_conflict_wait is false and report_waiting is true, leading to unnecessary calling of GetCurrentTimestamp(). --- + <para> + You can control whether a log message is produced when the startup process + is waiting longer than <varname>deadlock_timeout</varname> for recovery + conflicts. This is controled by the <xref linkend="guc-log-recovery-conflict-waits"/> + parameter. + </para> s/controled/controlled/ --- if (report_waiting) waitStart = GetCurrentTimestamp(); Similarly, we have the above code but we don't need to call GetCurrentTimestamp() if update_process_title is false, even if report_waiting is true. I've attached the patch that fixes the above comments. It can be applied on top of your v8 patch. Regards, -- Masahiko Sawada EnterpriseDB: https://www.enterprisedb.com/
Attachment
Hi, On 11/16/20 6:44 AM, Masahiko Sawada wrote: > Thank you for updating the patch. > > Here are review comments. > > + if (report_waiting && (!logged_recovery_conflict || > new_status == NULL)) > + ts = GetCurrentTimestamp(); > > The condition will always be true if log_recovery_conflict_wait is > false and report_waiting is true, leading to unnecessary calling of > GetCurrentTimestamp(). > > --- > + <para> > + You can control whether a log message is produced when the startup process > + is waiting longer than <varname>deadlock_timeout</varname> for recovery > + conflicts. This is controled by the <xref > linkend="guc-log-recovery-conflict-waits"/> > + parameter. > + </para> > > s/controled/controlled/ > > --- > if (report_waiting) > waitStart = GetCurrentTimestamp(); > > Similarly, we have the above code but we don't need to call > GetCurrentTimestamp() if update_process_title is false, even if > report_waiting is true. > > I've attached the patch that fixes the above comments. It can be > applied on top of your v8 patch. Thanks for the review and the associated fixes! I've attached a new version that contains your fixes. Thanks Bertrand
Attachment
On Mon, Nov 16, 2020 at 4:55 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > > Hi, > > On 11/16/20 6:44 AM, Masahiko Sawada wrote: > > Thank you for updating the patch. > > > > Here are review comments. > > > > + if (report_waiting && (!logged_recovery_conflict || > > new_status == NULL)) > > + ts = GetCurrentTimestamp(); > > > > The condition will always be true if log_recovery_conflict_wait is > > false and report_waiting is true, leading to unnecessary calling of > > GetCurrentTimestamp(). > > > > --- > > + <para> > > + You can control whether a log message is produced when the startup process > > + is waiting longer than <varname>deadlock_timeout</varname> for recovery > > + conflicts. This is controled by the <xref > > linkend="guc-log-recovery-conflict-waits"/> > > + parameter. > > + </para> > > > > s/controled/controlled/ > > > > --- > > if (report_waiting) > > waitStart = GetCurrentTimestamp(); > > > > Similarly, we have the above code but we don't need to call > > GetCurrentTimestamp() if update_process_title is false, even if > > report_waiting is true. > > > > I've attached the patch that fixes the above comments. It can be > > applied on top of your v8 patch. > > Thanks for the review and the associated fixes! > > I've attached a new version that contains your fixes. > Thank you for updating the patch. I have other comments: + <para> + You can control whether a log message is produced when the startup process + is waiting longer than <varname>deadlock_timeout</varname> for recovery + conflicts. This is controlled by the + <xref linkend="guc-log-recovery-conflict-waits"/> parameter. + </para> It would be better to use 'WAL replay' instead of 'the startup process' for consistency with circumjacent descriptions. What do you think? --- @@ -1260,6 +1262,8 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable) else enable_timeout_after(DEADLOCK_TIMEOUT, DeadlockTimeout); } + else + standbyWaitStart = GetCurrentTimestamp(); I think we can add a check of log_recovery_conflict_waits to avoid unnecessary calling of GetCurrentTimestamp(). I've attached the updated version patch including the above comments as well as adding some assertions. Please review it. Regards, -- Masahiko Sawada EnterpriseDB: https://www.enterprisedb.com/
Attachment
On 11/17/20 2:09 AM, Masahiko Sawada wrote: > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you canconfirm the sender and know the content is safe. > > > > On Mon, Nov 16, 2020 at 4:55 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote: >> Hi, >> >> On 11/16/20 6:44 AM, Masahiko Sawada wrote: >>> Thank you for updating the patch. >>> >>> Here are review comments. >>> >>> + if (report_waiting && (!logged_recovery_conflict || >>> new_status == NULL)) >>> + ts = GetCurrentTimestamp(); >>> >>> The condition will always be true if log_recovery_conflict_wait is >>> false and report_waiting is true, leading to unnecessary calling of >>> GetCurrentTimestamp(). >>> >>> --- >>> + <para> >>> + You can control whether a log message is produced when the startup process >>> + is waiting longer than <varname>deadlock_timeout</varname> for recovery >>> + conflicts. This is controled by the <xref >>> linkend="guc-log-recovery-conflict-waits"/> >>> + parameter. >>> + </para> >>> >>> s/controled/controlled/ >>> >>> --- >>> if (report_waiting) >>> waitStart = GetCurrentTimestamp(); >>> >>> Similarly, we have the above code but we don't need to call >>> GetCurrentTimestamp() if update_process_title is false, even if >>> report_waiting is true. >>> >>> I've attached the patch that fixes the above comments. It can be >>> applied on top of your v8 patch. >> Thanks for the review and the associated fixes! >> >> I've attached a new version that contains your fixes. >> > Thank you for updating the patch. > > I have other comments: > > + <para> > + You can control whether a log message is produced when the startup process > + is waiting longer than <varname>deadlock_timeout</varname> for recovery > + conflicts. This is controlled by the > + <xref linkend="guc-log-recovery-conflict-waits"/> parameter. > + </para> > > It would be better to use 'WAL replay' instead of 'the startup > process' for consistency with circumjacent descriptions. What do you > think? Agree that the wording is more appropriate. > > --- > @@ -1260,6 +1262,8 @@ ProcSleep(LOCALLOCK *locallock, LockMethod > lockMethodTable) > else > enable_timeout_after(DEADLOCK_TIMEOUT, DeadlockTimeout); > } > + else > + standbyWaitStart = GetCurrentTimestamp(); > > I think we can add a check of log_recovery_conflict_waits to avoid > unnecessary calling of GetCurrentTimestamp(). > > I've attached the updated version patch including the above comments > as well as adding some assertions. Please review it. > That looks all good to me. Thanks a lot for your precious help! Bertrand
On 2020/11/17 17:23, Drouvot, Bertrand wrote: > > On 11/17/20 2:09 AM, Masahiko Sawada wrote: >> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you canconfirm the sender and know the content is safe. >> >> >> >> On Mon, Nov 16, 2020 at 4:55 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote: >>> Hi, >>> >>> On 11/16/20 6:44 AM, Masahiko Sawada wrote: >>>> Thank you for updating the patch. >>>> >>>> Here are review comments. >>>> >>>> + if (report_waiting && (!logged_recovery_conflict || >>>> new_status == NULL)) >>>> + ts = GetCurrentTimestamp(); >>>> >>>> The condition will always be true if log_recovery_conflict_wait is >>>> false and report_waiting is true, leading to unnecessary calling of >>>> GetCurrentTimestamp(). >>>> >>>> --- >>>> + <para> >>>> + You can control whether a log message is produced when the startup process >>>> + is waiting longer than <varname>deadlock_timeout</varname> for recovery >>>> + conflicts. This is controled by the <xref >>>> linkend="guc-log-recovery-conflict-waits"/> >>>> + parameter. >>>> + </para> >>>> >>>> s/controled/controlled/ >>>> >>>> --- >>>> if (report_waiting) >>>> waitStart = GetCurrentTimestamp(); >>>> >>>> Similarly, we have the above code but we don't need to call >>>> GetCurrentTimestamp() if update_process_title is false, even if >>>> report_waiting is true. >>>> >>>> I've attached the patch that fixes the above comments. It can be >>>> applied on top of your v8 patch. >>> Thanks for the review and the associated fixes! >>> >>> I've attached a new version that contains your fixes. >>> >> Thank you for updating the patch. >> >> I have other comments: >> >> + <para> >> + You can control whether a log message is produced when the startup process >> + is waiting longer than <varname>deadlock_timeout</varname> for recovery >> + conflicts. This is controlled by the >> + <xref linkend="guc-log-recovery-conflict-waits"/> parameter. >> + </para> >> >> It would be better to use 'WAL replay' instead of 'the startup >> process' for consistency with circumjacent descriptions. What do you >> think? > > Agree that the wording is more appropriate. > >> >> --- >> @@ -1260,6 +1262,8 @@ ProcSleep(LOCALLOCK *locallock, LockMethod >> lockMethodTable) >> else >> enable_timeout_after(DEADLOCK_TIMEOUT, DeadlockTimeout); >> } >> + else >> + standbyWaitStart = GetCurrentTimestamp(); >> >> I think we can add a check of log_recovery_conflict_waits to avoid >> unnecessary calling of GetCurrentTimestamp(). >> >> I've attached the updated version patch including the above comments >> as well as adding some assertions. Please review it. >> > That looks all good to me. > > Thanks a lot for your precious help! Thanks for updating the patch! Here are review comments. + Controls whether a log message is produced when the startup process + is waiting longer than <varname>deadlock_timeout</varname> + for recovery conflicts. But a log message can be produced also when the backend is waiting for recovery conflict. Right? If yes, this description needs to be corrected. + for recovery conflicts. This is useful in determining if recovery + conflicts prevents the recovery from applying WAL. "prevents" should be "prevent"? + TimestampDifference(waitStart, GetCurrentTimestamp(), &secs, &usecs); + msecs = secs * 1000 + usecs / 1000; GetCurrentTimestamp() is basically called before LogRecoveryConflict() is called. So isn't it better to avoid calling GetCurrentTimestamp() newly in LogRecoveryConflict() and to reuse the timestamp that we got? It's helpful to avoid the waste of cycles. + while (VirtualTransactionIdIsValid(*vxids)) + { + PGPROC *proc = BackendIdGetProc(vxids->backendId); BackendIdGetProc() can return NULL if the backend is not active at that moment. This case should be handled. + case PROCSIG_RECOVERY_CONFLICT_BUFFERPIN: + reasonDesc = gettext_noop("recovery is still waiting recovery conflict on buffer pin"); It's natural to use "waiting for recovery" rather than "waiting recovery"? + /* Also, set deadlock timeout for logging purpose if necessary */ + if (log_recovery_conflict_waits) + { + timeouts[cnt].id = STANDBY_TIMEOUT; + timeouts[cnt].type = TMPARAM_AFTER; + timeouts[cnt].delay_ms = DeadlockTimeout; + cnt++; + } This needs to be executed only when the message has not been logged yet. Right? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Hi, On 11/17/20 4:44 PM, Fujii Masao wrote: > > Thanks for updating the patch! Here are review comments. > > + Controls whether a log message is produced when the startup > process > + is waiting longer than <varname>deadlock_timeout</varname> > + for recovery conflicts. > > But a log message can be produced also when the backend is waiting > for recovery conflict. Right? If yes, this description needs to be > corrected. Thanks for looking at it! I don't think so, only the startup process should write those new log messages. What makes you think that would not be the case? > > > + for recovery conflicts. This is useful in determining if > recovery > + conflicts prevents the recovery from applying WAL. > > "prevents" should be "prevent"? Indeed: fixed in the new attached patch. > > > + TimestampDifference(waitStart, GetCurrentTimestamp(), &secs, > &usecs); > + msecs = secs * 1000 + usecs / 1000; > > GetCurrentTimestamp() is basically called before LogRecoveryConflict() > is called. So isn't it better to avoid calling GetCurrentTimestamp() > newly in > LogRecoveryConflict() and to reuse the timestamp that we got? > It's helpful to avoid the waste of cycles. > good catch! fixed in the new attached patch. > > + while (VirtualTransactionIdIsValid(*vxids)) > + { > + PGPROC *proc = > BackendIdGetProc(vxids->backendId); > > BackendIdGetProc() can return NULL if the backend is not active > at that moment. This case should be handled. > handled in the new attached patch. > > + case PROCSIG_RECOVERY_CONFLICT_BUFFERPIN: > + reasonDesc = gettext_noop("recovery is still > waiting recovery conflict on buffer pin"); > > It's natural to use "waiting for recovery" rather than "waiting > recovery"? > I would be tempted to say so, the new patch makes use of "waiting for". > > + /* Also, set deadlock timeout for logging purpose if > necessary */ > + if (log_recovery_conflict_waits) > + { > + timeouts[cnt].id = STANDBY_TIMEOUT; > + timeouts[cnt].type = TMPARAM_AFTER; > + timeouts[cnt].delay_ms = DeadlockTimeout; > + cnt++; > + } > > This needs to be executed only when the message has not been logged yet. > Right? > good catch: fixed in the new attached patch. Bertrand
Attachment
On Fri, Nov 20, 2020 at 6:18 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > > Hi, > > On 11/17/20 4:44 PM, Fujii Masao wrote: > > > > Thanks for updating the patch! Here are review comments. > > > > + Controls whether a log message is produced when the startup > > process > > + is waiting longer than <varname>deadlock_timeout</varname> > > + for recovery conflicts. > > > > But a log message can be produced also when the backend is waiting > > for recovery conflict. Right? If yes, this description needs to be > > corrected. > > Thanks for looking at it! > > I don't think so, only the startup process should write those new log > messages. > > What makes you think that would not be the case? > > > > > > > + for recovery conflicts. This is useful in determining if > > recovery > > + conflicts prevents the recovery from applying WAL. > > > > "prevents" should be "prevent"? > > Indeed: fixed in the new attached patch. > > > > > > > + TimestampDifference(waitStart, GetCurrentTimestamp(), &secs, > > &usecs); > > + msecs = secs * 1000 + usecs / 1000; > > > > GetCurrentTimestamp() is basically called before LogRecoveryConflict() > > is called. So isn't it better to avoid calling GetCurrentTimestamp() > > newly in > > LogRecoveryConflict() and to reuse the timestamp that we got? > > It's helpful to avoid the waste of cycles. > > > good catch! fixed in the new attached patch. > > > > > + while (VirtualTransactionIdIsValid(*vxids)) > > + { > > + PGPROC *proc = > > BackendIdGetProc(vxids->backendId); > > > > BackendIdGetProc() can return NULL if the backend is not active > > at that moment. This case should be handled. > > > handled in the new attached patch. > > > > + case PROCSIG_RECOVERY_CONFLICT_BUFFERPIN: > > + reasonDesc = gettext_noop("recovery is still > > waiting recovery conflict on buffer pin"); > > > > It's natural to use "waiting for recovery" rather than "waiting > > recovery"? > > > I would be tempted to say so, the new patch makes use of "waiting for". > > > > + /* Also, set deadlock timeout for logging purpose if > > necessary */ > > + if (log_recovery_conflict_waits) > > + { > > + timeouts[cnt].id = STANDBY_TIMEOUT; > > + timeouts[cnt].type = TMPARAM_AFTER; > > + timeouts[cnt].delay_ms = DeadlockTimeout; > > + cnt++; > > + } > > > > This needs to be executed only when the message has not been logged yet. > > Right? > > > good catch: fixed in the new attached patch. > Thank you for updating the patch! Here are review comments on the latest version patch. + while (VirtualTransactionIdIsValid(*vxids)) + { + PGPROC *proc = BackendIdGetProc(vxids->backendId); + + if (proc) + { + if (nprocs == 0) + appendStringInfo(&buf, "%d", proc->pid); + else + appendStringInfo(&buf, ", %d", proc->pid); + + nprocs++; + vxids++; + } + } We need to increment vxids even if *proc is null. Otherwise, the loop won't end. --- + TimestampTz cur_ts = GetCurrentTimestamp();; There is an extra semi-colon. --- int max_standby_streaming_delay = 30 * 1000; +bool log_recovery_conflict_waits = false; +bool logged_lock_conflict = false; + if (log_recovery_conflict_waits && !logged_lock_conflict) + { + timeouts[cnt].id = STANDBY_TIMEOUT; + timeouts[cnt].type = TMPARAM_AFTER; + timeouts[cnt].delay_ms = DeadlockTimeout; + cnt++; + } Can we pass a bool indicating if a timeout may be needed for recovery conflict logging from ProcSleep() to ResolveRecoveryConflictWithLock() instead of using a static variable? Regards, -- Masahiko Sawada EnterpriseDB: https://www.enterprisedb.com/
Hi, On 11/25/20 2:20 PM, Masahiko Sawada wrote: > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you canconfirm the sender and know the content is safe. > > > > On Fri, Nov 20, 2020 at 6:18 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote: >> Hi, >> >> On 11/17/20 4:44 PM, Fujii Masao wrote: >>> Thanks for updating the patch! Here are review comments. >>> >>> + Controls whether a log message is produced when the startup >>> process >>> + is waiting longer than <varname>deadlock_timeout</varname> >>> + for recovery conflicts. >>> >>> But a log message can be produced also when the backend is waiting >>> for recovery conflict. Right? If yes, this description needs to be >>> corrected. >> Thanks for looking at it! >> >> I don't think so, only the startup process should write those new log >> messages. >> >> What makes you think that would not be the case? >> >>> >>> + for recovery conflicts. This is useful in determining if >>> recovery >>> + conflicts prevents the recovery from applying WAL. >>> >>> "prevents" should be "prevent"? >> Indeed: fixed in the new attached patch. >> >>> >>> + TimestampDifference(waitStart, GetCurrentTimestamp(), &secs, >>> &usecs); >>> + msecs = secs * 1000 + usecs / 1000; >>> >>> GetCurrentTimestamp() is basically called before LogRecoveryConflict() >>> is called. So isn't it better to avoid calling GetCurrentTimestamp() >>> newly in >>> LogRecoveryConflict() and to reuse the timestamp that we got? >>> It's helpful to avoid the waste of cycles. >>> >> good catch! fixed in the new attached patch. >> >>> + while (VirtualTransactionIdIsValid(*vxids)) >>> + { >>> + PGPROC *proc = >>> BackendIdGetProc(vxids->backendId); >>> >>> BackendIdGetProc() can return NULL if the backend is not active >>> at that moment. This case should be handled. >>> >> handled in the new attached patch. >>> + case PROCSIG_RECOVERY_CONFLICT_BUFFERPIN: >>> + reasonDesc = gettext_noop("recovery is still >>> waiting recovery conflict on buffer pin"); >>> >>> It's natural to use "waiting for recovery" rather than "waiting >>> recovery"? >>> >> I would be tempted to say so, the new patch makes use of "waiting for". >>> + /* Also, set deadlock timeout for logging purpose if >>> necessary */ >>> + if (log_recovery_conflict_waits) >>> + { >>> + timeouts[cnt].id = STANDBY_TIMEOUT; >>> + timeouts[cnt].type = TMPARAM_AFTER; >>> + timeouts[cnt].delay_ms = DeadlockTimeout; >>> + cnt++; >>> + } >>> >>> This needs to be executed only when the message has not been logged yet. >>> Right? >>> >> good catch: fixed in the new attached patch. >> > Thank you for updating the patch! Here are review comments on the > latest version patch. > > + while (VirtualTransactionIdIsValid(*vxids)) > + { > + PGPROC *proc = BackendIdGetProc(vxids->backendId); > + > + if (proc) > + { > + if (nprocs == 0) > + appendStringInfo(&buf, "%d", proc->pid); > + else > + appendStringInfo(&buf, ", %d", proc->pid); > + > + nprocs++; > + vxids++; > + } > + } > > We need to increment vxids even if *proc is null. Otherwise, the loop won't end. My bad, that's fixed. > > --- > + TimestampTz cur_ts = GetCurrentTimestamp();; Fixed > > There is an extra semi-colon. > > --- > int max_standby_streaming_delay = 30 * 1000; > +bool log_recovery_conflict_waits = false; > +bool logged_lock_conflict = false; > > > + if (log_recovery_conflict_waits && !logged_lock_conflict) > + { > + timeouts[cnt].id = STANDBY_TIMEOUT; > + timeouts[cnt].type = TMPARAM_AFTER; > + timeouts[cnt].delay_ms = DeadlockTimeout; > + cnt++; > + } > > Can we pass a bool indicating if a timeout may be needed for recovery > conflict logging from ProcSleep() to ResolveRecoveryConflictWithLock() > instead of using a static variable? Yeah that makes more sense, specially as we already have logged_recovery_conflict at our disposal. New patch version attached. Thanks! Bertrand
Attachment
On Thu, Nov 26, 2020 at 12:49 AM Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > > Hi, > > On 11/25/20 2:20 PM, Masahiko Sawada wrote: > > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you canconfirm the sender and know the content is safe. > > > > > > > > On Fri, Nov 20, 2020 at 6:18 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > >> Hi, > >> > >> On 11/17/20 4:44 PM, Fujii Masao wrote: > >>> Thanks for updating the patch! Here are review comments. > >>> > >>> + Controls whether a log message is produced when the startup > >>> process > >>> + is waiting longer than <varname>deadlock_timeout</varname> > >>> + for recovery conflicts. > >>> > >>> But a log message can be produced also when the backend is waiting > >>> for recovery conflict. Right? If yes, this description needs to be > >>> corrected. > >> Thanks for looking at it! > >> > >> I don't think so, only the startup process should write those new log > >> messages. > >> > >> What makes you think that would not be the case? > >> > >>> > >>> + for recovery conflicts. This is useful in determining if > >>> recovery > >>> + conflicts prevents the recovery from applying WAL. > >>> > >>> "prevents" should be "prevent"? > >> Indeed: fixed in the new attached patch. > >> > >>> > >>> + TimestampDifference(waitStart, GetCurrentTimestamp(), &secs, > >>> &usecs); > >>> + msecs = secs * 1000 + usecs / 1000; > >>> > >>> GetCurrentTimestamp() is basically called before LogRecoveryConflict() > >>> is called. So isn't it better to avoid calling GetCurrentTimestamp() > >>> newly in > >>> LogRecoveryConflict() and to reuse the timestamp that we got? > >>> It's helpful to avoid the waste of cycles. > >>> > >> good catch! fixed in the new attached patch. > >> > >>> + while (VirtualTransactionIdIsValid(*vxids)) > >>> + { > >>> + PGPROC *proc = > >>> BackendIdGetProc(vxids->backendId); > >>> > >>> BackendIdGetProc() can return NULL if the backend is not active > >>> at that moment. This case should be handled. > >>> > >> handled in the new attached patch. > >>> + case PROCSIG_RECOVERY_CONFLICT_BUFFERPIN: > >>> + reasonDesc = gettext_noop("recovery is still > >>> waiting recovery conflict on buffer pin"); > >>> > >>> It's natural to use "waiting for recovery" rather than "waiting > >>> recovery"? > >>> > >> I would be tempted to say so, the new patch makes use of "waiting for". > >>> + /* Also, set deadlock timeout for logging purpose if > >>> necessary */ > >>> + if (log_recovery_conflict_waits) > >>> + { > >>> + timeouts[cnt].id = STANDBY_TIMEOUT; > >>> + timeouts[cnt].type = TMPARAM_AFTER; > >>> + timeouts[cnt].delay_ms = DeadlockTimeout; > >>> + cnt++; > >>> + } > >>> > >>> This needs to be executed only when the message has not been logged yet. > >>> Right? > >>> > >> good catch: fixed in the new attached patch. > >> > > Thank you for updating the patch! Here are review comments on the > > latest version patch. > > > > + while (VirtualTransactionIdIsValid(*vxids)) > > + { > > + PGPROC *proc = BackendIdGetProc(vxids->backendId); > > + > > + if (proc) > > + { > > + if (nprocs == 0) > > + appendStringInfo(&buf, "%d", proc->pid); > > + else > > + appendStringInfo(&buf, ", %d", proc->pid); > > + > > + nprocs++; > > + vxids++; > > + } > > + } > > > > We need to increment vxids even if *proc is null. Otherwise, the loop won't end. > > My bad, that's fixed. > > > > > --- > > + TimestampTz cur_ts = GetCurrentTimestamp();; > Fixed > > > > There is an extra semi-colon. > > > > --- > > int max_standby_streaming_delay = 30 * 1000; > > +bool log_recovery_conflict_waits = false; > > +bool logged_lock_conflict = false; > > > > > > + if (log_recovery_conflict_waits && !logged_lock_conflict) > > + { > > + timeouts[cnt].id = STANDBY_TIMEOUT; > > + timeouts[cnt].type = TMPARAM_AFTER; > > + timeouts[cnt].delay_ms = DeadlockTimeout; > > + cnt++; > > + } > > > > Can we pass a bool indicating if a timeout may be needed for recovery > > conflict logging from ProcSleep() to ResolveRecoveryConflictWithLock() > > instead of using a static variable? > > Yeah that makes more sense, specially as we already have > logged_recovery_conflict at our disposal. > > New patch version attached. > Thank you for updating the patch! The patch works fine and looks good to me except for the following small comments: +/* + * Log the recovery conflict. + * + * waitStart is the timestamp when the caller started to wait. This function also + * reports the details about the conflicting process ids if *waitlist is not NULL. + */ +void +LogRecoveryConflict(ProcSignalReason reason, TimestampTz waitStart, + TimestampTz cur_ts, VirtualTransactionId *waitlist) I think it's better to explain cur_ts as well in the function comment. Regarding the function arguments, 'waitStart' is camel case whereas 'cur_ts' is snake case and 'waitlist' is using only lower cases. I think we should unify them. --- -ResolveRecoveryConflictWithLock(LOCKTAG locktag) +ResolveRecoveryConflictWithLock(LOCKTAG locktag, bool logged_recovery_conflict) The function argument name 'logged_recovery_conflict' sounds a bit redundant to me as this function is used only for recovery conflict resolution. How about 'need_log' or something? Also it’s better to explain it in the function comment. Regards, -- Masahiko Sawada EnterpriseDB: https://www.enterprisedb.com/
Hi, On 11/27/20 6:04 AM, Masahiko Sawada wrote: > Thank you for updating the patch! The patch works fine and looks good > to me except for the following small comments: > > +/* > + * Log the recovery conflict. > + * > + * waitStart is the timestamp when the caller started to wait. This > function also > + * reports the details about the conflicting process ids if *waitlist > is not NULL. > + */ > +void > +LogRecoveryConflict(ProcSignalReason reason, TimestampTz waitStart, > + TimestampTz cur_ts, > VirtualTransactionId *waitlist) > > I think it's better to explain cur_ts as well in the function comment. > > Regarding the function arguments, 'waitStart' is camel case whereas > 'cur_ts' is snake case and 'waitlist' is using only lower cases. I > think we should unify them. > > --- > -ResolveRecoveryConflictWithLock(LOCKTAG locktag) > +ResolveRecoveryConflictWithLock(LOCKTAG locktag, bool logged_recovery_conflict) > > The function argument name 'logged_recovery_conflict' sounds a bit > redundant to me as this function is used only for recovery conflict > resolution. How about 'need_log' or something? Also it’s better to > explain it in the function comment. Thanks for reviewing! I have addressed your comments in the new attached version. Thanks Bertrand
Attachment
On 2020-Nov-27, Drouvot, Bertrand wrote: > + if (nprocs > 0) > + { > + ereport(LOG, > + (errmsg("%s after %ld.%03d ms", > + get_recovery_conflict_desc(reason), msecs, usecs), > + (errdetail_log_plural("Conflicting process: %s.", > + "Conflicting processes: %s.", > + nprocs, buf.data)))); > + } > +/* Return the description of recovery conflict */ > +static const char * > +get_recovery_conflict_desc(ProcSignalReason reason) > +{ > + const char *reasonDesc = gettext_noop("unknown reason"); > + > + switch (reason) > + { > + case PROCSIG_RECOVERY_CONFLICT_BUFFERPIN: > + reasonDesc = gettext_noop("recovery is still waiting for recovery conflict on buffer pin"); > + break; This doesn't work from a translation point of view. First, you're building a sentence from parts, which is against policy. Second, you're not actually invoking gettext to translate the string returned by get_recovery_conflict_desc. I think this needs to be rethought. To handle the first problem I suggest to split the error message in two. One phrase is the complain that recovery is waiting, and the other string is the reason for the wait. Separate both either by splitting with a :, or alternatively put the other sentence in DETAIL. (The latter would require to mix with the list of conflicting processes, which might be hard.) The first idea would look like this: LOG: recovery still waiting after %ld.03d ms: for recovery conflict on buffer pin DETAIL: Conflicting processes: 1, 2, 3. To achieve this, apart from editing the messages returned by get_recovery_conflict_desc, you'd need to ereport this way: ereport(LOG, errmsg("recovery still waiting after %ld.%03d ms: %s", msecs, usecs, _(get_recovery_conflict_desc(reason))), errdetail_log_plural("Conflicting process: %s.", "Conflicting processes: %s.",
Hi, On 11/27/20 2:40 PM, Alvaro Herrera wrote: > On 2020-Nov-27, Drouvot, Bertrand wrote: > >> + if (nprocs > 0) >> + { >> + ereport(LOG, >> + (errmsg("%s after %ld.%03d ms", >> + get_recovery_conflict_desc(reason), msecs, usecs), >> + (errdetail_log_plural("Conflicting process: %s.", >> + "Conflicting processes: %s.", >> + nprocs, buf.data)))); >> + } >> +/* Return the description of recovery conflict */ >> +static const char * >> +get_recovery_conflict_desc(ProcSignalReason reason) >> +{ >> + const char *reasonDesc = gettext_noop("unknown reason"); >> + >> + switch (reason) >> + { >> + case PROCSIG_RECOVERY_CONFLICT_BUFFERPIN: >> + reasonDesc = gettext_noop("recovery is still waiting for recovery conflict on buffer pin"); >> + break; > This doesn't work from a translation point of view. First, you're > building a sentence from parts, which is against policy. Second, you're > not actually invoking gettext to translate the string returned by > get_recovery_conflict_desc. > > I think this needs to be rethought. To handle the first problem I > suggest to split the error message in two. One phrase is the complain > that recovery is waiting, and the other string is the reason for the > wait. Separate both either by splitting with a :, or alternatively put > the other sentence in DETAIL. (The latter would require to mix with the > list of conflicting processes, which might be hard.) > > The first idea would look like this: > > LOG: recovery still waiting after %ld.03d ms: for recovery conflict on buffer pin > DETAIL: Conflicting processes: 1, 2, 3. > > To achieve this, apart from editing the messages returned by > get_recovery_conflict_desc, you'd need to ereport this way: > > ereport(LOG, > errmsg("recovery still waiting after %ld.%03d ms: %s", > msecs, usecs, _(get_recovery_conflict_desc(reason))), > errdetail_log_plural("Conflicting process: %s.", > "Conflicting processes: %s.", > Thanks for your comments, I did not know that. I've attached a new version that takes your comments into account. Thanks! Bertrand
Attachment
Hi Bertrand, On 2020-Nov-28, Drouvot, Bertrand wrote: > + if (nprocs > 0) > + { > + ereport(LOG, > + (errmsg("recovery still waiting after %ld.%03d ms: %s", > + msecs, usecs, _(get_recovery_conflict_desc(reason))), > + (errdetail_log_plural("Conflicting process: %s.", > + "Conflicting processes: %s.", > + nprocs, buf.data)))); > + } > + else > + { > + ereport(LOG, > + (errmsg("recovery still waiting after %ld.%03d ms: %s", > + msecs, usecs, _(get_recovery_conflict_desc(reason))))); > + } > + > + pfree(buf.data); > + } > + else > + ereport(LOG, > + (errmsg("recovery still waiting after %ld.%03d ms: %s", > + msecs, usecs, _(get_recovery_conflict_desc(reason))))); > +} Another trivial stylistic point is that you can collapse all these ereport calls into one, with something like ereport(LOG, errmsg("recovery still waiting after ...", opts), waitlist != NULL ? errdetail_log_plural("foo bar baz", opts) : 0); where the "waitlist" has been constructed beforehand, or is set to NULL if there's no process list. > + switch (reason) > + { > + case PROCSIG_RECOVERY_CONFLICT_BUFFERPIN: > + reasonDesc = gettext_noop("for recovery conflict on buffer pin"); > + break; Pure bikeshedding after discussing this with my pillow: I think I'd get rid of the initial "for" in these messages.
Hi Alvaro, On 11/28/20 6:36 PM, Alvaro Herrera wrote: > Hi Bertrand, > > On 2020-Nov-28, Drouvot, Bertrand wrote: > >> + if (nprocs > 0) >> + { >> + ereport(LOG, >> + (errmsg("recovery still waiting after %ld.%03d ms: %s", >> + msecs, usecs, _(get_recovery_conflict_desc(reason))), >> + (errdetail_log_plural("Conflicting process: %s.", >> + "Conflicting processes: %s.", >> + nprocs, buf.data)))); >> + } >> + else >> + { >> + ereport(LOG, >> + (errmsg("recovery still waiting after %ld.%03d ms: %s", >> + msecs, usecs, _(get_recovery_conflict_desc(reason))))); >> + } >> + >> + pfree(buf.data); >> + } >> + else >> + ereport(LOG, >> + (errmsg("recovery still waiting after %ld.%03d ms: %s", >> + msecs, usecs, _(get_recovery_conflict_desc(reason))))); >> +} > Another trivial stylistic point is that you can collapse all these > ereport calls into one, with something like > > ereport(LOG, > errmsg("recovery still waiting after ...", opts), > waitlist != NULL ? errdetail_log_plural("foo bar baz", opts) : 0); > > where the "waitlist" has been constructed beforehand, or is set to NULL > if there's no process list. Nice! > >> + switch (reason) >> + { >> + case PROCSIG_RECOVERY_CONFLICT_BUFFERPIN: >> + reasonDesc = gettext_noop("for recovery conflict on buffer pin"); >> + break; > Pure bikeshedding after discussing this with my pillow: I think I'd get > rid of the initial "for" in these messages. both comments implemented in the new attached version. Thanks! Bertrand
Attachment
On Sun, Nov 29, 2020 at 3:47 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > > Hi Alvaro, > > On 11/28/20 6:36 PM, Alvaro Herrera wrote: > > Hi Bertrand, > > > > On 2020-Nov-28, Drouvot, Bertrand wrote: > > > >> + if (nprocs > 0) > >> + { > >> + ereport(LOG, > >> + (errmsg("recovery still waiting after %ld.%03d ms: %s", > >> + msecs, usecs, _(get_recovery_conflict_desc(reason))), > >> + (errdetail_log_plural("Conflicting process: %s.", > >> + "Conflicting processes: %s.", > >> + nprocs, buf.data)))); > >> + } > >> + else > >> + { > >> + ereport(LOG, > >> + (errmsg("recovery still waiting after %ld.%03d ms: %s", > >> + msecs, usecs, _(get_recovery_conflict_desc(reason))))); > >> + } > >> + > >> + pfree(buf.data); > >> + } > >> + else > >> + ereport(LOG, > >> + (errmsg("recovery still waiting after %ld.%03d ms: %s", > >> + msecs, usecs, _(get_recovery_conflict_desc(reason))))); > >> +} > > Another trivial stylistic point is that you can collapse all these > > ereport calls into one, with something like > > > > ereport(LOG, > > errmsg("recovery still waiting after ...", opts), > > waitlist != NULL ? errdetail_log_plural("foo bar baz", opts) : 0); > > > > where the "waitlist" has been constructed beforehand, or is set to NULL > > if there's no process list. > > Nice! > > > > >> + switch (reason) > >> + { > >> + case PROCSIG_RECOVERY_CONFLICT_BUFFERPIN: > >> + reasonDesc = gettext_noop("for recovery conflict on buffer pin"); > >> + break; > > Pure bikeshedding after discussing this with my pillow: I think I'd get > > rid of the initial "for" in these messages. > > both comments implemented in the new attached version. > Thank you for updating the patch! + /* Also, set deadlock timeout for logging purpose if necessary */ + if (log_recovery_conflict_waits && !need_log) + { + timeouts[cnt].id = STANDBY_TIMEOUT; + timeouts[cnt].type = TMPARAM_AFTER; + timeouts[cnt].delay_ms = DeadlockTimeout; + cnt++; + } You changed to 'need_log' but this condition seems not correct. I think we need to set this timeout when both log_recovery_conflict and need_log is true. The rest of the patch looks good to me. Regards, -- Masahiko Sawada EnterpriseDB: https://www.enterprisedb.com/
Hi, On 11/30/20 4:41 AM, Masahiko Sawada wrote: > On Sun, Nov 29, 2020 at 3:47 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote: >> Hi Alvaro, >> >> On 11/28/20 6:36 PM, Alvaro Herrera wrote: >>> Hi Bertrand, >>> >>> On 2020-Nov-28, Drouvot, Bertrand wrote: >>> >>>> + if (nprocs > 0) >>>> + { >>>> + ereport(LOG, >>>> + (errmsg("recovery still waiting after %ld.%03d ms: %s", >>>> + msecs, usecs, _(get_recovery_conflict_desc(reason))), >>>> + (errdetail_log_plural("Conflicting process: %s.", >>>> + "Conflicting processes: %s.", >>>> + nprocs, buf.data)))); >>>> + } >>>> + else >>>> + { >>>> + ereport(LOG, >>>> + (errmsg("recovery still waiting after %ld.%03d ms: %s", >>>> + msecs, usecs, _(get_recovery_conflict_desc(reason))))); >>>> + } >>>> + >>>> + pfree(buf.data); >>>> + } >>>> + else >>>> + ereport(LOG, >>>> + (errmsg("recovery still waiting after %ld.%03d ms: %s", >>>> + msecs, usecs, _(get_recovery_conflict_desc(reason))))); >>>> +} >>> Another trivial stylistic point is that you can collapse all these >>> ereport calls into one, with something like >>> >>> ereport(LOG, >>> errmsg("recovery still waiting after ...", opts), >>> waitlist != NULL ? errdetail_log_plural("foo bar baz", opts) : 0); >>> >>> where the "waitlist" has been constructed beforehand, or is set to NULL >>> if there's no process list. >> Nice! >> >>>> + switch (reason) >>>> + { >>>> + case PROCSIG_RECOVERY_CONFLICT_BUFFERPIN: >>>> + reasonDesc = gettext_noop("for recovery conflict on buffer pin"); >>>> + break; >>> Pure bikeshedding after discussing this with my pillow: I think I'd get >>> rid of the initial "for" in these messages. >> both comments implemented in the new attached version. >> > Thank you for updating the patch! > > + /* Also, set deadlock timeout for logging purpose if necessary */ > + if (log_recovery_conflict_waits && !need_log) > + { > + timeouts[cnt].id = STANDBY_TIMEOUT; > + timeouts[cnt].type = TMPARAM_AFTER; > + timeouts[cnt].delay_ms = DeadlockTimeout; > + cnt++; > + } > > You changed to 'need_log' but this condition seems not correct. I > think we need to set this timeout when both log_recovery_conflict and > need_log is true. Nice catch! In fact it behaves correctly, that's jut the 'need_log' name that is miss leading: I renamed it to 'already_logged' in the new attached version. > The rest of the patch looks good to me. Great! Thanks Bertrand
Attachment
On Mon, Nov 30, 2020 at 3:46 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > > Hi, > > On 11/30/20 4:41 AM, Masahiko Sawada wrote: > > On Sun, Nov 29, 2020 at 3:47 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > >> Hi Alvaro, > >> > >> On 11/28/20 6:36 PM, Alvaro Herrera wrote: > >>> Hi Bertrand, > >>> > >>> On 2020-Nov-28, Drouvot, Bertrand wrote: > >>> > >>>> + if (nprocs > 0) > >>>> + { > >>>> + ereport(LOG, > >>>> + (errmsg("recovery still waiting after %ld.%03d ms: %s", > >>>> + msecs, usecs, _(get_recovery_conflict_desc(reason))), > >>>> + (errdetail_log_plural("Conflicting process: %s.", > >>>> + "Conflicting processes: %s.", > >>>> + nprocs, buf.data)))); > >>>> + } > >>>> + else > >>>> + { > >>>> + ereport(LOG, > >>>> + (errmsg("recovery still waiting after %ld.%03d ms: %s", > >>>> + msecs, usecs, _(get_recovery_conflict_desc(reason))))); > >>>> + } > >>>> + > >>>> + pfree(buf.data); > >>>> + } > >>>> + else > >>>> + ereport(LOG, > >>>> + (errmsg("recovery still waiting after %ld.%03d ms: %s", > >>>> + msecs, usecs, _(get_recovery_conflict_desc(reason))))); > >>>> +} > >>> Another trivial stylistic point is that you can collapse all these > >>> ereport calls into one, with something like > >>> > >>> ereport(LOG, > >>> errmsg("recovery still waiting after ...", opts), > >>> waitlist != NULL ? errdetail_log_plural("foo bar baz", opts) : 0); > >>> > >>> where the "waitlist" has been constructed beforehand, or is set to NULL > >>> if there's no process list. > >> Nice! > >> > >>>> + switch (reason) > >>>> + { > >>>> + case PROCSIG_RECOVERY_CONFLICT_BUFFERPIN: > >>>> + reasonDesc = gettext_noop("for recovery conflict on buffer pin"); > >>>> + break; > >>> Pure bikeshedding after discussing this with my pillow: I think I'd get > >>> rid of the initial "for" in these messages. > >> both comments implemented in the new attached version. > >> > > Thank you for updating the patch! > > > > + /* Also, set deadlock timeout for logging purpose if necessary */ > > + if (log_recovery_conflict_waits && !need_log) > > + { > > + timeouts[cnt].id = STANDBY_TIMEOUT; > > + timeouts[cnt].type = TMPARAM_AFTER; > > + timeouts[cnt].delay_ms = DeadlockTimeout; > > + cnt++; > > + } > > > > You changed to 'need_log' but this condition seems not correct. I > > think we need to set this timeout when both log_recovery_conflict and > > need_log is true. > > Nice catch! > > In fact it behaves correctly, that's jut the 'need_log' name that is > miss leading: I renamed it to 'already_logged' in the new attached version. > Thanks! I'd prefer 'need_log' because we can check it using the affirmative form in that condition, which would make the code more readable a bit. But I'd like to leave it to committers. I've marked this patch as "Ready for Committer". Regards, -- Masahiko Sawada EnterpriseDB: https://www.enterprisedb.com/
On 2020/11/20 18:17, Drouvot, Bertrand wrote: > Hi, > > On 11/17/20 4:44 PM, Fujii Masao wrote: >> >> Thanks for updating the patch! Here are review comments. >> >> + Controls whether a log message is produced when the startup process >> + is waiting longer than <varname>deadlock_timeout</varname> >> + for recovery conflicts. >> >> But a log message can be produced also when the backend is waiting >> for recovery conflict. Right? If yes, this description needs to be corrected. > > Thanks for looking at it! > > I don't think so, only the startup process should write those new log messages. > > What makes you think that would not be the case? Probably my mis-underding of the patch did that. Sorry for noise.. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On 2020/11/30 16:26, Masahiko Sawada wrote: > On Mon, Nov 30, 2020 at 3:46 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote: >> >> Hi, >> >> On 11/30/20 4:41 AM, Masahiko Sawada wrote: >>> On Sun, Nov 29, 2020 at 3:47 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote: >>>> Hi Alvaro, >>>> >>>> On 11/28/20 6:36 PM, Alvaro Herrera wrote: >>>>> Hi Bertrand, >>>>> >>>>> On 2020-Nov-28, Drouvot, Bertrand wrote: >>>>> >>>>>> + if (nprocs > 0) >>>>>> + { >>>>>> + ereport(LOG, >>>>>> + (errmsg("recovery still waiting after %ld.%03d ms: %s", >>>>>> + msecs, usecs, _(get_recovery_conflict_desc(reason))), >>>>>> + (errdetail_log_plural("Conflicting process: %s.", >>>>>> + "Conflicting processes: %s.", >>>>>> + nprocs, buf.data)))); >>>>>> + } >>>>>> + else >>>>>> + { >>>>>> + ereport(LOG, >>>>>> + (errmsg("recovery still waiting after %ld.%03d ms: %s", >>>>>> + msecs, usecs, _(get_recovery_conflict_desc(reason))))); >>>>>> + } >>>>>> + >>>>>> + pfree(buf.data); >>>>>> + } >>>>>> + else >>>>>> + ereport(LOG, >>>>>> + (errmsg("recovery still waiting after %ld.%03d ms: %s", >>>>>> + msecs, usecs, _(get_recovery_conflict_desc(reason))))); >>>>>> +} >>>>> Another trivial stylistic point is that you can collapse all these >>>>> ereport calls into one, with something like >>>>> >>>>> ereport(LOG, >>>>> errmsg("recovery still waiting after ...", opts), >>>>> waitlist != NULL ? errdetail_log_plural("foo bar baz", opts) : 0); >>>>> >>>>> where the "waitlist" has been constructed beforehand, or is set to NULL >>>>> if there's no process list. >>>> Nice! >>>> >>>>>> + switch (reason) >>>>>> + { >>>>>> + case PROCSIG_RECOVERY_CONFLICT_BUFFERPIN: >>>>>> + reasonDesc = gettext_noop("for recovery conflict on buffer pin"); >>>>>> + break; >>>>> Pure bikeshedding after discussing this with my pillow: I think I'd get >>>>> rid of the initial "for" in these messages. >>>> both comments implemented in the new attached version. >>>> >>> Thank you for updating the patch! >>> >>> + /* Also, set deadlock timeout for logging purpose if necessary */ >>> + if (log_recovery_conflict_waits && !need_log) >>> + { >>> + timeouts[cnt].id = STANDBY_TIMEOUT; >>> + timeouts[cnt].type = TMPARAM_AFTER; >>> + timeouts[cnt].delay_ms = DeadlockTimeout; >>> + cnt++; >>> + } >>> >>> You changed to 'need_log' but this condition seems not correct. I >>> think we need to set this timeout when both log_recovery_conflict and >>> need_log is true. >> >> Nice catch! >> >> In fact it behaves correctly, that's jut the 'need_log' name that is >> miss leading: I renamed it to 'already_logged' in the new attached version. >> > > Thanks! I'd prefer 'need_log' because we can check it using the > affirmative form in that condition, which would make the code more > readable a bit. But I'd like to leave it to committers. I've marked > this patch as "Ready for Committer". I'm still in the middle of the review, but please let me share my current review comments. + /* Set wait start timestamp if logging is enabled */ + if (log_recovery_conflict_waits) + waitStart = GetCurrentTimestamp(); This seems to cause even the primary server to call GetCurrentTimestamp() if logging is enabled. To avoid unnecessary GetCurrentTimestamp(), we should add "InHotStandby" into the if-condition? + initStringInfo(&buf); + + if (wait_list) Isn't it better to call initStringInfo() only when wait_list is not NULL? For example, we can update the code so that it's executed when nprocs == 0. + if (proc) + { + if (nprocs == 0) + appendStringInfo(&buf, "%d", proc->pid); + else + appendStringInfo(&buf, ", %d", proc->pid); + + nprocs++; What happens if all the backends in wait_list have gone? In other words, how should we handle the case where nprocs == 0 (i.e., nprocs has not been incrmented at all)? This would very rarely happen, but can happen. In this case, since buf.data is empty, at least there seems no need to log the list of conflicting processes in detail message. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On 2020-Dec-01, Fujii Masao wrote: > + if (proc) > + { > + if (nprocs == 0) > + appendStringInfo(&buf, "%d", proc->pid); > + else > + appendStringInfo(&buf, ", %d", proc->pid); > + > + nprocs++; > > What happens if all the backends in wait_list have gone? In other words, > how should we handle the case where nprocs == 0 (i.e., nprocs has not been > incrmented at all)? This would very rarely happen, but can happen. > In this case, since buf.data is empty, at least there seems no need to log > the list of conflicting processes in detail message. Yes, I noticed this too; this can be simplified by changing the condition in the ereport() call to be "nprocs > 0" (rather than wait_list being null), otherwise not print the errdetail. (You could test buf.data or buf.len instead, but that seems uglier to me.)
On Tue, Dec 1, 2020 at 3:25 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2020-Dec-01, Fujii Masao wrote: > > > + if (proc) > > + { > > + if (nprocs == 0) > > + appendStringInfo(&buf, "%d", proc->pid); > > + else > > + appendStringInfo(&buf, ", %d", proc->pid); > > + > > + nprocs++; > > > > What happens if all the backends in wait_list have gone? In other words, > > how should we handle the case where nprocs == 0 (i.e., nprocs has not been > > incrmented at all)? This would very rarely happen, but can happen. > > In this case, since buf.data is empty, at least there seems no need to log > > the list of conflicting processes in detail message. > > Yes, I noticed this too; this can be simplified by changing the > condition in the ereport() call to be "nprocs > 0" (rather than > wait_list being null), otherwise not print the errdetail. (You could > test buf.data or buf.len instead, but that seems uglier to me.) +1 Maybe we can also improve the comment of this function from: + * This function also reports the details about the conflicting + * process ids if *wait_list is not NULL. to " This function also reports the details about the conflicting process ids if exist" or something. Regards, -- Masahiko Sawada EnterpriseDB: https://www.enterprisedb.com/
Hi, On 12/1/20 12:35 AM, Masahiko Sawada wrote: > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you canconfirm the sender and know the content is safe. > > > > On Tue, Dec 1, 2020 at 3:25 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: >> On 2020-Dec-01, Fujii Masao wrote: >> >>> + if (proc) >>> + { >>> + if (nprocs == 0) >>> + appendStringInfo(&buf, "%d", proc->pid); >>> + else >>> + appendStringInfo(&buf, ", %d", proc->pid); >>> + >>> + nprocs++; >>> >>> What happens if all the backends in wait_list have gone? In other words, >>> how should we handle the case where nprocs == 0 (i.e., nprocs has not been >>> incrmented at all)? This would very rarely happen, but can happen. >>> In this case, since buf.data is empty, at least there seems no need to log >>> the list of conflicting processes in detail message. >> Yes, I noticed this too; this can be simplified by changing the >> condition in the ereport() call to be "nprocs > 0" (rather than >> wait_list being null), otherwise not print the errdetail. (You could >> test buf.data or buf.len instead, but that seems uglier to me.) > +1 > > Maybe we can also improve the comment of this function from: > > + * This function also reports the details about the conflicting > + * process ids if *wait_list is not NULL. > > to " This function also reports the details about the conflicting > process ids if exist" or something. > Thank you all for the review/remarks. They have been addressed in the new attached patch version. Thanks! Bertrand
Attachment
On 2020/12/01 17:29, Drouvot, Bertrand wrote: > Hi, > > On 12/1/20 12:35 AM, Masahiko Sawada wrote: >> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you canconfirm the sender and know the content is safe. >> >> >> >> On Tue, Dec 1, 2020 at 3:25 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: >>> On 2020-Dec-01, Fujii Masao wrote: >>> >>>> + if (proc) >>>> + { >>>> + if (nprocs == 0) >>>> + appendStringInfo(&buf, "%d", proc->pid); >>>> + else >>>> + appendStringInfo(&buf, ", %d", proc->pid); >>>> + >>>> + nprocs++; >>>> >>>> What happens if all the backends in wait_list have gone? In other words, >>>> how should we handle the case where nprocs == 0 (i.e., nprocs has not been >>>> incrmented at all)? This would very rarely happen, but can happen. >>>> In this case, since buf.data is empty, at least there seems no need to log >>>> the list of conflicting processes in detail message. >>> Yes, I noticed this too; this can be simplified by changing the >>> condition in the ereport() call to be "nprocs > 0" (rather than >>> wait_list being null), otherwise not print the errdetail. (You could >>> test buf.data or buf.len instead, but that seems uglier to me.) >> +1 >> >> Maybe we can also improve the comment of this function from: >> >> + * This function also reports the details about the conflicting >> + * process ids if *wait_list is not NULL. >> >> to " This function also reports the details about the conflicting >> process ids if exist" or something. >> > Thank you all for the review/remarks. > > They have been addressed in the new attached patch version. Thanks for updating the patch! I read through the patch again and applied the following chages to it. Attached is the updated version of the patch. Could you review this version? If there is no issue in it, I'm thinking to commit this version. + if (waitStart > 0 && !logged_recovery_conflict) + { + TimestampTz cur_ts = GetCurrentTimestamp(); + if (TimestampDifferenceExceeds(waitStart, cur_ts, + DeadlockTimeout)) On the first time through, this is executed before we have started actually waiting. Which is a bit wasteful. So I changed LockBufferForCleanup() and ResolveRecoveryConflictWithVirtualXIDs() so that the code for logging the recovery conflict is executed after the function to wait is executed. + ereport(LOG, + errmsg("recovery still waiting after %ld.%03d ms: %s", + msecs, usecs, _(get_recovery_conflict_desc(reason))), + wait_list > 0 ? errdetail_log_plural("Conflicting process: %s.", + "Conflicting processes: %s.", + nprocs, buf.data) : 0); Seems "wait_list > 0" should be "nprocs > 0". So I changed the code that way. + if (waitStart > 0) { - const char *old_status; I added "(!logged_recovery_conflict || new_status == NULL)" into the above if-condition, to avoid executing again the code for logging after PS title was updated and the recovery conflict was logged. + timeouts[cnt].id = STANDBY_TIMEOUT; + timeouts[cnt].type = TMPARAM_AFTER; + timeouts[cnt].delay_ms = DeadlockTimeout; Maybe STANDBY_TIMEOUT should be STANDBY_DEADLOCK_TIMEOUT here? I changed the code that way. + /* + * Log the recovery conflict if there is still virtual transaction + * conflicting with the lock. + */ + if (cnt > 0) + { + LogRecoveryConflict(PROCSIG_RECOVERY_CONFLICT_LOCK, + standbyWaitStart, cur_ts, vxids); + logged_recovery_conflict = true; + } I think that ProcSleep() should log the recovery conflict even if there are no conflicting virtual transactions. Because the startup process there has already waited longer than deadlock_timeout, whether conflicting virtual transactions are still running or not. Also LogRecoveryConflict() logs the recovery conflict even if it finds that there are no conflicting active backends. So the rule about whether to log the conflict when there are no conflicting backends should be made consistent between functions, I think. Thought? Also I added more comments. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Attachment
On Fri, Dec 4, 2020 at 2:54 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > > > > On 2020/12/01 17:29, Drouvot, Bertrand wrote: > > Hi, > > > > On 12/1/20 12:35 AM, Masahiko Sawada wrote: > >> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you canconfirm the sender and know the content is safe. > >> > >> > >> > >> On Tue, Dec 1, 2020 at 3:25 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > >>> On 2020-Dec-01, Fujii Masao wrote: > >>> > >>>> + if (proc) > >>>> + { > >>>> + if (nprocs == 0) > >>>> + appendStringInfo(&buf, "%d", proc->pid); > >>>> + else > >>>> + appendStringInfo(&buf, ", %d", proc->pid); > >>>> + > >>>> + nprocs++; > >>>> > >>>> What happens if all the backends in wait_list have gone? In other words, > >>>> how should we handle the case where nprocs == 0 (i.e., nprocs has not been > >>>> incrmented at all)? This would very rarely happen, but can happen. > >>>> In this case, since buf.data is empty, at least there seems no need to log > >>>> the list of conflicting processes in detail message. > >>> Yes, I noticed this too; this can be simplified by changing the > >>> condition in the ereport() call to be "nprocs > 0" (rather than > >>> wait_list being null), otherwise not print the errdetail. (You could > >>> test buf.data or buf.len instead, but that seems uglier to me.) > >> +1 > >> > >> Maybe we can also improve the comment of this function from: > >> > >> + * This function also reports the details about the conflicting > >> + * process ids if *wait_list is not NULL. > >> > >> to " This function also reports the details about the conflicting > >> process ids if exist" or something. > >> > > Thank you all for the review/remarks. > > > > They have been addressed in the new attached patch version. > > Thanks for updating the patch! I read through the patch again > and applied the following chages to it. Attached is the updated > version of the patch. Could you review this version? If there is > no issue in it, I'm thinking to commit this version. Thank you for updating the patch! I have one question. > > + timeouts[cnt].id = STANDBY_TIMEOUT; > + timeouts[cnt].type = TMPARAM_AFTER; > + timeouts[cnt].delay_ms = DeadlockTimeout; > > Maybe STANDBY_TIMEOUT should be STANDBY_DEADLOCK_TIMEOUT here? > I changed the code that way. As the comment of ResolveRecoveryConflictWithLock() says the following, a deadlock is detected by the ordinary backend process: * Deadlocks involving the Startup process and an ordinary backend proces * will be detected by the deadlock detector within the ordinary backend. If we use STANDBY_DEADLOCK_TIMEOUT, SendRecoveryConflictWithBufferPin() will be called after DeadlockTimeout passed, but I think it's not necessary for the startup process in this case. If we want to just wake up the startup process maybe we can use STANDBY_TIMEOUT here? Regards, -- Masahiko Sawada EnterpriseDB: https://www.enterprisedb.com/
On 2020/12/04 9:28, Masahiko Sawada wrote: > On Fri, Dec 4, 2020 at 2:54 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: >> >> >> >> On 2020/12/01 17:29, Drouvot, Bertrand wrote: >>> Hi, >>> >>> On 12/1/20 12:35 AM, Masahiko Sawada wrote: >>>> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you canconfirm the sender and know the content is safe. >>>> >>>> >>>> >>>> On Tue, Dec 1, 2020 at 3:25 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: >>>>> On 2020-Dec-01, Fujii Masao wrote: >>>>> >>>>>> + if (proc) >>>>>> + { >>>>>> + if (nprocs == 0) >>>>>> + appendStringInfo(&buf, "%d", proc->pid); >>>>>> + else >>>>>> + appendStringInfo(&buf, ", %d", proc->pid); >>>>>> + >>>>>> + nprocs++; >>>>>> >>>>>> What happens if all the backends in wait_list have gone? In other words, >>>>>> how should we handle the case where nprocs == 0 (i.e., nprocs has not been >>>>>> incrmented at all)? This would very rarely happen, but can happen. >>>>>> In this case, since buf.data is empty, at least there seems no need to log >>>>>> the list of conflicting processes in detail message. >>>>> Yes, I noticed this too; this can be simplified by changing the >>>>> condition in the ereport() call to be "nprocs > 0" (rather than >>>>> wait_list being null), otherwise not print the errdetail. (You could >>>>> test buf.data or buf.len instead, but that seems uglier to me.) >>>> +1 >>>> >>>> Maybe we can also improve the comment of this function from: >>>> >>>> + * This function also reports the details about the conflicting >>>> + * process ids if *wait_list is not NULL. >>>> >>>> to " This function also reports the details about the conflicting >>>> process ids if exist" or something. >>>> >>> Thank you all for the review/remarks. >>> >>> They have been addressed in the new attached patch version. >> >> Thanks for updating the patch! I read through the patch again >> and applied the following chages to it. Attached is the updated >> version of the patch. Could you review this version? If there is >> no issue in it, I'm thinking to commit this version. > > Thank you for updating the patch! I have one question. > >> >> + timeouts[cnt].id = STANDBY_TIMEOUT; >> + timeouts[cnt].type = TMPARAM_AFTER; >> + timeouts[cnt].delay_ms = DeadlockTimeout; >> >> Maybe STANDBY_TIMEOUT should be STANDBY_DEADLOCK_TIMEOUT here? >> I changed the code that way. > > As the comment of ResolveRecoveryConflictWithLock() says the > following, a deadlock is detected by the ordinary backend process: > > * Deadlocks involving the Startup process and an ordinary backend proces > * will be detected by the deadlock detector within the ordinary backend. > > If we use STANDBY_DEADLOCK_TIMEOUT, > SendRecoveryConflictWithBufferPin() will be called after > DeadlockTimeout passed, but I think it's not necessary for the startup > process in this case. Thanks for pointing this! You are right. > If we want to just wake up the startup process > maybe we can use STANDBY_TIMEOUT here? When STANDBY_TIMEOUT happens, a request to release conflicting buffer pins is sent. Right? If so, we should not also useSTANDBY_TIMEOUT there? Or, first of all, we don't need to enable the deadlock timer at all? Since what we'd like to do is to wake up after deadlock_timeoutpasses, we can do that by changing ProcWaitForSignal() so that it can accept the timeout and giving the deadlock_timeoutto it. If we do this, maybe we can get rid of STANDBY_LOCK_TIMEOUT from ResolveRecoveryConflictWithLock().Thought? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Hi, On 12/4/20 2:21 AM, Fujii Masao wrote: > > On 2020/12/04 9:28, Masahiko Sawada wrote: >> On Fri, Dec 4, 2020 at 2:54 AM Fujii Masao >> <masao.fujii@oss.nttdata.com> wrote: >>> >>> >>> >>> On 2020/12/01 17:29, Drouvot, Bertrand wrote: >>>> Hi, >>>> >>>> On 12/1/20 12:35 AM, Masahiko Sawada wrote: >>>>> CAUTION: This email originated from outside of the organization. >>>>> Do not click links or open attachments unless you can confirm the >>>>> sender and know the content is safe. >>>>> >>>>> >>>>> >>>>> On Tue, Dec 1, 2020 at 3:25 AM Alvaro Herrera >>>>> <alvherre@alvh.no-ip.org> wrote: >>>>>> On 2020-Dec-01, Fujii Masao wrote: >>>>>> >>>>>>> + if (proc) >>>>>>> + { >>>>>>> + if (nprocs == 0) >>>>>>> + appendStringInfo(&buf, "%d", proc->pid); >>>>>>> + else >>>>>>> + appendStringInfo(&buf, ", %d", proc->pid); >>>>>>> + >>>>>>> + nprocs++; >>>>>>> >>>>>>> What happens if all the backends in wait_list have gone? In >>>>>>> other words, >>>>>>> how should we handle the case where nprocs == 0 (i.e., nprocs >>>>>>> has not been >>>>>>> incrmented at all)? This would very rarely happen, but can happen. >>>>>>> In this case, since buf.data is empty, at least there seems no >>>>>>> need to log >>>>>>> the list of conflicting processes in detail message. >>>>>> Yes, I noticed this too; this can be simplified by changing the >>>>>> condition in the ereport() call to be "nprocs > 0" (rather than >>>>>> wait_list being null), otherwise not print the errdetail. (You >>>>>> could >>>>>> test buf.data or buf.len instead, but that seems uglier to me.) >>>>> +1 >>>>> >>>>> Maybe we can also improve the comment of this function from: >>>>> >>>>> + * This function also reports the details about the conflicting >>>>> + * process ids if *wait_list is not NULL. >>>>> >>>>> to " This function also reports the details about the conflicting >>>>> process ids if exist" or something. >>>>> >>>> Thank you all for the review/remarks. >>>> >>>> They have been addressed in the new attached patch version. >>> >>> Thanks for updating the patch! I read through the patch again >>> and applied the following chages to it. Attached is the updated >>> version of the patch. Could you review this version? If there is >>> no issue in it, I'm thinking to commit this version. >> >> Thank you for updating the patch! I have one question. >> >>> >>> + timeouts[cnt].id = STANDBY_TIMEOUT; >>> + timeouts[cnt].type = TMPARAM_AFTER; >>> + timeouts[cnt].delay_ms = DeadlockTimeout; >>> >>> Maybe STANDBY_TIMEOUT should be STANDBY_DEADLOCK_TIMEOUT here? >>> I changed the code that way. >> >> As the comment of ResolveRecoveryConflictWithLock() says the >> following, a deadlock is detected by the ordinary backend process: >> >> * Deadlocks involving the Startup process and an ordinary backend >> proces >> * will be detected by the deadlock detector within the ordinary >> backend. >> >> If we use STANDBY_DEADLOCK_TIMEOUT, >> SendRecoveryConflictWithBufferPin() will be called after >> DeadlockTimeout passed, but I think it's not necessary for the startup >> process in this case. > > Thanks for pointing this! You are right. > > >> If we want to just wake up the startup process >> maybe we can use STANDBY_TIMEOUT here? > Thanks for the patch updates! Except what we are still discussing below, it looks good to me. > When STANDBY_TIMEOUT happens, a request to release conflicting buffer > pins is sent. Right? If so, we should not also use STANDBY_TIMEOUT there? Agree > > Or, first of all, we don't need to enable the deadlock timer at all? > Since what we'd like to do is to wake up after deadlock_timeout > passes, we can do that by changing ProcWaitForSignal() so that it can > accept the timeout and giving the deadlock_timeout to it. If we do > this, maybe we can get rid of STANDBY_LOCK_TIMEOUT from > ResolveRecoveryConflictWithLock(). Thought? Why not simply use (again) the STANDBY_LOCK_TIMEOUT one? (as it triggers a call to StandbyLockTimeoutHandler() which does nothing, except waking up. That's what we want, right?) I've attached a new version that makes use of it (that's the only change compare to Masao's updates). Thanks Bertrand
Attachment
On Fri, Dec 4, 2020 at 7:22 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > > Hi, > > On 12/4/20 2:21 AM, Fujii Masao wrote: > > > > On 2020/12/04 9:28, Masahiko Sawada wrote: > >> On Fri, Dec 4, 2020 at 2:54 AM Fujii Masao > >> <masao.fujii@oss.nttdata.com> wrote: > >>> > >>> > >>> > >>> On 2020/12/01 17:29, Drouvot, Bertrand wrote: > >>>> Hi, > >>>> > >>>> On 12/1/20 12:35 AM, Masahiko Sawada wrote: > >>>>> CAUTION: This email originated from outside of the organization. > >>>>> Do not click links or open attachments unless you can confirm the > >>>>> sender and know the content is safe. > >>>>> > >>>>> > >>>>> > >>>>> On Tue, Dec 1, 2020 at 3:25 AM Alvaro Herrera > >>>>> <alvherre@alvh.no-ip.org> wrote: > >>>>>> On 2020-Dec-01, Fujii Masao wrote: > >>>>>> > >>>>>>> + if (proc) > >>>>>>> + { > >>>>>>> + if (nprocs == 0) > >>>>>>> + appendStringInfo(&buf, "%d", proc->pid); > >>>>>>> + else > >>>>>>> + appendStringInfo(&buf, ", %d", proc->pid); > >>>>>>> + > >>>>>>> + nprocs++; > >>>>>>> > >>>>>>> What happens if all the backends in wait_list have gone? In > >>>>>>> other words, > >>>>>>> how should we handle the case where nprocs == 0 (i.e., nprocs > >>>>>>> has not been > >>>>>>> incrmented at all)? This would very rarely happen, but can happen. > >>>>>>> In this case, since buf.data is empty, at least there seems no > >>>>>>> need to log > >>>>>>> the list of conflicting processes in detail message. > >>>>>> Yes, I noticed this too; this can be simplified by changing the > >>>>>> condition in the ereport() call to be "nprocs > 0" (rather than > >>>>>> wait_list being null), otherwise not print the errdetail. (You > >>>>>> could > >>>>>> test buf.data or buf.len instead, but that seems uglier to me.) > >>>>> +1 > >>>>> > >>>>> Maybe we can also improve the comment of this function from: > >>>>> > >>>>> + * This function also reports the details about the conflicting > >>>>> + * process ids if *wait_list is not NULL. > >>>>> > >>>>> to " This function also reports the details about the conflicting > >>>>> process ids if exist" or something. > >>>>> > >>>> Thank you all for the review/remarks. > >>>> > >>>> They have been addressed in the new attached patch version. > >>> > >>> Thanks for updating the patch! I read through the patch again > >>> and applied the following chages to it. Attached is the updated > >>> version of the patch. Could you review this version? If there is > >>> no issue in it, I'm thinking to commit this version. > >> > >> Thank you for updating the patch! I have one question. > >> > >>> > >>> + timeouts[cnt].id = STANDBY_TIMEOUT; > >>> + timeouts[cnt].type = TMPARAM_AFTER; > >>> + timeouts[cnt].delay_ms = DeadlockTimeout; > >>> > >>> Maybe STANDBY_TIMEOUT should be STANDBY_DEADLOCK_TIMEOUT here? > >>> I changed the code that way. > >> > >> As the comment of ResolveRecoveryConflictWithLock() says the > >> following, a deadlock is detected by the ordinary backend process: > >> > >> * Deadlocks involving the Startup process and an ordinary backend > >> proces > >> * will be detected by the deadlock detector within the ordinary > >> backend. > >> > >> If we use STANDBY_DEADLOCK_TIMEOUT, > >> SendRecoveryConflictWithBufferPin() will be called after > >> DeadlockTimeout passed, but I think it's not necessary for the startup > >> process in this case. > > > > Thanks for pointing this! You are right. > > > > > >> If we want to just wake up the startup process > >> maybe we can use STANDBY_TIMEOUT here? > > > Thanks for the patch updates! Except what we are still discussing below, > it looks good to me. > > > When STANDBY_TIMEOUT happens, a request to release conflicting buffer > > pins is sent. Right? If so, we should not also use STANDBY_TIMEOUT there? > > Agree > > > > > Or, first of all, we don't need to enable the deadlock timer at all? > > Since what we'd like to do is to wake up after deadlock_timeout > > passes, we can do that by changing ProcWaitForSignal() so that it can > > accept the timeout and giving the deadlock_timeout to it. If we do > > this, maybe we can get rid of STANDBY_LOCK_TIMEOUT from > > ResolveRecoveryConflictWithLock(). Thought? Where do we enable deadlock timeout in hot standby case? You meant to enable it in ProcWaitForSignal() or where we set a timer for not hot standby case, in ProcSleep()? > > Why not simply use (again) the STANDBY_LOCK_TIMEOUT one? (as it triggers > a call to StandbyLockTimeoutHandler() which does nothing, except waking > up. That's what we want, right?) Right, what I wanted to mean is STANDBY_LOCK_TIMEOUT. The startup process can wake up and do nothing. Thank you for pointing out. Regards, -- Masahiko Sawada EnterpriseDB: https://www.enterprisedb.com/
On 2020/12/05 12:38, Masahiko Sawada wrote: > On Fri, Dec 4, 2020 at 7:22 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote: >> >> Hi, >> >> On 12/4/20 2:21 AM, Fujii Masao wrote: >>> >>> On 2020/12/04 9:28, Masahiko Sawada wrote: >>>> On Fri, Dec 4, 2020 at 2:54 AM Fujii Masao >>>> <masao.fujii@oss.nttdata.com> wrote: >>>>> >>>>> >>>>> >>>>> On 2020/12/01 17:29, Drouvot, Bertrand wrote: >>>>>> Hi, >>>>>> >>>>>> On 12/1/20 12:35 AM, Masahiko Sawada wrote: >>>>>>> CAUTION: This email originated from outside of the organization. >>>>>>> Do not click links or open attachments unless you can confirm the >>>>>>> sender and know the content is safe. >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Tue, Dec 1, 2020 at 3:25 AM Alvaro Herrera >>>>>>> <alvherre@alvh.no-ip.org> wrote: >>>>>>>> On 2020-Dec-01, Fujii Masao wrote: >>>>>>>> >>>>>>>>> + if (proc) >>>>>>>>> + { >>>>>>>>> + if (nprocs == 0) >>>>>>>>> + appendStringInfo(&buf, "%d", proc->pid); >>>>>>>>> + else >>>>>>>>> + appendStringInfo(&buf, ", %d", proc->pid); >>>>>>>>> + >>>>>>>>> + nprocs++; >>>>>>>>> >>>>>>>>> What happens if all the backends in wait_list have gone? In >>>>>>>>> other words, >>>>>>>>> how should we handle the case where nprocs == 0 (i.e., nprocs >>>>>>>>> has not been >>>>>>>>> incrmented at all)? This would very rarely happen, but can happen. >>>>>>>>> In this case, since buf.data is empty, at least there seems no >>>>>>>>> need to log >>>>>>>>> the list of conflicting processes in detail message. >>>>>>>> Yes, I noticed this too; this can be simplified by changing the >>>>>>>> condition in the ereport() call to be "nprocs > 0" (rather than >>>>>>>> wait_list being null), otherwise not print the errdetail. (You >>>>>>>> could >>>>>>>> test buf.data or buf.len instead, but that seems uglier to me.) >>>>>>> +1 >>>>>>> >>>>>>> Maybe we can also improve the comment of this function from: >>>>>>> >>>>>>> + * This function also reports the details about the conflicting >>>>>>> + * process ids if *wait_list is not NULL. >>>>>>> >>>>>>> to " This function also reports the details about the conflicting >>>>>>> process ids if exist" or something. >>>>>>> >>>>>> Thank you all for the review/remarks. >>>>>> >>>>>> They have been addressed in the new attached patch version. >>>>> >>>>> Thanks for updating the patch! I read through the patch again >>>>> and applied the following chages to it. Attached is the updated >>>>> version of the patch. Could you review this version? If there is >>>>> no issue in it, I'm thinking to commit this version. >>>> >>>> Thank you for updating the patch! I have one question. >>>> >>>>> >>>>> + timeouts[cnt].id = STANDBY_TIMEOUT; >>>>> + timeouts[cnt].type = TMPARAM_AFTER; >>>>> + timeouts[cnt].delay_ms = DeadlockTimeout; >>>>> >>>>> Maybe STANDBY_TIMEOUT should be STANDBY_DEADLOCK_TIMEOUT here? >>>>> I changed the code that way. >>>> >>>> As the comment of ResolveRecoveryConflictWithLock() says the >>>> following, a deadlock is detected by the ordinary backend process: >>>> >>>> * Deadlocks involving the Startup process and an ordinary backend >>>> proces >>>> * will be detected by the deadlock detector within the ordinary >>>> backend. >>>> >>>> If we use STANDBY_DEADLOCK_TIMEOUT, >>>> SendRecoveryConflictWithBufferPin() will be called after >>>> DeadlockTimeout passed, but I think it's not necessary for the startup >>>> process in this case. >>> >>> Thanks for pointing this! You are right. >>> >>> >>>> If we want to just wake up the startup process >>>> maybe we can use STANDBY_TIMEOUT here? >>> >> Thanks for the patch updates! Except what we are still discussing below, >> it looks good to me. >> >>> When STANDBY_TIMEOUT happens, a request to release conflicting buffer >>> pins is sent. Right? If so, we should not also use STANDBY_TIMEOUT there? >> >> Agree >> >>> >>> Or, first of all, we don't need to enable the deadlock timer at all? >>> Since what we'd like to do is to wake up after deadlock_timeout >>> passes, we can do that by changing ProcWaitForSignal() so that it can >>> accept the timeout and giving the deadlock_timeout to it. If we do >>> this, maybe we can get rid of STANDBY_LOCK_TIMEOUT from >>> ResolveRecoveryConflictWithLock(). Thought? > > Where do we enable deadlock timeout in hot standby case? You meant to > enable it in ProcWaitForSignal() or where we set a timer for not hot > standby case, in ProcSleep()? No, what I tried to say is to change ResolveRecoveryConflictWithLock() so that it does 1. calculate the "minimum" timeout from deadlock_timeout and max_standby_xxx_delay 2. give the calculated timeout value to ProcWaitForSignal() 3. wait for signal and timeout on ProcWaitForSignal() Since ProcWaitForSignal() calls WaitLatch(), seems it's not so difficult to make ProcWaitForSignal() handle the timeout.If we do this, I was thinking that we can get rid of enable_timeouts() from ResolveRecoveryConflictWithLock(). > >> >> Why not simply use (again) the STANDBY_LOCK_TIMEOUT one? (as it triggers >> a call to StandbyLockTimeoutHandler() which does nothing, except waking >> up. That's what we want, right?) > > Right, what I wanted to mean is STANDBY_LOCK_TIMEOUT. The startup > process can wake up and do nothing. Thank you for pointing out. Okay, understood! Firstly I was thinking that enabling the same type (i.e., STANDBY_LOCK_TIMEOUT) of lock twice doesn't workproperly, but as far as I read the code, it works. In that case, only the shorter timeout would be activated in enable_timeouts().So I agree to use STANDBY_LOCK_TIMEOUT. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On 2020/12/14 21:31, Fujii Masao wrote: > > > On 2020/12/05 12:38, Masahiko Sawada wrote: >> On Fri, Dec 4, 2020 at 7:22 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote: >>> >>> Hi, >>> >>> On 12/4/20 2:21 AM, Fujii Masao wrote: >>>> >>>> On 2020/12/04 9:28, Masahiko Sawada wrote: >>>>> On Fri, Dec 4, 2020 at 2:54 AM Fujii Masao >>>>> <masao.fujii@oss.nttdata.com> wrote: >>>>>> >>>>>> >>>>>> >>>>>> On 2020/12/01 17:29, Drouvot, Bertrand wrote: >>>>>>> Hi, >>>>>>> >>>>>>> On 12/1/20 12:35 AM, Masahiko Sawada wrote: >>>>>>>> CAUTION: This email originated from outside of the organization. >>>>>>>> Do not click links or open attachments unless you can confirm the >>>>>>>> sender and know the content is safe. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Tue, Dec 1, 2020 at 3:25 AM Alvaro Herrera >>>>>>>> <alvherre@alvh.no-ip.org> wrote: >>>>>>>>> On 2020-Dec-01, Fujii Masao wrote: >>>>>>>>> >>>>>>>>>> + if (proc) >>>>>>>>>> + { >>>>>>>>>> + if (nprocs == 0) >>>>>>>>>> + appendStringInfo(&buf, "%d", proc->pid); >>>>>>>>>> + else >>>>>>>>>> + appendStringInfo(&buf, ", %d", proc->pid); >>>>>>>>>> + >>>>>>>>>> + nprocs++; >>>>>>>>>> >>>>>>>>>> What happens if all the backends in wait_list have gone? In >>>>>>>>>> other words, >>>>>>>>>> how should we handle the case where nprocs == 0 (i.e., nprocs >>>>>>>>>> has not been >>>>>>>>>> incrmented at all)? This would very rarely happen, but can happen. >>>>>>>>>> In this case, since buf.data is empty, at least there seems no >>>>>>>>>> need to log >>>>>>>>>> the list of conflicting processes in detail message. >>>>>>>>> Yes, I noticed this too; this can be simplified by changing the >>>>>>>>> condition in the ereport() call to be "nprocs > 0" (rather than >>>>>>>>> wait_list being null), otherwise not print the errdetail. (You >>>>>>>>> could >>>>>>>>> test buf.data or buf.len instead, but that seems uglier to me.) >>>>>>>> +1 >>>>>>>> >>>>>>>> Maybe we can also improve the comment of this function from: >>>>>>>> >>>>>>>> + * This function also reports the details about the conflicting >>>>>>>> + * process ids if *wait_list is not NULL. >>>>>>>> >>>>>>>> to " This function also reports the details about the conflicting >>>>>>>> process ids if exist" or something. >>>>>>>> >>>>>>> Thank you all for the review/remarks. >>>>>>> >>>>>>> They have been addressed in the new attached patch version. >>>>>> >>>>>> Thanks for updating the patch! I read through the patch again >>>>>> and applied the following chages to it. Attached is the updated >>>>>> version of the patch. Could you review this version? If there is >>>>>> no issue in it, I'm thinking to commit this version. >>>>> >>>>> Thank you for updating the patch! I have one question. >>>>> >>>>>> >>>>>> + timeouts[cnt].id = STANDBY_TIMEOUT; >>>>>> + timeouts[cnt].type = TMPARAM_AFTER; >>>>>> + timeouts[cnt].delay_ms = DeadlockTimeout; >>>>>> >>>>>> Maybe STANDBY_TIMEOUT should be STANDBY_DEADLOCK_TIMEOUT here? >>>>>> I changed the code that way. >>>>> >>>>> As the comment of ResolveRecoveryConflictWithLock() says the >>>>> following, a deadlock is detected by the ordinary backend process: >>>>> >>>>> * Deadlocks involving the Startup process and an ordinary backend >>>>> proces >>>>> * will be detected by the deadlock detector within the ordinary >>>>> backend. >>>>> >>>>> If we use STANDBY_DEADLOCK_TIMEOUT, >>>>> SendRecoveryConflictWithBufferPin() will be called after >>>>> DeadlockTimeout passed, but I think it's not necessary for the startup >>>>> process in this case. >>>> >>>> Thanks for pointing this! You are right. >>>> >>>> >>>>> If we want to just wake up the startup process >>>>> maybe we can use STANDBY_TIMEOUT here? >>>> >>> Thanks for the patch updates! Except what we are still discussing below, >>> it looks good to me. >>> >>>> When STANDBY_TIMEOUT happens, a request to release conflicting buffer >>>> pins is sent. Right? If so, we should not also use STANDBY_TIMEOUT there? >>> >>> Agree >>> >>>> >>>> Or, first of all, we don't need to enable the deadlock timer at all? >>>> Since what we'd like to do is to wake up after deadlock_timeout >>>> passes, we can do that by changing ProcWaitForSignal() so that it can >>>> accept the timeout and giving the deadlock_timeout to it. If we do >>>> this, maybe we can get rid of STANDBY_LOCK_TIMEOUT from >>>> ResolveRecoveryConflictWithLock(). Thought? >> >> Where do we enable deadlock timeout in hot standby case? You meant to >> enable it in ProcWaitForSignal() or where we set a timer for not hot >> standby case, in ProcSleep()? > > No, what I tried to say is to change ResolveRecoveryConflictWithLock() so that it does > > 1. calculate the "minimum" timeout from deadlock_timeout and max_standby_xxx_delay > 2. give the calculated timeout value to ProcWaitForSignal() > 3. wait for signal and timeout on ProcWaitForSignal() > > Since ProcWaitForSignal() calls WaitLatch(), seems it's not so difficult to make ProcWaitForSignal() handle the timeout.If we do this, I was thinking that we can get rid of enable_timeouts() from ResolveRecoveryConflictWithLock(). > > >> >>> >>> Why not simply use (again) the STANDBY_LOCK_TIMEOUT one? (as it triggers >>> a call to StandbyLockTimeoutHandler() which does nothing, except waking >>> up. That's what we want, right?) >> >> Right, what I wanted to mean is STANDBY_LOCK_TIMEOUT. The startup >> process can wake up and do nothing. Thank you for pointing out. > > Okay, understood! Firstly I was thinking that enabling the same type (i.e., STANDBY_LOCK_TIMEOUT) of lock twice doesn'twork properly, but as far as I read the code, it works. In that case, only the shorter timeout would be activatedin enable_timeouts(). So I agree to use STANDBY_LOCK_TIMEOUT. So I renamed the argument "deadlock_timer" in ResolveRecoveryConflictWithLock() because it's not the timer for deadlock and is confusing. Attached is the updated version of the patch. Barring any objection, I will commit this version. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Attachment
Hi, On 12/14/20 4:20 PM, Fujii Masao wrote: > CAUTION: This email originated from outside of the organization. Do > not click links or open attachments unless you can confirm the sender > and know the content is safe. > > > > On 2020/12/14 21:31, Fujii Masao wrote: >> >> >> On 2020/12/05 12:38, Masahiko Sawada wrote: >>> On Fri, Dec 4, 2020 at 7:22 PM Drouvot, Bertrand >>> <bdrouvot@amazon.com> wrote: >>>> >>>> Hi, >>>> >>>> On 12/4/20 2:21 AM, Fujii Masao wrote: >>>>> >>>>> On 2020/12/04 9:28, Masahiko Sawada wrote: >>>>>> On Fri, Dec 4, 2020 at 2:54 AM Fujii Masao >>>>>> <masao.fujii@oss.nttdata.com> wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 2020/12/01 17:29, Drouvot, Bertrand wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> On 12/1/20 12:35 AM, Masahiko Sawada wrote: >>>>>>>>> CAUTION: This email originated from outside of the organization. >>>>>>>>> Do not click links or open attachments unless you can confirm the >>>>>>>>> sender and know the content is safe. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Tue, Dec 1, 2020 at 3:25 AM Alvaro Herrera >>>>>>>>> <alvherre@alvh.no-ip.org> wrote: >>>>>>>>>> On 2020-Dec-01, Fujii Masao wrote: >>>>>>>>>> >>>>>>>>>>> + if (proc) >>>>>>>>>>> + { >>>>>>>>>>> + if (nprocs == 0) >>>>>>>>>>> + appendStringInfo(&buf, "%d", proc->pid); >>>>>>>>>>> + else >>>>>>>>>>> + appendStringInfo(&buf, ", %d", proc->pid); >>>>>>>>>>> + >>>>>>>>>>> + nprocs++; >>>>>>>>>>> >>>>>>>>>>> What happens if all the backends in wait_list have gone? In >>>>>>>>>>> other words, >>>>>>>>>>> how should we handle the case where nprocs == 0 (i.e., nprocs >>>>>>>>>>> has not been >>>>>>>>>>> incrmented at all)? This would very rarely happen, but can >>>>>>>>>>> happen. >>>>>>>>>>> In this case, since buf.data is empty, at least there seems no >>>>>>>>>>> need to log >>>>>>>>>>> the list of conflicting processes in detail message. >>>>>>>>>> Yes, I noticed this too; this can be simplified by changing the >>>>>>>>>> condition in the ereport() call to be "nprocs > 0" (rather than >>>>>>>>>> wait_list being null), otherwise not print the errdetail. (You >>>>>>>>>> could >>>>>>>>>> test buf.data or buf.len instead, but that seems uglier to me.) >>>>>>>>> +1 >>>>>>>>> >>>>>>>>> Maybe we can also improve the comment of this function from: >>>>>>>>> >>>>>>>>> + * This function also reports the details about the conflicting >>>>>>>>> + * process ids if *wait_list is not NULL. >>>>>>>>> >>>>>>>>> to " This function also reports the details about the conflicting >>>>>>>>> process ids if exist" or something. >>>>>>>>> >>>>>>>> Thank you all for the review/remarks. >>>>>>>> >>>>>>>> They have been addressed in the new attached patch version. >>>>>>> >>>>>>> Thanks for updating the patch! I read through the patch again >>>>>>> and applied the following chages to it. Attached is the updated >>>>>>> version of the patch. Could you review this version? If there is >>>>>>> no issue in it, I'm thinking to commit this version. >>>>>> >>>>>> Thank you for updating the patch! I have one question. >>>>>> >>>>>>> >>>>>>> + timeouts[cnt].id = STANDBY_TIMEOUT; >>>>>>> + timeouts[cnt].type = TMPARAM_AFTER; >>>>>>> + timeouts[cnt].delay_ms = DeadlockTimeout; >>>>>>> >>>>>>> Maybe STANDBY_TIMEOUT should be STANDBY_DEADLOCK_TIMEOUT here? >>>>>>> I changed the code that way. >>>>>> >>>>>> As the comment of ResolveRecoveryConflictWithLock() says the >>>>>> following, a deadlock is detected by the ordinary backend process: >>>>>> >>>>>> * Deadlocks involving the Startup process and an ordinary backend >>>>>> proces >>>>>> * will be detected by the deadlock detector within the ordinary >>>>>> backend. >>>>>> >>>>>> If we use STANDBY_DEADLOCK_TIMEOUT, >>>>>> SendRecoveryConflictWithBufferPin() will be called after >>>>>> DeadlockTimeout passed, but I think it's not necessary for the >>>>>> startup >>>>>> process in this case. >>>>> >>>>> Thanks for pointing this! You are right. >>>>> >>>>> >>>>>> If we want to just wake up the startup process >>>>>> maybe we can use STANDBY_TIMEOUT here? >>>>> >>>> Thanks for the patch updates! Except what we are still discussing >>>> below, >>>> it looks good to me. >>>> >>>>> When STANDBY_TIMEOUT happens, a request to release conflicting buffer >>>>> pins is sent. Right? If so, we should not also use STANDBY_TIMEOUT >>>>> there? >>>> >>>> Agree >>>> >>>>> >>>>> Or, first of all, we don't need to enable the deadlock timer at all? >>>>> Since what we'd like to do is to wake up after deadlock_timeout >>>>> passes, we can do that by changing ProcWaitForSignal() so that it can >>>>> accept the timeout and giving the deadlock_timeout to it. If we do >>>>> this, maybe we can get rid of STANDBY_LOCK_TIMEOUT from >>>>> ResolveRecoveryConflictWithLock(). Thought? >>> >>> Where do we enable deadlock timeout in hot standby case? You meant to >>> enable it in ProcWaitForSignal() or where we set a timer for not hot >>> standby case, in ProcSleep()? >> >> No, what I tried to say is to change >> ResolveRecoveryConflictWithLock() so that it does >> >> 1. calculate the "minimum" timeout from deadlock_timeout and >> max_standby_xxx_delay >> 2. give the calculated timeout value to ProcWaitForSignal() >> 3. wait for signal and timeout on ProcWaitForSignal() >> >> Since ProcWaitForSignal() calls WaitLatch(), seems it's not so >> difficult to make ProcWaitForSignal() handle the timeout. If we do >> this, I was thinking that we can get rid of enable_timeouts() from >> ResolveRecoveryConflictWithLock(). >> >> >>> >>>> >>>> Why not simply use (again) the STANDBY_LOCK_TIMEOUT one? (as it >>>> triggers >>>> a call to StandbyLockTimeoutHandler() which does nothing, except >>>> waking >>>> up. That's what we want, right?) >>> >>> Right, what I wanted to mean is STANDBY_LOCK_TIMEOUT. The startup >>> process can wake up and do nothing. Thank you for pointing out. >> >> Okay, understood! Firstly I was thinking that enabling the same type >> (i.e., STANDBY_LOCK_TIMEOUT) of lock twice doesn't work properly, but >> as far as I read the code, it works. In that case, only the shorter >> timeout would be activated in enable_timeouts(). So I agree to use >> STANDBY_LOCK_TIMEOUT. > > So I renamed the argument "deadlock_timer" in > ResolveRecoveryConflictWithLock() > because it's not the timer for deadlock and is confusing. Attached is the > updated version of the patch. Barring any objection, I will commit > this version. Thanks for the update! Indeed the naming is more appropriate and less confusing that way, this version looks all good to me. Thanks! Bertrand
On 2020/12/15 0:49, Drouvot, Bertrand wrote: > Hi, > > On 12/14/20 4:20 PM, Fujii Masao wrote: >> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you canconfirm the sender and know the content is safe. >> >> >> >> On 2020/12/14 21:31, Fujii Masao wrote: >>> >>> >>> On 2020/12/05 12:38, Masahiko Sawada wrote: >>>> On Fri, Dec 4, 2020 at 7:22 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote: >>>>> >>>>> Hi, >>>>> >>>>> On 12/4/20 2:21 AM, Fujii Masao wrote: >>>>>> >>>>>> On 2020/12/04 9:28, Masahiko Sawada wrote: >>>>>>> On Fri, Dec 4, 2020 at 2:54 AM Fujii Masao >>>>>>> <masao.fujii@oss.nttdata.com> wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 2020/12/01 17:29, Drouvot, Bertrand wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> On 12/1/20 12:35 AM, Masahiko Sawada wrote: >>>>>>>>>> CAUTION: This email originated from outside of the organization. >>>>>>>>>> Do not click links or open attachments unless you can confirm the >>>>>>>>>> sender and know the content is safe. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Tue, Dec 1, 2020 at 3:25 AM Alvaro Herrera >>>>>>>>>> <alvherre@alvh.no-ip.org> wrote: >>>>>>>>>>> On 2020-Dec-01, Fujii Masao wrote: >>>>>>>>>>> >>>>>>>>>>>> + if (proc) >>>>>>>>>>>> + { >>>>>>>>>>>> + if (nprocs == 0) >>>>>>>>>>>> + appendStringInfo(&buf, "%d", proc->pid); >>>>>>>>>>>> + else >>>>>>>>>>>> + appendStringInfo(&buf, ", %d", proc->pid); >>>>>>>>>>>> + >>>>>>>>>>>> + nprocs++; >>>>>>>>>>>> >>>>>>>>>>>> What happens if all the backends in wait_list have gone? In >>>>>>>>>>>> other words, >>>>>>>>>>>> how should we handle the case where nprocs == 0 (i.e., nprocs >>>>>>>>>>>> has not been >>>>>>>>>>>> incrmented at all)? This would very rarely happen, but can happen. >>>>>>>>>>>> In this case, since buf.data is empty, at least there seems no >>>>>>>>>>>> need to log >>>>>>>>>>>> the list of conflicting processes in detail message. >>>>>>>>>>> Yes, I noticed this too; this can be simplified by changing the >>>>>>>>>>> condition in the ereport() call to be "nprocs > 0" (rather than >>>>>>>>>>> wait_list being null), otherwise not print the errdetail. (You >>>>>>>>>>> could >>>>>>>>>>> test buf.data or buf.len instead, but that seems uglier to me.) >>>>>>>>>> +1 >>>>>>>>>> >>>>>>>>>> Maybe we can also improve the comment of this function from: >>>>>>>>>> >>>>>>>>>> + * This function also reports the details about the conflicting >>>>>>>>>> + * process ids if *wait_list is not NULL. >>>>>>>>>> >>>>>>>>>> to " This function also reports the details about the conflicting >>>>>>>>>> process ids if exist" or something. >>>>>>>>>> >>>>>>>>> Thank you all for the review/remarks. >>>>>>>>> >>>>>>>>> They have been addressed in the new attached patch version. >>>>>>>> >>>>>>>> Thanks for updating the patch! I read through the patch again >>>>>>>> and applied the following chages to it. Attached is the updated >>>>>>>> version of the patch. Could you review this version? If there is >>>>>>>> no issue in it, I'm thinking to commit this version. >>>>>>> >>>>>>> Thank you for updating the patch! I have one question. >>>>>>> >>>>>>>> >>>>>>>> + timeouts[cnt].id = STANDBY_TIMEOUT; >>>>>>>> + timeouts[cnt].type = TMPARAM_AFTER; >>>>>>>> + timeouts[cnt].delay_ms = DeadlockTimeout; >>>>>>>> >>>>>>>> Maybe STANDBY_TIMEOUT should be STANDBY_DEADLOCK_TIMEOUT here? >>>>>>>> I changed the code that way. >>>>>>> >>>>>>> As the comment of ResolveRecoveryConflictWithLock() says the >>>>>>> following, a deadlock is detected by the ordinary backend process: >>>>>>> >>>>>>> * Deadlocks involving the Startup process and an ordinary backend >>>>>>> proces >>>>>>> * will be detected by the deadlock detector within the ordinary >>>>>>> backend. >>>>>>> >>>>>>> If we use STANDBY_DEADLOCK_TIMEOUT, >>>>>>> SendRecoveryConflictWithBufferPin() will be called after >>>>>>> DeadlockTimeout passed, but I think it's not necessary for the startup >>>>>>> process in this case. >>>>>> >>>>>> Thanks for pointing this! You are right. >>>>>> >>>>>> >>>>>>> If we want to just wake up the startup process >>>>>>> maybe we can use STANDBY_TIMEOUT here? >>>>>> >>>>> Thanks for the patch updates! Except what we are still discussing below, >>>>> it looks good to me. >>>>> >>>>>> When STANDBY_TIMEOUT happens, a request to release conflicting buffer >>>>>> pins is sent. Right? If so, we should not also use STANDBY_TIMEOUT there? >>>>> >>>>> Agree >>>>> >>>>>> >>>>>> Or, first of all, we don't need to enable the deadlock timer at all? >>>>>> Since what we'd like to do is to wake up after deadlock_timeout >>>>>> passes, we can do that by changing ProcWaitForSignal() so that it can >>>>>> accept the timeout and giving the deadlock_timeout to it. If we do >>>>>> this, maybe we can get rid of STANDBY_LOCK_TIMEOUT from >>>>>> ResolveRecoveryConflictWithLock(). Thought? >>>> >>>> Where do we enable deadlock timeout in hot standby case? You meant to >>>> enable it in ProcWaitForSignal() or where we set a timer for not hot >>>> standby case, in ProcSleep()? >>> >>> No, what I tried to say is to change ResolveRecoveryConflictWithLock() so that it does >>> >>> 1. calculate the "minimum" timeout from deadlock_timeout and max_standby_xxx_delay >>> 2. give the calculated timeout value to ProcWaitForSignal() >>> 3. wait for signal and timeout on ProcWaitForSignal() >>> >>> Since ProcWaitForSignal() calls WaitLatch(), seems it's not so difficult to make ProcWaitForSignal() handle the timeout.If we do this, I was thinking that we can get rid of enable_timeouts() from ResolveRecoveryConflictWithLock(). >>> >>> >>>> >>>>> >>>>> Why not simply use (again) the STANDBY_LOCK_TIMEOUT one? (as it triggers >>>>> a call to StandbyLockTimeoutHandler() which does nothing, except waking >>>>> up. That's what we want, right?) >>>> >>>> Right, what I wanted to mean is STANDBY_LOCK_TIMEOUT. The startup >>>> process can wake up and do nothing. Thank you for pointing out. >>> >>> Okay, understood! Firstly I was thinking that enabling the same type (i.e., STANDBY_LOCK_TIMEOUT) of lock twice doesn'twork properly, but as far as I read the code, it works. In that case, only the shorter timeout would be activatedin enable_timeouts(). So I agree to use STANDBY_LOCK_TIMEOUT. >> >> So I renamed the argument "deadlock_timer" in ResolveRecoveryConflictWithLock() >> because it's not the timer for deadlock and is confusing. Attached is the >> updated version of the patch. Barring any objection, I will commit this version. > > Thanks for the update! > > Indeed the naming is more appropriate and less confusing that way, this version looks all good to me. Thanks for the review! I'm thinking to wait half a day before commiting the patch just in the case someone may object the patch. BTW, attached is the POC patch that implements the idea discussed upthread; if log_recovery_conflict_waits is enabled, the startup process reports the log also after the recovery conflict was resolved and the startup process finished waiting for it. This patch needs to be applied after v11-0002-Log-the-standby-recovery-conflict-waits.patch is applied. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Attachment
At Tue, 15 Dec 2020 02:00:21 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in > Thanks for the review! I'm thinking to wait half a day before > commiting > the patch just in the case someone may object the patch. Sorry for coming late. I have looked only the latest thread so I should be missing many things so please ignore if it was settled in the past discussion. It emits messages like the follows; [40509:startup] LOG: recovery still waiting after 1021.431 ms: recovery conflict on lock [40509:startup] DETAIL: Conflicting processes: 41171, 41194. [40509:startup] CONTEXT: WAL redo at 0/3013158 for Standby/LOCK: xid 510 db 13609 rel 16384 IFAIK DETAIL usually shows ordinary sentences so the first word is capitalized and ends with a period. But it is not a sentence so following period looks odd. a searcheing the tree for errdetails showed some anomalies. src/backend/parser/parse_param.c errdetail("%s versus %s", src/backend/jit/llvm/llvmjit_error.cpp errdetail("while in LLVM"))); src/backend/replication/logical/tablesync.c errdetail("The error was: %s", res->err))); src/backend/tcop/postgres.c errdetail("prepare: %s", pstmt->plansource->query_string); src/backend/tcop/postgres.c errdetail("abort reason: recovery conflict"); and one similar occurance: src/backend/utils/adt/dbsize.c errdetail("Invalid size unit: \"%s\".", strptr), I'm not sure, but it seems to me at least the period is unnecessary here. + bool maybe_log_conflict = + (standbyWaitStart != 0 && !logged_recovery_conflict); odd indentation. + /* Also, set the timer if necessary */ + if (logging_timer) + { + timeouts[cnt].id = STANDBY_LOCK_TIMEOUT; + timeouts[cnt].type = TMPARAM_AFTER; + timeouts[cnt].delay_ms = DeadlockTimeout; + cnt++; + } This doesn't consider spurious wakeup. I'm not sure it actually happenes but we usually consider that. That is since before this patch, but ProcWaitForSignal()'s comment says that: > * As this uses the generic process latch the caller has to be robust against > * unrelated wakeups: Always check that the desired state has occurred, and > * wait again if not. If we don't care of spurious wakeups, we should add such a comment. + bool maybe_log_conflict; + bool maybe_update_title; Although it should be a matter of taste and I understand that the "maybe" means that "that logging and changing of ps display may not happen in this iteration" , that variables seem expressing respectively "we should write log if the timeout for recovery conflict expires" and "we should update title if 500ms elapsed". So they seem *to me* better be just "log_conflict" and "update_title". I feel the same with "maybe_log_conflict" in ProcSleep(). + for recovery conflicts. This is useful in determining if recovery + conflicts prevent the recovery from applying WAL. (I'm not confident on this) Isn't the sentence better be in past or present continuous tense? > BTW, attached is the POC patch that implements the idea discussed > upthread; > if log_recovery_conflict_waits is enabled, the startup process reports > the log also after the recovery conflict was resolved and the startup > process > finished waiting for it. This patch needs to be applied after > v11-0002-Log-the-standby-recovery-conflict-waits.patch is applied. Ah. I was just writing a comment about that. I haven't looked it closer but it looks good to me. By the way doesn't it contains a simple fix of a comment for the base patch? regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On 2020/12/15 12:04, Kyotaro Horiguchi wrote: > At Tue, 15 Dec 2020 02:00:21 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in >> Thanks for the review! I'm thinking to wait half a day before >> commiting >> the patch just in the case someone may object the patch. > > Sorry for coming late. I have looked only the latest thread so I > should be missing many things so please ignore if it was settled in > the past discussion. > > > It emits messages like the follows; > > [40509:startup] LOG: recovery still waiting after 1021.431 ms: recovery conflict on lock > [40509:startup] DETAIL: Conflicting processes: 41171, 41194. > [40509:startup] CONTEXT: WAL redo at 0/3013158 for Standby/LOCK: xid 510 db 13609 rel 16384 > > IFAIK DETAIL usually shows ordinary sentences so the first word is > capitalized and ends with a period. But it is not a sentence so > following period looks odd. a searcheing the tree for errdetails > showed some anomalies. > > src/backend/parser/parse_param.c errdetail("%s versus %s", > src/backend/jit/llvm/llvmjit_error.cpp errdetail("while in LLVM"))); > src/backend/replication/logical/tablesync.c errdetail("The error was: %s", res->err))); > src/backend/tcop/postgres.c errdetail("prepare: %s", pstmt->plansource->query_string); > src/backend/tcop/postgres.c errdetail("abort reason: recovery conflict"); > > and one similar occurance: > > src/backend/utils/adt/dbsize.c errdetail("Invalid size unit: \"%s\".", strptr), > > I'm not sure, but it seems to me at least the period is unnecessary > here. Since Error Message Style Guide in the docs says "Detail and hint messages: Use complete sentences, and end each with a period.", I think that a period is necessary here. No? > > > + bool maybe_log_conflict = > + (standbyWaitStart != 0 && !logged_recovery_conflict); > > odd indentation. This is the result of pgindent run. I'm not sure why pgindent indents that way, but for now I'd like to follow pgindent. > > > + /* Also, set the timer if necessary */ > + if (logging_timer) > + { > + timeouts[cnt].id = STANDBY_LOCK_TIMEOUT; > + timeouts[cnt].type = TMPARAM_AFTER; > + timeouts[cnt].delay_ms = DeadlockTimeout; > + cnt++; > + } > > This doesn't consider spurious wakeup. I'm not sure it actually > happenes but we usually consider that. That is since before this > patch, but ProcWaitForSignal()'s comment says that: > >> * As this uses the generic process latch the caller has to be robust against >> * unrelated wakeups: Always check that the desired state has occurred, and >> * wait again if not. > > If we don't care of spurious wakeups, we should add such a comment. If ProcWaitForSignal() wakes up because of the reason (e.g., SIGHUP) other than deadlock_timeout, ProcSleep() will call ResolveRecoveryConflictWithLock() and we sleep on ProcWaitForSignal() again since the recovery conflict has not been resolved yet. So we can say that we consider "spurious wakeups"? However when I read the related code again, I found another issue in the patch. In ResolveRecoveryConflictWithLock(), if SIGHUP causes us to wake up out of ProcWaitForSignal() before deadlock_timeout is reached, we will use deadlock_timeout again when sleeping on ProcWaitForSignal(). Instead, probably we should use the "deadlock_timeout - elapsed time" so that we can emit a log message as soon as deadlock_timeout passes since starting waiting on recovery conflict. Otherwise it may take at most "deadlock_timeout * 2" to log "still waiting ..." message. To fix this issue, "deadlock_timeout - elapsed time" needs to be used as the timeout when enabling the timer at least in ResolveRecoveryConflictWithLock() and ResolveRecoveryConflictWithBufferPin(). Also similar change needs to be applied to ResolveRecoveryConflictWithVirtualXIDs(). BTW, without applying the patch, *originally* ResolveRecoveryConflictWithBufferPin() seems to have this issue. It enables deadlock_timeout timer so as to request for hot-standbfy backends to check themselves for deadlocks. But if we wake up out of ProcWaitForSignal() before deadlock_timeout is reached, the subsequent call to ResolveRecoveryConflictWithBufferPin() also uses deadlock_timeout again instead of "deadlock_timeout - elapsed time". So the request for deadlock check can be delayed. Furthermore, if ResolveRecoveryConflictWithBufferPin() always wakes up out of ProcWaitForSignal() before deadlock_timeout is reached, the request for deadlock check may also never happen infinitely. Maybe we should fix the original issue at first separately from the patch. > + bool maybe_log_conflict; > + bool maybe_update_title; > > Although it should be a matter of taste and I understand that the > "maybe" means that "that logging and changing of ps display may not > happen in this iteration" , that variables seem expressing > respectively "we should write log if the timeout for recovery conflict > expires" and "we should update title if 500ms elapsed". So they seem > *to me* better be just "log_conflict" and "update_title". > > I feel the same with "maybe_log_conflict" in ProcSleep(). I have no strong opinion about those names. So if other people also think so, I'm ok to rename them. > > > + for recovery conflicts. This is useful in determining if recovery > + conflicts prevent the recovery from applying WAL. > > (I'm not confident on this) Isn't the sentence better be in past or > present continuous tense? Could you tell me why you think that's better? >> BTW, attached is the POC patch that implements the idea discussed >> upthread; >> if log_recovery_conflict_waits is enabled, the startup process reports >> the log also after the recovery conflict was resolved and the startup >> process >> finished waiting for it. This patch needs to be applied after >> v11-0002-Log-the-standby-recovery-conflict-waits.patch is applied. > > Ah. I was just writing a comment about that. I haven't looked it > closer but it looks good to me. By the way doesn't it contains a > simple fix of a comment for the base patch? Yes, so the typo included in the base patch should be fixed when pushing it. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On 2020/12/15 15:40, Fujii Masao wrote: > > > On 2020/12/15 12:04, Kyotaro Horiguchi wrote: >> At Tue, 15 Dec 2020 02:00:21 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in >>> Thanks for the review! I'm thinking to wait half a day before >>> commiting >>> the patch just in the case someone may object the patch. >> >> Sorry for coming late. I have looked only the latest thread so I >> should be missing many things so please ignore if it was settled in >> the past discussion. >> >> >> It emits messages like the follows; >> >> [40509:startup] LOG: recovery still waiting after 1021.431 ms: recovery conflict on lock >> [40509:startup] DETAIL: Conflicting processes: 41171, 41194. >> [40509:startup] CONTEXT: WAL redo at 0/3013158 for Standby/LOCK: xid 510 db 13609 rel 16384 >> >> IFAIK DETAIL usually shows ordinary sentences so the first word is >> capitalized and ends with a period. But it is not a sentence so >> following period looks odd. a searcheing the tree for errdetails >> showed some anomalies. >> >> src/backend/parser/parse_param.c errdetail("%s versus %s", >> src/backend/jit/llvm/llvmjit_error.cpp errdetail("while in LLVM"))); >> src/backend/replication/logical/tablesync.c errdetail("The error was: %s", res->err))); >> src/backend/tcop/postgres.c errdetail("prepare: %s", pstmt->plansource->query_string); >> src/backend/tcop/postgres.c errdetail("abort reason: recovery conflict"); >> >> and one similar occurance: >> >> src/backend/utils/adt/dbsize.c errdetail("Invalid size unit: \"%s\".", strptr), >> >> I'm not sure, but it seems to me at least the period is unnecessary >> here. > > Since Error Message Style Guide in the docs says "Detail and hint messages: > Use complete sentences, and end each with a period.", I think that a period > is necessary here. No? > > >> >> >> + bool maybe_log_conflict = >> + (standbyWaitStart != 0 && !logged_recovery_conflict); >> >> odd indentation. > > This is the result of pgindent run. I'm not sure why pgindent indents > that way, but for now I'd like to follow pgindent. > > >> >> >> + /* Also, set the timer if necessary */ >> + if (logging_timer) >> + { >> + timeouts[cnt].id = STANDBY_LOCK_TIMEOUT; >> + timeouts[cnt].type = TMPARAM_AFTER; >> + timeouts[cnt].delay_ms = DeadlockTimeout; >> + cnt++; >> + } >> >> This doesn't consider spurious wakeup. I'm not sure it actually >> happenes but we usually consider that. That is since before this >> patch, but ProcWaitForSignal()'s comment says that: >> >>> * As this uses the generic process latch the caller has to be robust against >>> * unrelated wakeups: Always check that the desired state has occurred, and >>> * wait again if not. >> >> If we don't care of spurious wakeups, we should add such a comment. > > If ProcWaitForSignal() wakes up because of the reason (e.g., SIGHUP) > other than deadlock_timeout, ProcSleep() will call > ResolveRecoveryConflictWithLock() and we sleep on ProcWaitForSignal() > again since the recovery conflict has not been resolved yet. So we can > say that we consider "spurious wakeups"? > > However when I read the related code again, I found another issue in > the patch. In ResolveRecoveryConflictWithLock(), if SIGHUP causes us to > wake up out of ProcWaitForSignal() before deadlock_timeout is reached, > we will use deadlock_timeout again when sleeping on ProcWaitForSignal(). > Instead, probably we should use the "deadlock_timeout - elapsed time" > so that we can emit a log message as soon as deadlock_timeout passes > since starting waiting on recovery conflict. Otherwise it may take at most > "deadlock_timeout * 2" to log "still waiting ..." message. > > To fix this issue, "deadlock_timeout - elapsed time" needs to be used as > the timeout when enabling the timer at least in > ResolveRecoveryConflictWithLock() and ResolveRecoveryConflictWithBufferPin(). > Also similar change needs to be applied to > ResolveRecoveryConflictWithVirtualXIDs(). > > BTW, without applying the patch, *originally* > ResolveRecoveryConflictWithBufferPin() seems to have this issue. > It enables deadlock_timeout timer so as to request for hot-standbfy > backends to check themselves for deadlocks. But if we wake up out of > ProcWaitForSignal() before deadlock_timeout is reached, the subsequent > call to ResolveRecoveryConflictWithBufferPin() also uses deadlock_timeout > again instead of "deadlock_timeout - elapsed time". So the request for > deadlock check can be delayed. Furthermore, > if ResolveRecoveryConflictWithBufferPin() always wakes up out of > ProcWaitForSignal() before deadlock_timeout is reached, the request > for deadlock check may also never happen infinitely. > > Maybe we should fix the original issue at first separately from the patch. Hmm... commit ac22929a26 seems to make the thing worse. Before that commit, other wakeup request like SIGHUP didn't cause ProcWaitForSignal() to actually wake up in ResolveRecoveryConflictWithBufferPin(). Because such other wakeup requests use the different latch from that that ProcWaitForSignal() waits on. But commit ac22929a26 changed the startup process code so that they both use the same latch. Which could cause ProcWaitForSignal() to be more likely to wake up because of the requests other than deadlock_timeout. Maybe we need to revert commit ac22929a26. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
At Tue, 15 Dec 2020 15:40:03 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in > > > On 2020/12/15 12:04, Kyotaro Horiguchi wrote: > > [40509:startup] DETAIL: Conflicting processes: 41171, 41194. ... > > I'm not sure, but it seems to me at least the period is unnecessary > > here. > > Since Error Message Style Guide in the docs says "Detail and hint > messages: > Use complete sentences, and end each with a period.", I think that a > period > is necessary here. No? In the first place it is not a complete sentence. Might be better be something like this if we strictly follow the style guide? > Conflicting processes are 41171, 41194. > Conflicting processes are: 41171, 41194. > > > + bool maybe_log_conflict = > > + (standbyWaitStart != 0 && !logged_recovery_conflict); > > odd indentation. > > This is the result of pgindent run. I'm not sure why pgindent indents > that way, but for now I'd like to follow pgindent. Agreed. > > + /* Also, set the timer if necessary */ > > + if (logging_timer) > > + { > > + timeouts[cnt].id = STANDBY_LOCK_TIMEOUT; > > + timeouts[cnt].type = TMPARAM_AFTER; > > + timeouts[cnt].delay_ms = DeadlockTimeout; > > + cnt++; > > + } > > This doesn't consider spurious wakeup. I'm not sure it actually > > happenes but we usually consider that. That is since before this > > patch, but ProcWaitForSignal()'s comment says that: > > > >> * As this uses the generic process latch the caller has to be robust > >> * against > >> * unrelated wakeups: Always check that the desired state has occurred, > >> * and > >> * wait again if not. > > If we don't care of spurious wakeups, we should add such a comment. > > If ProcWaitForSignal() wakes up because of the reason (e.g., SIGHUP) > other than deadlock_timeout, ProcSleep() will call > ResolveRecoveryConflictWithLock() and we sleep on ProcWaitForSignal() > again since the recovery conflict has not been resolved yet. So we can > say that we consider "spurious wakeups"? So, the following seems to be spurious wakeups.. > However when I read the related code again, I found another issue in > the patch. In ResolveRecoveryConflictWithLock(), if SIGHUP causes us > to > wake up out of ProcWaitForSignal() before deadlock_timeout is reached, > we will use deadlock_timeout again when sleeping on > ProcWaitForSignal(). > Instead, probably we should use the "deadlock_timeout - elapsed time" > so that we can emit a log message as soon as deadlock_timeout passes > since starting waiting on recovery conflict. Otherwise it may take at > most > "deadlock_timeout * 2" to log "still waiting ..." message. > > To fix this issue, "deadlock_timeout - elapsed time" needs to be used > as > the timeout when enabling the timer at least in > ResolveRecoveryConflictWithLock() and > ResolveRecoveryConflictWithBufferPin(). > Also similar change needs to be applied to > ResolveRecoveryConflictWithVirtualXIDs(). > > BTW, without applying the patch, *originally* > ResolveRecoveryConflictWithBufferPin() seems to have this issue. > It enables deadlock_timeout timer so as to request for hot-standbfy > backends to check themselves for deadlocks. But if we wake up out of > ProcWaitForSignal() before deadlock_timeout is reached, the subsequent > call to ResolveRecoveryConflictWithBufferPin() also uses > deadlock_timeout > again instead of "deadlock_timeout - elapsed time". So the request for > deadlock check can be delayed. Furthermore, > if ResolveRecoveryConflictWithBufferPin() always wakes up out of > ProcWaitForSignal() before deadlock_timeout is reached, the request > for deadlock check may also never happen infinitely. > > Maybe we should fix the original issue at first separately from the > patch. Yeah, it's not an issue of this patch. > > + bool maybe_log_conflict; > > + bool maybe_update_title; > > Although it should be a matter of taste and I understand that the > > "maybe" means that "that logging and changing of ps display may not > > happen in this iteration" , that variables seem expressing > > respectively "we should write log if the timeout for recovery conflict > > expires" and "we should update title if 500ms elapsed". So they seem > > *to me* better be just "log_conflict" and "update_title". > > I feel the same with "maybe_log_conflict" in ProcSleep(). > > I have no strong opinion about those names. So if other people also > think so, I'm ok to rename them. Shorter is better as far as it makes sense and not-too abbreviated. > > + for recovery conflicts. This is useful in determining if recovery > > + conflicts prevent the recovery from applying WAL. > > (I'm not confident on this) Isn't the sentence better be in past or > > present continuous tense? > > Could you tell me why you think that's better? To make sure, I mentioned about the "prevent". The reason for that is it represents the status at the present or past. I don't insist on that if you don't think it's not better. > for recovery conflicts. This is useful in determining if recovery > conflicts are preventing the recovery from applying WAL. > >> BTW, attached is the POC patch that implements the idea discussed > >> upthread; > >> if log_recovery_conflict_waits is enabled, the startup process reports > >> the log also after the recovery conflict was resolved and the startup > >> process > >> finished waiting for it. This patch needs to be applied after > >> v11-0002-Log-the-standby-recovery-conflict-waits.patch is applied. > > Ah. I was just writing a comment about that. I haven't looked it > > closer but it looks good to me. By the way doesn't it contains a > > simple fix of a comment for the base patch? > > Yes, so the typo included in the base patch should be fixed when > pushing it. Understood. Thanks. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Mon, Dec 14, 2020 at 9:31 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > > > > On 2020/12/05 12:38, Masahiko Sawada wrote: > > On Fri, Dec 4, 2020 at 7:22 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > >> > >> Hi, > >> > >> On 12/4/20 2:21 AM, Fujii Masao wrote: > >>> > >>> On 2020/12/04 9:28, Masahiko Sawada wrote: > >>>> On Fri, Dec 4, 2020 at 2:54 AM Fujii Masao > >>>> <masao.fujii@oss.nttdata.com> wrote: > >>>>> > >>>>> > >>>>> > >>>>> On 2020/12/01 17:29, Drouvot, Bertrand wrote: > >>>>>> Hi, > >>>>>> > >>>>>> On 12/1/20 12:35 AM, Masahiko Sawada wrote: > >>>>>>> CAUTION: This email originated from outside of the organization. > >>>>>>> Do not click links or open attachments unless you can confirm the > >>>>>>> sender and know the content is safe. > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> On Tue, Dec 1, 2020 at 3:25 AM Alvaro Herrera > >>>>>>> <alvherre@alvh.no-ip.org> wrote: > >>>>>>>> On 2020-Dec-01, Fujii Masao wrote: > >>>>>>>> > >>>>>>>>> + if (proc) > >>>>>>>>> + { > >>>>>>>>> + if (nprocs == 0) > >>>>>>>>> + appendStringInfo(&buf, "%d", proc->pid); > >>>>>>>>> + else > >>>>>>>>> + appendStringInfo(&buf, ", %d", proc->pid); > >>>>>>>>> + > >>>>>>>>> + nprocs++; > >>>>>>>>> > >>>>>>>>> What happens if all the backends in wait_list have gone? In > >>>>>>>>> other words, > >>>>>>>>> how should we handle the case where nprocs == 0 (i.e., nprocs > >>>>>>>>> has not been > >>>>>>>>> incrmented at all)? This would very rarely happen, but can happen. > >>>>>>>>> In this case, since buf.data is empty, at least there seems no > >>>>>>>>> need to log > >>>>>>>>> the list of conflicting processes in detail message. > >>>>>>>> Yes, I noticed this too; this can be simplified by changing the > >>>>>>>> condition in the ereport() call to be "nprocs > 0" (rather than > >>>>>>>> wait_list being null), otherwise not print the errdetail. (You > >>>>>>>> could > >>>>>>>> test buf.data or buf.len instead, but that seems uglier to me.) > >>>>>>> +1 > >>>>>>> > >>>>>>> Maybe we can also improve the comment of this function from: > >>>>>>> > >>>>>>> + * This function also reports the details about the conflicting > >>>>>>> + * process ids if *wait_list is not NULL. > >>>>>>> > >>>>>>> to " This function also reports the details about the conflicting > >>>>>>> process ids if exist" or something. > >>>>>>> > >>>>>> Thank you all for the review/remarks. > >>>>>> > >>>>>> They have been addressed in the new attached patch version. > >>>>> > >>>>> Thanks for updating the patch! I read through the patch again > >>>>> and applied the following chages to it. Attached is the updated > >>>>> version of the patch. Could you review this version? If there is > >>>>> no issue in it, I'm thinking to commit this version. > >>>> > >>>> Thank you for updating the patch! I have one question. > >>>> > >>>>> > >>>>> + timeouts[cnt].id = STANDBY_TIMEOUT; > >>>>> + timeouts[cnt].type = TMPARAM_AFTER; > >>>>> + timeouts[cnt].delay_ms = DeadlockTimeout; > >>>>> > >>>>> Maybe STANDBY_TIMEOUT should be STANDBY_DEADLOCK_TIMEOUT here? > >>>>> I changed the code that way. > >>>> > >>>> As the comment of ResolveRecoveryConflictWithLock() says the > >>>> following, a deadlock is detected by the ordinary backend process: > >>>> > >>>> * Deadlocks involving the Startup process and an ordinary backend > >>>> proces > >>>> * will be detected by the deadlock detector within the ordinary > >>>> backend. > >>>> > >>>> If we use STANDBY_DEADLOCK_TIMEOUT, > >>>> SendRecoveryConflictWithBufferPin() will be called after > >>>> DeadlockTimeout passed, but I think it's not necessary for the startup > >>>> process in this case. > >>> > >>> Thanks for pointing this! You are right. > >>> > >>> > >>>> If we want to just wake up the startup process > >>>> maybe we can use STANDBY_TIMEOUT here? > >>> > >> Thanks for the patch updates! Except what we are still discussing below, > >> it looks good to me. > >> > >>> When STANDBY_TIMEOUT happens, a request to release conflicting buffer > >>> pins is sent. Right? If so, we should not also use STANDBY_TIMEOUT there? > >> > >> Agree > >> > >>> > >>> Or, first of all, we don't need to enable the deadlock timer at all? > >>> Since what we'd like to do is to wake up after deadlock_timeout > >>> passes, we can do that by changing ProcWaitForSignal() so that it can > >>> accept the timeout and giving the deadlock_timeout to it. If we do > >>> this, maybe we can get rid of STANDBY_LOCK_TIMEOUT from > >>> ResolveRecoveryConflictWithLock(). Thought? > > > > Where do we enable deadlock timeout in hot standby case? You meant to > > enable it in ProcWaitForSignal() or where we set a timer for not hot > > standby case, in ProcSleep()? > > No, what I tried to say is to change ResolveRecoveryConflictWithLock() so that it does > > 1. calculate the "minimum" timeout from deadlock_timeout and max_standby_xxx_delay > 2. give the calculated timeout value to ProcWaitForSignal() > 3. wait for signal and timeout on ProcWaitForSignal() > > Since ProcWaitForSignal() calls WaitLatch(), seems it's not so difficult to make ProcWaitForSignal() handle the timeout.If we do this, I was thinking that we can get rid of enable_timeouts() from ResolveRecoveryConflictWithLock(). Thank you for your explanation! That makes sense to me. Even if we don't have ProcWaitForSignal() handler the timeout perhaps we don't need to set two timeouts. As you mentioned, we can calculate the minimum timeout and set it (or nothing). Regards, -- Masahiko Sawada EnterpriseDB: https://www.enterprisedb.com/
On Wed, Dec 16, 2020 at 11:22 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > > At Tue, 15 Dec 2020 15:40:03 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in > > > > > > On 2020/12/15 12:04, Kyotaro Horiguchi wrote: > > > [40509:startup] DETAIL: Conflicting processes: 41171, 41194. > ... > > > I'm not sure, but it seems to me at least the period is unnecessary > > > here. > > > > Since Error Message Style Guide in the docs says "Detail and hint > > messages: > > Use complete sentences, and end each with a period.", I think that a > > period > > is necessary here. No? > > In the first place it is not a complete sentence. Might be better be > something like this if we strictly follow the style guide? FWIW I borrowed the message style in errdetail from log messages in ProcSleep(): (errmsg("process %d still waiting for %s on %s after %ld.%03d ms", MyProcPid, modename, buf.data, msecs, usecs), (errdetail_log_plural("Process holding the lock: %s. Wait queue: %s.", "Processes holding the lock: %s. Wait queue: %s.", lockHoldersNum, lock_holders_sbuf.data, lock_waiters_sbuf.data)))); > > Conflicting processes are 41171, 41194. > > Conflicting processes are: 41171, 41194. If we use the above message we might want to change other similar messages I exemplified as well. Regards, -- Masahiko Sawada EnterpriseDB: https://www.enterprisedb.com/
At Wed, 16 Dec 2020 12:08:31 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in > On Wed, Dec 16, 2020 at 11:22 AM Kyotaro Horiguchi > <horikyota.ntt@gmail.com> wrote: > > > > At Tue, 15 Dec 2020 15:40:03 +0900, Fujii Masao > <masao.fujii@oss.nttdata.com> wrote in > > > > > > > > > On 2020/12/15 12:04, Kyotaro Horiguchi wrote: > > > > [40509:startup] DETAIL: Conflicting processes: 41171, 41194. > > ... > > > > I'm not sure, but it seems to me at least the period is > unnecessary > > > > here. > > > > > > Since Error Message Style Guide in the docs says "Detail and > hint > > > messages: > > > Use complete sentences, and end each with a period.", I think > that a > > > period > > > is necessary here. No? > > > > In the first place it is not a complete sentence. Might be better > be > > something like this if we strictly follow the style guide? > > FWIW I borrowed the message style in errdetail from log messages in > ProcSleep(): > (errmsg("process %d still waiting for %s on %s after %ld.%03d ms", > MyProcPid, modename, buf.data, msecs, usecs), > (errdetail_log_plural("Process holding the lock: %s. Wait queue: > %s.", > "Processes holding the lock: %s. Wait queue: > %s.", > lockHoldersNum, lock_holders_sbuf.data, > lock_waiters_sbuf.data)))); I was guessing that was the case. > > > Conflicting processes are 41171, 41194. > > > Conflicting processes are: 41171, 41194. Or I came up with the following after scanning throught the tree. | Some processes are conflicting: 41171, 41194. > If we use the above message we might want to change other similar > messages I exemplified as well. I'm not sure what should we do for other anomalies. Other errdetails of this category (incomplete sentences or the absence of a period) I found are: -- period is absent pgarch.c�596: errdetail("The failed archive command was: %s", postmaster.c�3723: errdetail("Failed process was running: %s", matview.c�654: errdetail("Row: %s", tablecmds.c�2371: errdetail("%s versus %s", tablecmds.c�11512: errdetail("%s depends on column \"%s\"", subscriptioncmds.c�1081: errdetail("The error was: %s", err), tablesync.c�918: errdetail("The error was: %s", res->err))); be-secure-openssl.c�235: errdetail("\"%s\" cannot be higher than \"%s\"", auth.c�1314: errdetail_internal("SSPI error %x", (unsigned int) r))); auth.c�2854: errdetail("LDAP diagnostics: %s", message); pl_exec.c�4386: errdetail_internal("parameters: %s", errdetail) : 0)); postgres.c�2401: errdetail("prepare: %s", pstmt->plansource->query_string); -- having a period. proc.c�1479: errdetail_log_plural("Process holding the lock: %s. Wait queue: %s.", pl_handler.c�106: GUC_check_errdetail("Unrecognized key word: \"%s\".", tok); Although it dpends on the precise criteria of how they are extracted, it seems that the absense of a period is more major. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
At Wed, 16 Dec 2020 14:49:25 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in > > > > Conflicting processes are 41171, 41194. > > > > Conflicting processes are: 41171, 41194. > > Or I came up with the following after scanning throught the tree. > > | Some processes are conflicting: 41171, 41194. Or Some processes are blocking recovery progress: 41171, 41194. ? regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On 2020/12/15 0:20, Fujii Masao wrote: > > > On 2020/12/14 21:31, Fujii Masao wrote: >> >> >> On 2020/12/05 12:38, Masahiko Sawada wrote: >>> On Fri, Dec 4, 2020 at 7:22 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote: >>>> >>>> Hi, >>>> >>>> On 12/4/20 2:21 AM, Fujii Masao wrote: >>>>> >>>>> On 2020/12/04 9:28, Masahiko Sawada wrote: >>>>>> On Fri, Dec 4, 2020 at 2:54 AM Fujii Masao >>>>>> <masao.fujii@oss.nttdata.com> wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 2020/12/01 17:29, Drouvot, Bertrand wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> On 12/1/20 12:35 AM, Masahiko Sawada wrote: >>>>>>>>> CAUTION: This email originated from outside of the organization. >>>>>>>>> Do not click links or open attachments unless you can confirm the >>>>>>>>> sender and know the content is safe. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Tue, Dec 1, 2020 at 3:25 AM Alvaro Herrera >>>>>>>>> <alvherre@alvh.no-ip.org> wrote: >>>>>>>>>> On 2020-Dec-01, Fujii Masao wrote: >>>>>>>>>> >>>>>>>>>>> + if (proc) >>>>>>>>>>> + { >>>>>>>>>>> + if (nprocs == 0) >>>>>>>>>>> + appendStringInfo(&buf, "%d", proc->pid); >>>>>>>>>>> + else >>>>>>>>>>> + appendStringInfo(&buf, ", %d", proc->pid); >>>>>>>>>>> + >>>>>>>>>>> + nprocs++; >>>>>>>>>>> >>>>>>>>>>> What happens if all the backends in wait_list have gone? In >>>>>>>>>>> other words, >>>>>>>>>>> how should we handle the case where nprocs == 0 (i.e., nprocs >>>>>>>>>>> has not been >>>>>>>>>>> incrmented at all)? This would very rarely happen, but can happen. >>>>>>>>>>> In this case, since buf.data is empty, at least there seems no >>>>>>>>>>> need to log >>>>>>>>>>> the list of conflicting processes in detail message. >>>>>>>>>> Yes, I noticed this too; this can be simplified by changing the >>>>>>>>>> condition in the ereport() call to be "nprocs > 0" (rather than >>>>>>>>>> wait_list being null), otherwise not print the errdetail. (You >>>>>>>>>> could >>>>>>>>>> test buf.data or buf.len instead, but that seems uglier to me.) >>>>>>>>> +1 >>>>>>>>> >>>>>>>>> Maybe we can also improve the comment of this function from: >>>>>>>>> >>>>>>>>> + * This function also reports the details about the conflicting >>>>>>>>> + * process ids if *wait_list is not NULL. >>>>>>>>> >>>>>>>>> to " This function also reports the details about the conflicting >>>>>>>>> process ids if exist" or something. >>>>>>>>> >>>>>>>> Thank you all for the review/remarks. >>>>>>>> >>>>>>>> They have been addressed in the new attached patch version. >>>>>>> >>>>>>> Thanks for updating the patch! I read through the patch again >>>>>>> and applied the following chages to it. Attached is the updated >>>>>>> version of the patch. Could you review this version? If there is >>>>>>> no issue in it, I'm thinking to commit this version. >>>>>> >>>>>> Thank you for updating the patch! I have one question. >>>>>> >>>>>>> >>>>>>> + timeouts[cnt].id = STANDBY_TIMEOUT; >>>>>>> + timeouts[cnt].type = TMPARAM_AFTER; >>>>>>> + timeouts[cnt].delay_ms = DeadlockTimeout; >>>>>>> >>>>>>> Maybe STANDBY_TIMEOUT should be STANDBY_DEADLOCK_TIMEOUT here? >>>>>>> I changed the code that way. >>>>>> >>>>>> As the comment of ResolveRecoveryConflictWithLock() says the >>>>>> following, a deadlock is detected by the ordinary backend process: >>>>>> >>>>>> * Deadlocks involving the Startup process and an ordinary backend >>>>>> proces >>>>>> * will be detected by the deadlock detector within the ordinary >>>>>> backend. >>>>>> >>>>>> If we use STANDBY_DEADLOCK_TIMEOUT, >>>>>> SendRecoveryConflictWithBufferPin() will be called after >>>>>> DeadlockTimeout passed, but I think it's not necessary for the startup >>>>>> process in this case. >>>>> >>>>> Thanks for pointing this! You are right. >>>>> >>>>> >>>>>> If we want to just wake up the startup process >>>>>> maybe we can use STANDBY_TIMEOUT here? >>>>> >>>> Thanks for the patch updates! Except what we are still discussing below, >>>> it looks good to me. >>>> >>>>> When STANDBY_TIMEOUT happens, a request to release conflicting buffer >>>>> pins is sent. Right? If so, we should not also use STANDBY_TIMEOUT there? >>>> >>>> Agree >>>> >>>>> >>>>> Or, first of all, we don't need to enable the deadlock timer at all? >>>>> Since what we'd like to do is to wake up after deadlock_timeout >>>>> passes, we can do that by changing ProcWaitForSignal() so that it can >>>>> accept the timeout and giving the deadlock_timeout to it. If we do >>>>> this, maybe we can get rid of STANDBY_LOCK_TIMEOUT from >>>>> ResolveRecoveryConflictWithLock(). Thought? >>> >>> Where do we enable deadlock timeout in hot standby case? You meant to >>> enable it in ProcWaitForSignal() or where we set a timer for not hot >>> standby case, in ProcSleep()? >> >> No, what I tried to say is to change ResolveRecoveryConflictWithLock() so that it does >> >> 1. calculate the "minimum" timeout from deadlock_timeout and max_standby_xxx_delay >> 2. give the calculated timeout value to ProcWaitForSignal() >> 3. wait for signal and timeout on ProcWaitForSignal() >> >> Since ProcWaitForSignal() calls WaitLatch(), seems it's not so difficult to make ProcWaitForSignal() handle the timeout.If we do this, I was thinking that we can get rid of enable_timeouts() from ResolveRecoveryConflictWithLock(). >> >> >>> >>>> >>>> Why not simply use (again) the STANDBY_LOCK_TIMEOUT one? (as it triggers >>>> a call to StandbyLockTimeoutHandler() which does nothing, except waking >>>> up. That's what we want, right?) >>> >>> Right, what I wanted to mean is STANDBY_LOCK_TIMEOUT. The startup >>> process can wake up and do nothing. Thank you for pointing out. >> >> Okay, understood! Firstly I was thinking that enabling the same type (i.e., STANDBY_LOCK_TIMEOUT) of lock twice doesn'twork properly, but as far as I read the code, it works. In that case, only the shorter timeout would be activatedin enable_timeouts(). So I agree to use STANDBY_LOCK_TIMEOUT. > > So I renamed the argument "deadlock_timer" in ResolveRecoveryConflictWithLock() > because it's not the timer for deadlock and is confusing. Attached is the > updated version of the patch. Barring any objection, I will commit this version. Since the recent commit 8900b5a9d5 changed the recovery conflict code, I updated the patch. Attached is the updated version of the patch. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Attachment
Hi, On 1/6/21 6:31 PM, Fujii Masao wrote: > CAUTION: This email originated from outside of the organization. Do > not click links or open attachments unless you can confirm the sender > and know the content is safe. > > > > On 2020/12/15 0:20, Fujii Masao wrote: >> >> >> On 2020/12/14 21:31, Fujii Masao wrote: >>> >>> >>> On 2020/12/05 12:38, Masahiko Sawada wrote: >>>> On Fri, Dec 4, 2020 at 7:22 PM Drouvot, Bertrand >>>> <bdrouvot@amazon.com> wrote: >>>>> >>>>> Hi, >>>>> >>>>> On 12/4/20 2:21 AM, Fujii Masao wrote: >>>>>> >>>>>> On 2020/12/04 9:28, Masahiko Sawada wrote: >>>>>>> On Fri, Dec 4, 2020 at 2:54 AM Fujii Masao >>>>>>> <masao.fujii@oss.nttdata.com> wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 2020/12/01 17:29, Drouvot, Bertrand wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> On 12/1/20 12:35 AM, Masahiko Sawada wrote: >>>>>>>>>> CAUTION: This email originated from outside of the organization. >>>>>>>>>> Do not click links or open attachments unless you can confirm >>>>>>>>>> the >>>>>>>>>> sender and know the content is safe. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Tue, Dec 1, 2020 at 3:25 AM Alvaro Herrera >>>>>>>>>> <alvherre@alvh.no-ip.org> wrote: >>>>>>>>>>> On 2020-Dec-01, Fujii Masao wrote: >>>>>>>>>>> >>>>>>>>>>>> + if (proc) >>>>>>>>>>>> + { >>>>>>>>>>>> + if (nprocs == 0) >>>>>>>>>>>> + appendStringInfo(&buf, "%d", proc->pid); >>>>>>>>>>>> + else >>>>>>>>>>>> + appendStringInfo(&buf, ", %d", proc->pid); >>>>>>>>>>>> + >>>>>>>>>>>> + nprocs++; >>>>>>>>>>>> >>>>>>>>>>>> What happens if all the backends in wait_list have gone? In >>>>>>>>>>>> other words, >>>>>>>>>>>> how should we handle the case where nprocs == 0 (i.e., nprocs >>>>>>>>>>>> has not been >>>>>>>>>>>> incrmented at all)? This would very rarely happen, but can >>>>>>>>>>>> happen. >>>>>>>>>>>> In this case, since buf.data is empty, at least there seems no >>>>>>>>>>>> need to log >>>>>>>>>>>> the list of conflicting processes in detail message. >>>>>>>>>>> Yes, I noticed this too; this can be simplified by changing the >>>>>>>>>>> condition in the ereport() call to be "nprocs > 0" (rather than >>>>>>>>>>> wait_list being null), otherwise not print the errdetail. (You >>>>>>>>>>> could >>>>>>>>>>> test buf.data or buf.len instead, but that seems uglier to me.) >>>>>>>>>> +1 >>>>>>>>>> >>>>>>>>>> Maybe we can also improve the comment of this function from: >>>>>>>>>> >>>>>>>>>> + * This function also reports the details about the conflicting >>>>>>>>>> + * process ids if *wait_list is not NULL. >>>>>>>>>> >>>>>>>>>> to " This function also reports the details about the >>>>>>>>>> conflicting >>>>>>>>>> process ids if exist" or something. >>>>>>>>>> >>>>>>>>> Thank you all for the review/remarks. >>>>>>>>> >>>>>>>>> They have been addressed in the new attached patch version. >>>>>>>> >>>>>>>> Thanks for updating the patch! I read through the patch again >>>>>>>> and applied the following chages to it. Attached is the updated >>>>>>>> version of the patch. Could you review this version? If there is >>>>>>>> no issue in it, I'm thinking to commit this version. >>>>>>> >>>>>>> Thank you for updating the patch! I have one question. >>>>>>> >>>>>>>> >>>>>>>> + timeouts[cnt].id = STANDBY_TIMEOUT; >>>>>>>> + timeouts[cnt].type = TMPARAM_AFTER; >>>>>>>> + timeouts[cnt].delay_ms = DeadlockTimeout; >>>>>>>> >>>>>>>> Maybe STANDBY_TIMEOUT should be STANDBY_DEADLOCK_TIMEOUT here? >>>>>>>> I changed the code that way. >>>>>>> >>>>>>> As the comment of ResolveRecoveryConflictWithLock() says the >>>>>>> following, a deadlock is detected by the ordinary backend process: >>>>>>> >>>>>>> * Deadlocks involving the Startup process and an ordinary >>>>>>> backend >>>>>>> proces >>>>>>> * will be detected by the deadlock detector within the ordinary >>>>>>> backend. >>>>>>> >>>>>>> If we use STANDBY_DEADLOCK_TIMEOUT, >>>>>>> SendRecoveryConflictWithBufferPin() will be called after >>>>>>> DeadlockTimeout passed, but I think it's not necessary for the >>>>>>> startup >>>>>>> process in this case. >>>>>> >>>>>> Thanks for pointing this! You are right. >>>>>> >>>>>> >>>>>>> If we want to just wake up the startup process >>>>>>> maybe we can use STANDBY_TIMEOUT here? >>>>>> >>>>> Thanks for the patch updates! Except what we are still discussing >>>>> below, >>>>> it looks good to me. >>>>> >>>>>> When STANDBY_TIMEOUT happens, a request to release conflicting >>>>>> buffer >>>>>> pins is sent. Right? If so, we should not also use >>>>>> STANDBY_TIMEOUT there? >>>>> >>>>> Agree >>>>> >>>>>> >>>>>> Or, first of all, we don't need to enable the deadlock timer at all? >>>>>> Since what we'd like to do is to wake up after deadlock_timeout >>>>>> passes, we can do that by changing ProcWaitForSignal() so that it >>>>>> can >>>>>> accept the timeout and giving the deadlock_timeout to it. If we do >>>>>> this, maybe we can get rid of STANDBY_LOCK_TIMEOUT from >>>>>> ResolveRecoveryConflictWithLock(). Thought? >>>> >>>> Where do we enable deadlock timeout in hot standby case? You meant to >>>> enable it in ProcWaitForSignal() or where we set a timer for not hot >>>> standby case, in ProcSleep()? >>> >>> No, what I tried to say is to change >>> ResolveRecoveryConflictWithLock() so that it does >>> >>> 1. calculate the "minimum" timeout from deadlock_timeout and >>> max_standby_xxx_delay >>> 2. give the calculated timeout value to ProcWaitForSignal() >>> 3. wait for signal and timeout on ProcWaitForSignal() >>> >>> Since ProcWaitForSignal() calls WaitLatch(), seems it's not so >>> difficult to make ProcWaitForSignal() handle the timeout. If we do >>> this, I was thinking that we can get rid of enable_timeouts() from >>> ResolveRecoveryConflictWithLock(). >>> >>> >>>> >>>>> >>>>> Why not simply use (again) the STANDBY_LOCK_TIMEOUT one? (as it >>>>> triggers >>>>> a call to StandbyLockTimeoutHandler() which does nothing, except >>>>> waking >>>>> up. That's what we want, right?) >>>> >>>> Right, what I wanted to mean is STANDBY_LOCK_TIMEOUT. The startup >>>> process can wake up and do nothing. Thank you for pointing out. >>> >>> Okay, understood! Firstly I was thinking that enabling the same type >>> (i.e., STANDBY_LOCK_TIMEOUT) of lock twice doesn't work properly, >>> but as far as I read the code, it works. In that case, only the >>> shorter timeout would be activated in enable_timeouts(). So I agree >>> to use STANDBY_LOCK_TIMEOUT. >> >> So I renamed the argument "deadlock_timer" in >> ResolveRecoveryConflictWithLock() >> because it's not the timer for deadlock and is confusing. Attached is >> the >> updated version of the patch. Barring any objection, I will commit >> this version. > > Since the recent commit 8900b5a9d5 changed the recovery conflict code, > I updated the patch. Attached is the updated version of the patch. > Thanks for those updates! I had a look and the patch does look good to me. As far the other threads regarding: - "maybe_log_conflict" and "maybe_update_title" naming: I don’t have strong opinions about it but I am more inclined to stay with the “maybe” naming (as it is currently in this patch version) as it better reflects that this may or not occur. - the errdetail log message format in LogRecoveryConflict() (currently looks like “Conflicting process: 25118.”) : I don’t have strong opinions about it but I am more inclined to stay as it is, as it looks similar as the format being used in ProcSleep() (even if we can find different formats in other places though). Bertrand
On 2021/01/07 22:39, Drouvot, Bertrand wrote: > Hi, > > On 1/6/21 6:31 PM, Fujii Masao wrote: >> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you canconfirm the sender and know the content is safe. >> >> >> >> On 2020/12/15 0:20, Fujii Masao wrote: >>> >>> >>> On 2020/12/14 21:31, Fujii Masao wrote: >>>> >>>> >>>> On 2020/12/05 12:38, Masahiko Sawada wrote: >>>>> On Fri, Dec 4, 2020 at 7:22 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> On 12/4/20 2:21 AM, Fujii Masao wrote: >>>>>>> >>>>>>> On 2020/12/04 9:28, Masahiko Sawada wrote: >>>>>>>> On Fri, Dec 4, 2020 at 2:54 AM Fujii Masao >>>>>>>> <masao.fujii@oss.nttdata.com> wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On 2020/12/01 17:29, Drouvot, Bertrand wrote: >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> On 12/1/20 12:35 AM, Masahiko Sawada wrote: >>>>>>>>>>> CAUTION: This email originated from outside of the organization. >>>>>>>>>>> Do not click links or open attachments unless you can confirm the >>>>>>>>>>> sender and know the content is safe. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Tue, Dec 1, 2020 at 3:25 AM Alvaro Herrera >>>>>>>>>>> <alvherre@alvh.no-ip.org> wrote: >>>>>>>>>>>> On 2020-Dec-01, Fujii Masao wrote: >>>>>>>>>>>> >>>>>>>>>>>>> + if (proc) >>>>>>>>>>>>> + { >>>>>>>>>>>>> + if (nprocs == 0) >>>>>>>>>>>>> + appendStringInfo(&buf, "%d", proc->pid); >>>>>>>>>>>>> + else >>>>>>>>>>>>> + appendStringInfo(&buf, ", %d", proc->pid); >>>>>>>>>>>>> + >>>>>>>>>>>>> + nprocs++; >>>>>>>>>>>>> >>>>>>>>>>>>> What happens if all the backends in wait_list have gone? In >>>>>>>>>>>>> other words, >>>>>>>>>>>>> how should we handle the case where nprocs == 0 (i.e., nprocs >>>>>>>>>>>>> has not been >>>>>>>>>>>>> incrmented at all)? This would very rarely happen, but can happen. >>>>>>>>>>>>> In this case, since buf.data is empty, at least there seems no >>>>>>>>>>>>> need to log >>>>>>>>>>>>> the list of conflicting processes in detail message. >>>>>>>>>>>> Yes, I noticed this too; this can be simplified by changing the >>>>>>>>>>>> condition in the ereport() call to be "nprocs > 0" (rather than >>>>>>>>>>>> wait_list being null), otherwise not print the errdetail. (You >>>>>>>>>>>> could >>>>>>>>>>>> test buf.data or buf.len instead, but that seems uglier to me.) >>>>>>>>>>> +1 >>>>>>>>>>> >>>>>>>>>>> Maybe we can also improve the comment of this function from: >>>>>>>>>>> >>>>>>>>>>> + * This function also reports the details about the conflicting >>>>>>>>>>> + * process ids if *wait_list is not NULL. >>>>>>>>>>> >>>>>>>>>>> to " This function also reports the details about the conflicting >>>>>>>>>>> process ids if exist" or something. >>>>>>>>>>> >>>>>>>>>> Thank you all for the review/remarks. >>>>>>>>>> >>>>>>>>>> They have been addressed in the new attached patch version. >>>>>>>>> >>>>>>>>> Thanks for updating the patch! I read through the patch again >>>>>>>>> and applied the following chages to it. Attached is the updated >>>>>>>>> version of the patch. Could you review this version? If there is >>>>>>>>> no issue in it, I'm thinking to commit this version. >>>>>>>> >>>>>>>> Thank you for updating the patch! I have one question. >>>>>>>> >>>>>>>>> >>>>>>>>> + timeouts[cnt].id = STANDBY_TIMEOUT; >>>>>>>>> + timeouts[cnt].type = TMPARAM_AFTER; >>>>>>>>> + timeouts[cnt].delay_ms = DeadlockTimeout; >>>>>>>>> >>>>>>>>> Maybe STANDBY_TIMEOUT should be STANDBY_DEADLOCK_TIMEOUT here? >>>>>>>>> I changed the code that way. >>>>>>>> >>>>>>>> As the comment of ResolveRecoveryConflictWithLock() says the >>>>>>>> following, a deadlock is detected by the ordinary backend process: >>>>>>>> >>>>>>>> * Deadlocks involving the Startup process and an ordinary backend >>>>>>>> proces >>>>>>>> * will be detected by the deadlock detector within the ordinary >>>>>>>> backend. >>>>>>>> >>>>>>>> If we use STANDBY_DEADLOCK_TIMEOUT, >>>>>>>> SendRecoveryConflictWithBufferPin() will be called after >>>>>>>> DeadlockTimeout passed, but I think it's not necessary for the startup >>>>>>>> process in this case. >>>>>>> >>>>>>> Thanks for pointing this! You are right. >>>>>>> >>>>>>> >>>>>>>> If we want to just wake up the startup process >>>>>>>> maybe we can use STANDBY_TIMEOUT here? >>>>>>> >>>>>> Thanks for the patch updates! Except what we are still discussing below, >>>>>> it looks good to me. >>>>>> >>>>>>> When STANDBY_TIMEOUT happens, a request to release conflicting buffer >>>>>>> pins is sent. Right? If so, we should not also use STANDBY_TIMEOUT there? >>>>>> >>>>>> Agree >>>>>> >>>>>>> >>>>>>> Or, first of all, we don't need to enable the deadlock timer at all? >>>>>>> Since what we'd like to do is to wake up after deadlock_timeout >>>>>>> passes, we can do that by changing ProcWaitForSignal() so that it can >>>>>>> accept the timeout and giving the deadlock_timeout to it. If we do >>>>>>> this, maybe we can get rid of STANDBY_LOCK_TIMEOUT from >>>>>>> ResolveRecoveryConflictWithLock(). Thought? >>>>> >>>>> Where do we enable deadlock timeout in hot standby case? You meant to >>>>> enable it in ProcWaitForSignal() or where we set a timer for not hot >>>>> standby case, in ProcSleep()? >>>> >>>> No, what I tried to say is to change ResolveRecoveryConflictWithLock() so that it does >>>> >>>> 1. calculate the "minimum" timeout from deadlock_timeout and max_standby_xxx_delay >>>> 2. give the calculated timeout value to ProcWaitForSignal() >>>> 3. wait for signal and timeout on ProcWaitForSignal() >>>> >>>> Since ProcWaitForSignal() calls WaitLatch(), seems it's not so difficult to make ProcWaitForSignal() handle the timeout.If we do this, I was thinking that we can get rid of enable_timeouts() from ResolveRecoveryConflictWithLock(). >>>> >>>> >>>>> >>>>>> >>>>>> Why not simply use (again) the STANDBY_LOCK_TIMEOUT one? (as it triggers >>>>>> a call to StandbyLockTimeoutHandler() which does nothing, except waking >>>>>> up. That's what we want, right?) >>>>> >>>>> Right, what I wanted to mean is STANDBY_LOCK_TIMEOUT. The startup >>>>> process can wake up and do nothing. Thank you for pointing out. >>>> >>>> Okay, understood! Firstly I was thinking that enabling the same type (i.e., STANDBY_LOCK_TIMEOUT) of lock twice doesn'twork properly, but as far as I read the code, it works. In that case, only the shorter timeout would be activatedin enable_timeouts(). So I agree to use STANDBY_LOCK_TIMEOUT. >>> >>> So I renamed the argument "deadlock_timer" in ResolveRecoveryConflictWithLock() >>> because it's not the timer for deadlock and is confusing. Attached is the >>> updated version of the patch. Barring any objection, I will commit this version. >> >> Since the recent commit 8900b5a9d5 changed the recovery conflict code, >> I updated the patch. Attached is the updated version of the patch. >> > Thanks for those updates! > > I had a look and the patch does look good to me. Thanks for the review! I pushed the latest patch. > > As far the other threads regarding: > > - "maybe_log_conflict" and "maybe_update_title" naming: I don’t have strong opinions about it but I am more inclined tostay with the “maybe” naming (as it is currently in this patch version) as it better reflects that this may or not occur. > - the errdetail log message format in LogRecoveryConflict() (currently looks like “Conflicting process: 25118.”) : I don’thave strong opinions about it but I am more inclined to stay as it is, as it looks similar as the format being usedin ProcSleep() (even if we can find different formats in other places though). Agreed. And if we come up with better idea about those topics, we can improve the code later. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On 2020/12/15 2:00, Fujii Masao wrote: > > > On 2020/12/15 0:49, Drouvot, Bertrand wrote: >> Hi, >> >> On 12/14/20 4:20 PM, Fujii Masao wrote: >>> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you canconfirm the sender and know the content is safe. >>> >>> >>> >>> On 2020/12/14 21:31, Fujii Masao wrote: >>>> >>>> >>>> On 2020/12/05 12:38, Masahiko Sawada wrote: >>>>> On Fri, Dec 4, 2020 at 7:22 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> On 12/4/20 2:21 AM, Fujii Masao wrote: >>>>>>> >>>>>>> On 2020/12/04 9:28, Masahiko Sawada wrote: >>>>>>>> On Fri, Dec 4, 2020 at 2:54 AM Fujii Masao >>>>>>>> <masao.fujii@oss.nttdata.com> wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On 2020/12/01 17:29, Drouvot, Bertrand wrote: >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> On 12/1/20 12:35 AM, Masahiko Sawada wrote: >>>>>>>>>>> CAUTION: This email originated from outside of the organization. >>>>>>>>>>> Do not click links or open attachments unless you can confirm the >>>>>>>>>>> sender and know the content is safe. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Tue, Dec 1, 2020 at 3:25 AM Alvaro Herrera >>>>>>>>>>> <alvherre@alvh.no-ip.org> wrote: >>>>>>>>>>>> On 2020-Dec-01, Fujii Masao wrote: >>>>>>>>>>>> >>>>>>>>>>>>> + if (proc) >>>>>>>>>>>>> + { >>>>>>>>>>>>> + if (nprocs == 0) >>>>>>>>>>>>> + appendStringInfo(&buf, "%d", proc->pid); >>>>>>>>>>>>> + else >>>>>>>>>>>>> + appendStringInfo(&buf, ", %d", proc->pid); >>>>>>>>>>>>> + >>>>>>>>>>>>> + nprocs++; >>>>>>>>>>>>> >>>>>>>>>>>>> What happens if all the backends in wait_list have gone? In >>>>>>>>>>>>> other words, >>>>>>>>>>>>> how should we handle the case where nprocs == 0 (i.e., nprocs >>>>>>>>>>>>> has not been >>>>>>>>>>>>> incrmented at all)? This would very rarely happen, but can happen. >>>>>>>>>>>>> In this case, since buf.data is empty, at least there seems no >>>>>>>>>>>>> need to log >>>>>>>>>>>>> the list of conflicting processes in detail message. >>>>>>>>>>>> Yes, I noticed this too; this can be simplified by changing the >>>>>>>>>>>> condition in the ereport() call to be "nprocs > 0" (rather than >>>>>>>>>>>> wait_list being null), otherwise not print the errdetail. (You >>>>>>>>>>>> could >>>>>>>>>>>> test buf.data or buf.len instead, but that seems uglier to me.) >>>>>>>>>>> +1 >>>>>>>>>>> >>>>>>>>>>> Maybe we can also improve the comment of this function from: >>>>>>>>>>> >>>>>>>>>>> + * This function also reports the details about the conflicting >>>>>>>>>>> + * process ids if *wait_list is not NULL. >>>>>>>>>>> >>>>>>>>>>> to " This function also reports the details about the conflicting >>>>>>>>>>> process ids if exist" or something. >>>>>>>>>>> >>>>>>>>>> Thank you all for the review/remarks. >>>>>>>>>> >>>>>>>>>> They have been addressed in the new attached patch version. >>>>>>>>> >>>>>>>>> Thanks for updating the patch! I read through the patch again >>>>>>>>> and applied the following chages to it. Attached is the updated >>>>>>>>> version of the patch. Could you review this version? If there is >>>>>>>>> no issue in it, I'm thinking to commit this version. >>>>>>>> >>>>>>>> Thank you for updating the patch! I have one question. >>>>>>>> >>>>>>>>> >>>>>>>>> + timeouts[cnt].id = STANDBY_TIMEOUT; >>>>>>>>> + timeouts[cnt].type = TMPARAM_AFTER; >>>>>>>>> + timeouts[cnt].delay_ms = DeadlockTimeout; >>>>>>>>> >>>>>>>>> Maybe STANDBY_TIMEOUT should be STANDBY_DEADLOCK_TIMEOUT here? >>>>>>>>> I changed the code that way. >>>>>>>> >>>>>>>> As the comment of ResolveRecoveryConflictWithLock() says the >>>>>>>> following, a deadlock is detected by the ordinary backend process: >>>>>>>> >>>>>>>> * Deadlocks involving the Startup process and an ordinary backend >>>>>>>> proces >>>>>>>> * will be detected by the deadlock detector within the ordinary >>>>>>>> backend. >>>>>>>> >>>>>>>> If we use STANDBY_DEADLOCK_TIMEOUT, >>>>>>>> SendRecoveryConflictWithBufferPin() will be called after >>>>>>>> DeadlockTimeout passed, but I think it's not necessary for the startup >>>>>>>> process in this case. >>>>>>> >>>>>>> Thanks for pointing this! You are right. >>>>>>> >>>>>>> >>>>>>>> If we want to just wake up the startup process >>>>>>>> maybe we can use STANDBY_TIMEOUT here? >>>>>>> >>>>>> Thanks for the patch updates! Except what we are still discussing below, >>>>>> it looks good to me. >>>>>> >>>>>>> When STANDBY_TIMEOUT happens, a request to release conflicting buffer >>>>>>> pins is sent. Right? If so, we should not also use STANDBY_TIMEOUT there? >>>>>> >>>>>> Agree >>>>>> >>>>>>> >>>>>>> Or, first of all, we don't need to enable the deadlock timer at all? >>>>>>> Since what we'd like to do is to wake up after deadlock_timeout >>>>>>> passes, we can do that by changing ProcWaitForSignal() so that it can >>>>>>> accept the timeout and giving the deadlock_timeout to it. If we do >>>>>>> this, maybe we can get rid of STANDBY_LOCK_TIMEOUT from >>>>>>> ResolveRecoveryConflictWithLock(). Thought? >>>>> >>>>> Where do we enable deadlock timeout in hot standby case? You meant to >>>>> enable it in ProcWaitForSignal() or where we set a timer for not hot >>>>> standby case, in ProcSleep()? >>>> >>>> No, what I tried to say is to change ResolveRecoveryConflictWithLock() so that it does >>>> >>>> 1. calculate the "minimum" timeout from deadlock_timeout and max_standby_xxx_delay >>>> 2. give the calculated timeout value to ProcWaitForSignal() >>>> 3. wait for signal and timeout on ProcWaitForSignal() >>>> >>>> Since ProcWaitForSignal() calls WaitLatch(), seems it's not so difficult to make ProcWaitForSignal() handle the timeout.If we do this, I was thinking that we can get rid of enable_timeouts() from ResolveRecoveryConflictWithLock(). >>>> >>>> >>>>> >>>>>> >>>>>> Why not simply use (again) the STANDBY_LOCK_TIMEOUT one? (as it triggers >>>>>> a call to StandbyLockTimeoutHandler() which does nothing, except waking >>>>>> up. That's what we want, right?) >>>>> >>>>> Right, what I wanted to mean is STANDBY_LOCK_TIMEOUT. The startup >>>>> process can wake up and do nothing. Thank you for pointing out. >>>> >>>> Okay, understood! Firstly I was thinking that enabling the same type (i.e., STANDBY_LOCK_TIMEOUT) of lock twice doesn'twork properly, but as far as I read the code, it works. In that case, only the shorter timeout would be activatedin enable_timeouts(). So I agree to use STANDBY_LOCK_TIMEOUT. >>> >>> So I renamed the argument "deadlock_timer" in ResolveRecoveryConflictWithLock() >>> because it's not the timer for deadlock and is confusing. Attached is the >>> updated version of the patch. Barring any objection, I will commit this version. >> >> Thanks for the update! >> >> Indeed the naming is more appropriate and less confusing that way, this version looks all good to me. > > Thanks for the review! I'm thinking to wait half a day before commiting > the patch just in the case someone may object the patch. > > BTW, attached is the POC patch that implements the idea discussed upthread; > if log_recovery_conflict_waits is enabled, the startup process reports > the log also after the recovery conflict was resolved and the startup process > finished waiting for it. This patch needs to be applied after > v11-0002-Log-the-standby-recovery-conflict-waits.patch is applied. Attached is the updated version of the patch. This can be applied to current master. With the patch, for example, if the startup process waited longer than deadlock_timeout for the recovery conflict on the lock, the latter log message in the followings would be additionally output. LOG: recovery still waiting after 1001.223 ms: recovery conflict on lock LOG: recovery finished waiting after 19004.694 ms: recovery conflict on lock Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Attachment
At Fri, 8 Jan 2021 01:32:11 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in > > Attached is the updated version of the patch. This can be applied to > current master. > > With the patch, for example, if the startup process waited longer than > deadlock_timeout for the recovery conflict on the lock, the latter log > message in the followings would be additionally output. > > LOG: recovery still waiting after 1001.223 ms: recovery conflict on > lock > LOG: recovery finished waiting after 19004.694 ms: recovery conflict > on lock + /* + * Emit the log message if recovery conflict on buffer pin was resolved but + * the startup process waited longer than deadlock_timeout for it. The first line is beyond the 80th column. LGTM other than the above. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On 2021/01/08 11:17, Kyotaro Horiguchi wrote: > At Fri, 8 Jan 2021 01:32:11 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in >> >> Attached is the updated version of the patch. This can be applied to >> current master. >> >> With the patch, for example, if the startup process waited longer than >> deadlock_timeout for the recovery conflict on the lock, the latter log >> message in the followings would be additionally output. >> >> LOG: recovery still waiting after 1001.223 ms: recovery conflict on >> lock >> LOG: recovery finished waiting after 19004.694 ms: recovery conflict >> on lock > > + /* > + * Emit the log message if recovery conflict on buffer pin was resolved but > + * the startup process waited longer than deadlock_timeout for it. > > The first line is beyond the 80th column. Thanks for pointing out this! This happened because I forgot to run pgindent for bufmgr.c. Attached is the updated version of the patch. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Attachment
Hi, On 1/7/21 4:51 PM, Fujii Masao wrote: > Thanks for the review! I pushed the latest patch. > Thanks all of you for your precious help on this patch! The original idea behind this thread has been split into 3 pieces. Pieces 1 (9d0bd95fa90a7243047a74e29f265296a9fc556d) and 2 (0650ff23038bc3eb8d8fd851744db837d921e285) have now been committed, the last one is to add more information regarding the canceled statements (if any), like: * What was the blocker(s) doing? * When did the blocker(s) started their queries (if any)? * What was the blocker(s) waiting for? on which wait event? Does this proposal sound good to you? If so I'll start a new thread with a patch proposal. Bertrand
Hi, On 1/8/21 5:19 AM, Fujii Masao wrote: > > On 2021/01/08 11:17, Kyotaro Horiguchi wrote: >> At Fri, 8 Jan 2021 01:32:11 +0900, Fujii Masao >> <masao.fujii@oss.nttdata.com> wrote in >>> >>> Attached is the updated version of the patch. This can be applied to >>> current master. >>> >>> With the patch, for example, if the startup process waited longer than >>> deadlock_timeout for the recovery conflict on the lock, the latter log >>> message in the followings would be additionally output. >>> >>> LOG: recovery still waiting after 1001.223 ms: recovery >>> conflict on >>> lock >>> LOG: recovery finished waiting after 19004.694 ms: recovery >>> conflict >>> on lock >> >> + /* >> + * Emit the log message if recovery conflict on >> buffer pin was resolved but >> + * the startup process waited longer than >> deadlock_timeout for it. >> >> The first line is beyond the 80th column. > > Thanks for pointing out this! This happened because I forgot to run > pgindent > for bufmgr.c. Attached is the updated version of the patch. The patch looks good to me. Bertrand
On 2021/01/08 14:02, Drouvot, Bertrand wrote: > Hi, > > On 1/7/21 4:51 PM, Fujii Masao wrote: >> Thanks for the review! I pushed the latest patch. >> > Thanks all of you for your precious help on this patch! > > The original idea behind this thread has been split into 3 pieces. > > Pieces 1 (9d0bd95fa90a7243047a74e29f265296a9fc556d) and 2 (0650ff23038bc3eb8d8fd851744db837d921e285) have now been committed,the last one is to add more information regarding the canceled statements (if any), like: > > * What was the blocker(s) doing? This "canceled statement" is just one that's canceled by recovery conflict? If so, the blocker is always the startup process? Sorry maybe I fail to understand this idea well.. > * When did the blocker(s) started their queries (if any)? If the blocker is the startup process, it doesn't start any query at all? Anyway if you post the patch, I'm happy to review that! Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On 1/8/21 7:24 AM, Fujii Masao wrote: > CAUTION: This email originated from outside of the organization. Do > not click links or open attachments unless you can confirm the sender > and know the content is safe. > > > > On 2021/01/08 14:02, Drouvot, Bertrand wrote: >> Hi, >> >> On 1/7/21 4:51 PM, Fujii Masao wrote: >>> Thanks for the review! I pushed the latest patch. >>> >> Thanks all of you for your precious help on this patch! >> >> The original idea behind this thread has been split into 3 pieces. >> >> Pieces 1 (9d0bd95fa90a7243047a74e29f265296a9fc556d) and 2 >> (0650ff23038bc3eb8d8fd851744db837d921e285) have now been committed, >> the last one is to add more information regarding the canceled >> statements (if any), like: >> >> * What was the blocker(s) doing? > > This "canceled statement" is just one that's canceled by recovery > conflict? > If so, the blocker is always the startup process? Sorry maybe I fail to > understand this idea well.. > > By blocker, I meant the one being canceled (I had in mind the startup process being the blocked one, not the blocker one). Sorry if i have not been clear enough. As an example, it could provide things like: 2020-06-15 06:48:54.778 UTC [7037] LOG: about to interrupt pid: 7037, backend_type: client backend, state: active, wait_event_type: Timeout, wait_event: PgSleep, query_start: 2020-06-15 06:48:13.008427+00 2020-06-15 06:48:54.778 UTC [7037] ERROR: canceling statement due to conflict with recovery > Anyway if you post the patch, I'm happy to review that! Thanks! Bertrand
On Fri, Jan 8, 2021 at 2:15 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > > Hi, > > On 1/8/21 5:19 AM, Fujii Masao wrote: > > > > On 2021/01/08 11:17, Kyotaro Horiguchi wrote: > >> At Fri, 8 Jan 2021 01:32:11 +0900, Fujii Masao > >> <masao.fujii@oss.nttdata.com> wrote in > >>> > >>> Attached is the updated version of the patch. This can be applied to > >>> current master. > >>> > >>> With the patch, for example, if the startup process waited longer than > >>> deadlock_timeout for the recovery conflict on the lock, the latter log > >>> message in the followings would be additionally output. > >>> > >>> LOG: recovery still waiting after 1001.223 ms: recovery > >>> conflict on > >>> lock > >>> LOG: recovery finished waiting after 19004.694 ms: recovery > >>> conflict > >>> on lock > >> > >> + /* > >> + * Emit the log message if recovery conflict on > >> buffer pin was resolved but > >> + * the startup process waited longer than > >> deadlock_timeout for it. > >> > >> The first line is beyond the 80th column. > > > > Thanks for pointing out this! This happened because I forgot to run > > pgindent > > for bufmgr.c. Attached is the updated version of the patch. > > The patch looks good to me. Thanks for the review! I pushed the patch. Regards, -- Fujii Masao
On Fri, Jan 8, 2021 at 3:43 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > > > On 1/8/21 7:24 AM, Fujii Masao wrote: > > CAUTION: This email originated from outside of the organization. Do > > not click links or open attachments unless you can confirm the sender > > and know the content is safe. > > > > > > > > On 2021/01/08 14:02, Drouvot, Bertrand wrote: > >> Hi, > >> > >> On 1/7/21 4:51 PM, Fujii Masao wrote: > >>> Thanks for the review! I pushed the latest patch. > >>> > >> Thanks all of you for your precious help on this patch! > >> > >> The original idea behind this thread has been split into 3 pieces. > >> > >> Pieces 1 (9d0bd95fa90a7243047a74e29f265296a9fc556d) and 2 > >> (0650ff23038bc3eb8d8fd851744db837d921e285) have now been committed, > >> the last one is to add more information regarding the canceled > >> statements (if any), like: > >> > >> * What was the blocker(s) doing? > > > > This "canceled statement" is just one that's canceled by recovery > > conflict? > > If so, the blocker is always the startup process? Sorry maybe I fail to > > understand this idea well.. > > > > > By blocker, I meant the one being canceled (I had in mind the startup > process being the blocked one, not the blocker one). Thanks! Understood. > Sorry if i have not > been clear enough. > > As an example, it could provide things like: > > 2020-06-15 06:48:54.778 UTC [7037] LOG: about to interrupt pid: 7037, > backend_type: client backend, state: active, wait_event_type: Timeout, > wait_event: PgSleep, query_start: 2020-06-15 06:48:13.008427+00 Sorry I'm not sure yet how this information is actually useful. What is the actual use case of this information? Maybe this topic should be discussed in new thread. Regards, -- Fujii Masao