log messages for archive recovery progress - Mailing list pgsql-hackers

From Satoshi Nagayasu / Uptime Technologies, LLC.
Subject log messages for archive recovery progress
Date
Msg-id 4F09AF54.5070408@uptime.jp
Whole thread Raw
Responses Re: log messages for archive recovery progress
List pgsql-hackers
Hi,

When I look into archive recovery deeply as DBA point of view, I found
that it's difficult to know (1) when the recovery process switched
reading WAL segments files from archive directory to pg_xlog directory,
and (2) whether it succeeded applying the latest WAL segments in the
pg_xlog directory.

For example, when I had "47" WAL segment in the archive directory and
"48" WAL segment in the pg_xlog directory, PostgreSQL said:

> [2011-12-08 05:59:03 JST] 9003: LOG:  restored log file "000000080000000000000046" from archive
> [2011-12-08 05:59:03 JST] 9003: LOG:  restored log file "000000080000000000000047" from archive
> cp: cannot stat `/backups/archlog/000000080000000000000048': No such file or directory
> [2011-12-08 05:59:03 JST] 9003: LOG:  record with zero length at 0/489F8B74
> [2011-12-08 05:59:03 JST] 9003: LOG:  redo done at 0/489F8B38
> [2011-12-08 05:59:03 JST] 9003: LOG:  last completed transaction was at log time 2011-12-08 05:52:01.507063+09
> cp: cannot stat `/backups/archlog/00000009.history': No such file or directory
> [2011-12-08 05:59:03 JST] 9003: LOG:  selected new timeline ID: 9
> [2011-12-08 05:59:03 JST] 9003: LOG:  restored log file "00000008.history" from archive
> [2011-12-08 05:59:04 JST] 9003: LOG:  archive recovery complete

I felt it's difficult to determine whether PostgreSQL applied "48" WAL
segment in the pg_xlog, or not.

So, I want to propose new log messages to show archive log progress as
reading WAL segments.

With applying my tiny modification, the recovery process reports its
progress for each WAL segment file, and also tells switching reading
archive directory to pg_xlog directory explicitly as shown below.

> [2011-12-08 15:14:36 JST] 16758: LOG:  restored log file "000000080000000000000046" from archive
> [2011-12-08 15:14:36 JST] 16758: LOG:  recoverying 000000080000000000000046
> [2011-12-08 15:14:36 JST] 16758: LOG:  restored log file "000000080000000000000047" from archive
> [2011-12-08 15:14:36 JST] 16758: LOG:  recoverying 000000080000000000000047
> cp: cannot stat `/backups/archlog/000000080000000000000048': No such file or directory
> [2011-12-08 15:14:37 JST] 16758: LOG:  could not restore file "000000080000000000000048" from archive
> [2011-12-08 15:14:37 JST] 16758: LOG:  attempting to look into pg_xlog
> [2011-12-08 15:14:37 JST] 16758: LOG:  recoverying 000000080000000000000048
> [2011-12-08 15:14:37 JST] 16758: LOG:  record with zero length at 0/489F8B74
> [2011-12-08 15:14:37 JST] 16758: LOG:  redo done at 0/489F8B38
> [2011-12-08 15:14:37 JST] 16758: LOG:  last completed transaction was at log time 2011-12-08 05:52:01.507063+09
> cp: cannot stat `/backups/archlog/00000009.history': No such file or directory
> [2011-12-08 15:14:37 JST] 16758: LOG:  selected new timeline ID: 9
> [2011-12-08 15:14:37 JST] 16758: LOG:  restored log file "00000008.history" from archive
> [2011-12-08 15:14:38 JST] 16758: LOG:  archive recovery complete

Do you think this is useful?

Thanks,
-- 
Satoshi Nagayasu <snaga@uptime.jp>
Uptime Technologies, LLC. http://www.uptime.jp


pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: CLOG contention, part 2
Next
From: Kohei KaiGai
Date:
Subject: Re: [v9.2] Fix Leaky View Problem