Re: Race condition in recovery? - Mailing list pgsql-hackers
From | Tatsuro Yamada |
---|---|
Subject | Re: Race condition in recovery? |
Date | |
Msg-id | 4698027d-5c0d-098f-9a8e-8cf09e36a555@nttcom.co.jp_1 |
In response to | Re: Race condition in recovery? (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) |
Responses | Duplicate history file?; Re: Race condition in recovery? |
List | pgsql-hackers |
Hi Horiguchi-san,

> (Why me?)

Because the story was also related to PG-REX, which you are also
involved in developing. Perhaps off-list instead of -hackers would
have been better, but I emailed -hackers because the same problem
could be encountered by PostgreSQL users who do not use PG-REX.

>> In a project I helped with, I encountered an issue where
>> the archive command kept failing. I thought this issue was
>> related to the problem in this thread, so I'm sharing it here.
>> If I should create a new thread, please let me know.
>>
>> * Problem
>>   - The archive_command is failed always.
>
> Although I think the configuration is a kind of broken, it can be seen
> as it is mimicking the case of shared-archive, where primary and
> standby share the same archive directory.

To be precise, the environment of this reproduction script is
different from our actual environment. I tried to make it as simple
as possible to reproduce the problem.
(In order to make it look like the actual environment, you have to
build a PG-REX environment.)

A simple replication environment might be enough, so I'll try to
recreate a script that is closer to the actual environment later.

> Basically we need to use an archive command like the following for
> that case to avoid this kind of failure. The script returns "success"
> when the target file is found but identical with the source file. I
> don't find such a description in the documentation, and haven't
> bothered digging into the mailing-list archive.
>
> ==
> #! /bin/bash
>
> if [ -f $2 ]; then
>     cmp -s $1 $2
>     if [ $? != 0 ]; then
>         exit 1
>     fi
>     exit 0
> fi
>
> cp $1 $2
> ==

Thanks for your reply.
Since the above behavior is different from the behavior of the
test command in the following example in postgresql.conf, I think
we should write a note about this example.

# e.g. 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'

Let me describe the problem we faced.
- When archive_mode=always, archive_command is (sometimes) executed
  in a situation where the history file already exists on the
  standby side.
- In this case, if "test ! -f" is written in the archive_command of
  postgresql.conf on the standby side, the command will keep failing.

Note that this problem does not occur when archive_mode=on.

So, what should we do for the user? I think we should put some notes
in postgresql.conf or in the documentation. For example, something
like this:

====
Note: If you use archive_mode=always, the archive_command on the
standby side should not use "test ! -f".
====

Regards,
Tatsuro Yamada
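
As an illustration of the failure described above, the following is a
minimal sketch. It is not taken from the PG-REX environment or from the
reproduction script. The temporary directory layout, the file name
00000002.history, and the function names archive_strict and
archive_tolerant are all hypothetical. It runs the postgresql.conf
sample command twice against the same file, then runs the cmp-based
variant quoted earlier in this message.

```shell
#!/bin/sh
# Hypothetical demo: all paths and file names are made up for illustration.
workdir=$(mktemp -d)
archive="$workdir/archive"
mkdir -p "$archive"

src="$workdir/00000002.history"
dst="$archive/00000002.history"
printf 'dummy history contents\n' > "$src"

# Variant 1: same shape as the postgresql.conf sample.
# Fails whenever the target file already exists.
archive_strict() {
    test ! -f "$2" && cp "$1" "$2"
}

# Variant 2: same logic as the quoted script.
# Succeeds if an identical copy is already archived,
# fails if the existing copy differs.
archive_tolerant() {
    if [ -f "$2" ]; then
        cmp -s "$1" "$2"
        return $?
    fi
    cp "$1" "$2"
}

# Record each exit status without aborting on failure.
first=0;  archive_strict   "$src" "$dst" || first=$?
second=0; archive_strict   "$src" "$dst" || second=$?
third=0;  archive_tolerant "$src" "$dst" || third=$?

echo "strict, first attempt:       $first"
echo "strict, file already there:  $second"
echo "tolerant, file already there: $third"

rm -rf "$workdir"
```

In this sketch the first attempt succeeds, the second attempt with
"test ! -f" reports a non-zero status because the file is already in
the archive, and the cmp-based variant reports success for the same
file. If the sketch matches the real situation, that non-zero status is
what would make the archiver keep retrying.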