Thread: Duplicate history file?

Duplicate history file?

From: Kyotaro Horiguchi
So, this is the new new thread.

This thread should have been started here:

https://www.postgresql.org/message-id/20210531.165825.921389284096975508.horikyota.ntt%40gmail.com

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Duplicate history file?

From: Tatsuro Yamada
Hi Horiguchi-san,

> This thread should have been started here:
> 
> https://www.postgresql.org/message-id/20210531.165825.921389284096975508.horikyota.ntt%40gmail.com
>>
>> (To recap: In a replication setup using archiving, the startup
>> process tries to restore WAL files from the archive before checking
>> the pg_wal directory for the desired file.  That behavior itself is
>> intentional and reasonable. However, the restore code notifies the
>> archiver of a restored file regardless of whether it has already
>> been archived.  If archive_command is written to return an error on
>> overwrite, as we suggest in the documentation, that behavior causes
>> archive failures.)
>>
>> After playing with this, I can see the problem just by restarting a
>> standby, even in a simple archive-replication setup with ordinary
>> prerequisites.  So I think this is worth fixing.
>>
>> With this patch, KeepFileRestoredFromArchive compares the contents
>> of the just-restored file and the existing file for the same segment
>> only when:
>>
>>       - archive_mode = always
>>   and - the file to restore already exists in pg_wal
>>   and - it has a .done and/or .ready status file.
>>
>> which doesn't usually happen.  Then the function skips archive
>> notification if the contents are identical.  The included TAP test
>> works on both Linux and Windows.
> 
> 
> Thank you for the analysis and the patch.
> I'll try the patch tomorrow.
> 
> I just noticed that this thread is still tied to another thread
> (it's not an independent thread). To fix that, it may be better to
> create a new thread again. 


I've tried your patch. Unfortunately, it didn't seem to fix the
problem reproduced by the script I sent.

I understand that, as Stephen says, the test and cp commands have
problems and should not be used in archive commands. Maybe this is not
a big problem for the community.
Nevertheless, even if we do not improve the feature, I think it would
be good to state explicitly in the documentation, for the benefit of
new users, that archiving may fail under certain conditions.

I'd like to hear the opinions of experts on the archive command.

P.S.
My customer's problem has already been solved, so it's ok. I've
emailed -hackers with the aim of preventing users from encountering
the same problem.


Regards,
Tatsuro Yamada





Re: Duplicate history file?

From: Kyotaro Horiguchi
On 2021/06/08 18:19, Tatsuro Yamada wrote:
> I've tried your patch. Unfortunately, it didn't seem to have any good
> effect on the script I sent to reproduce the problem.

Oops! The patch forgot about history files.

I checked the attached with your repro script and it works fine.

> I understand that, as Stefan says, the test and cp commands have
> problems and should not be used for archive commands. Maybe this is not
> a big problem for the community.
> Nevertheless, even if we do not improve the feature, I think it is a
> good idea to explicitly state in the documentation that archiving may
> fail under certain conditions for new users.
>
> I'd like to hear the opinions of experts on the archive command.
>
> P.S.
> My customer's problem has already been solved, so it's ok. I've
> emailed -hackers with the aim of preventing users from encountering
> the same problem.
>
I understand that.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Duplicate history file?

From: Tatsuro Yamada
Hi Horiguchi-san,

On 2021/06/09 11:47, Kyotaro Horiguchi wrote:
> On 2021/06/08 18:19, Tatsuro Yamada wrote:
>> I've tried your patch. Unfortunately, it didn't seem to have any good
>> effect on the script I sent to reproduce the problem.
> 
> Oops! The patch forgot about history files.
> 
> I checked the attached with your repro script and it works fine.


Thank you for fixing the patch.
The new patch works well in my environment. :-D


Regards,
Tatsuro Yamada




Re: Duplicate history file?

From: Tatsuro Yamada
Hi

> Thank you for fixing the patch.
> The new patch works well in my environment. :-D

This may not be important at this stage since it is a
PoC patch, but I would like to let you know that there
is a line with trailing whitespace:

$ git diff --check
src/backend/access/transam/xlogarchive.c:465: trailing whitespace.
+

Regards,
Tatsuro Yamada




Re: Duplicate history file?

From: Fujii Masao

On 2021/06/09 15:58, Tatsuro Yamada wrote:
> Hi
> 
>> Thank you for fixing the patch.
>> The new patch works well in my environment. :-D
> 
> This may not be important at this time since it is a
> PoC patch, but I would like to inform you that there
> was a line that contained multiple spaces instead of tabs.
> 
> $ git diff --check
> src/backend/access/transam/xlogarchive.c:465: trailing whitespace.
> +

Even with the patch, if "test ! -f ..." is used in archive_command,
couldn't you still *easily* get into the situation where WAL archiving
keeps failing?

Instead, shouldn't we consider and document a "better" command for
archive_command, or implement something like a pg_archivecopy command
in core (as far as I remember, there was a discussion about such a
feature before...)?
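
(For reference, the Unix example from the documentation that this
discussion is about is:

  archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'

i.e. it fails whenever the target file already exists, even if the
contents are identical.)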

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



Re: Duplicate history file?

From: Tatsuro Yamada
Hi,

On 2021/06/09 16:23, Fujii Masao wrote:
> On 2021/06/09 15:58, Tatsuro Yamada wrote:
>> This may not be important at this time since it is a
>> PoC patch, but I would like to inform you that there
>> was a line that contained multiple spaces instead of tabs.
>>
>> $ git diff --check
>> src/backend/access/transam/xlogarchive.c:465: trailing whitespace.
>> +
> 
> Even with the patch, if "test ! -f ..." is used in archive_command,
> you may still *easily* get the trouble that WAL archiving keeps failing?

Thanks for your comment.

Yes, it may solve the error when using the test command, but it is
still dangerous to keep using the cp command, which is listed as an
example archive command.

  
> Instead, we should consider and document "better" command for
> archive_command, or implement something like pg_archivecopy command
> into the core (as far as I remember, there was the discussion about
> this feature before...)?


I agree with that idea.
Since archiving is important for all users, I think there should be
either a better and safer command in the documentation, or an archive
command (pg_archivecopy?) that we provide as a community, as you said.
I am curious about the conclusions of past discussions. :)


Regards,
Tatsuro Yamada





Re: Duplicate history file?

From: Kyotaro Horiguchi
At Wed, 09 Jun 2021 16:56:14 +0900, Tatsuro Yamada <tatsuro.yamada.tf@nttcom.co.jp> wrote in 
> Hi,
> 
> On 2021/06/09 16:23, Fujii Masao wrote:
> > On 2021/06/09 15:58, Tatsuro Yamada wrote:
> >> This may not be important at this time since it is a
> >> PoC patch, but I would like to inform you that there
> >> was a line that contained multiple spaces instead of tabs.
> >>
> >> $ git diff --check
> >> src/backend/access/transam/xlogarchive.c:465: trailing whitespace.
> >> +
> > Even with the patch, if "test ! -f ..." is used in archive_command,
> > you may still *easily* get the trouble that WAL archiving keeps
> > failing?

I'm not sure, but with regard to the cause that the patch addresses:
if an already-archived file is recycled or deleted and then the same
file is restored from the archive, that could happen. But the WAL
segment that contains the latest checkpoint won't be deleted. The same
can be said of history files.

> Thanks for your comment.
> 
> Yes, it may solve the error when using the test command, but it is
> dangerous to continue using the cp command, which is listed as an
> example of an archive command.

"test" command?

At first I thought that the archive command would need to compare the
whole file contents *every* time, but that happens with the same
frequency as the patch's whole-file comparison.

> > Instead, we should consider and document "better" command for
> > archive_command, or implement something like pg_archivecopy command
> > into the core (as far as I remember, there was the discussion about
> > this feature before...)?
> 
> 
> I agree with that idea.
> Since archiving is important for all users, I think there should be
> either a better and safer command in the documentation, or an archive
> command (pg_archivecopy?) that we provide as a community, as you said.
> I am curious about the conclusions of past discussions. :)

How perfect does the officially-provided script or command need to
be?  The reason the script in the documentation is so simple is, I
guess, that we don't/can't offer steps sufficiently solid for every
situation.

Since we hadn't noticed that the "test ! -f" does harm, it has stayed
there, but we finally need to remove it.  Instead, we need to write
down the known significant requirements in words. I'm afraid a
concrete script would be a bit too complex for the documentation..

So what we can do is:

 - Remove the "test ! -f" from the sample command (for *nixen).

 - Rewrite at least the following portion in the documentation. [1]

   > The archive command should generally be designed to refuse to
   > overwrite any pre-existing archive file. This is an important
   > safety feature to preserve the integrity of your archive in case
   > of administrator error (such as sending the output of two
   > different servers to the same archive directory).
   > 
   > It is advisable to test your proposed archive command to ensure
   > that it indeed does not overwrite an existing file, and that it
   > returns nonzero status in this case. The example command above
   > for Unix ensures this by including a separate test step. On some
   > Unix platforms, cp has switches such as -i that can be used to do
   > the same thing less verbosely, but you should not rely on these
   > without verifying that the right exit status is returned. (In
   > particular, GNU cp will return status zero when -i is used and
   > the target file already exists, which is not the desired
   > behavior.)

The replacement would be something like:

"There is a case where WAL file and timeline history files is archived
more than once.  The archive command should generally be designed to
refuse to replace any pre-existing archive file with a file with
different content but to return zero if the file to be archived is
identical with the preexisting file."

But I'm not sure how that would look.. (even ignoring the broken
phrasing..)
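
As a rough, untested sketch of a command with that behavior (ignoring
atomicity and fsync, and borrowing the documentation's placeholder
directory):

  test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f || cmp -s %p /mnt/server/archivedir/%f

This copies the file when it is absent, returns zero when an identical
copy already exists, and returns non-zero when a different file with
the same name has already been archived.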
 

1: https://www.postgresql.org/docs/11/continuous-archiving.html

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Duplicate history file?

From: Kyotaro Horiguchi
At Thu, 10 Jun 2021 10:00:21 -0400, Stephen Frost <sfrost@snowman.net> wrote in 
> Greetings,
> 
> * Kyotaro Horiguchi (horikyota.ntt@gmail.com) wrote:
> > At Wed, 09 Jun 2021 16:56:14 +0900, Tatsuro Yamada <tatsuro.yamada.tf@nttcom.co.jp> wrote in 
> > > On 2021/06/09 16:23, Fujii Masao wrote:
> > > > Instead, we should consider and document "better" command for
> > > > archive_command, or implement something like pg_archivecopy command
> > > > into the core (as far as I remember, there was the discussion about
> > > > this feature before...)?
> > > 
> > > I agree with that idea.
> > > Since archiving is important for all users, I think there should be
> > > either a better and safer command in the documentation, or an archive
> > > command (pg_archivecopy?) that we provide as a community, as you said.
> > > I am curious about the conclusions of past discussions. :)
> > 
> > How perfect the officially-provided script or command need to be?  The
> > reason that the script in the documentation is so simple is, I guess,
> > we don't/can't offer a steps sufficiently solid for all-directions.
> > 
> > Since we didn't noticed that the "test ! -f" harms so it has been
> > there but finally we need to remove it.  Instead, we need to write
> > doen the known significant requirements by words. I'm afraid that the
> > concrete script would be a bit complex for the documentation..
> 
> We don't have any 'officially-provided' tool for archive command.

I meant the "test ! -f .." by the "officially-provided script".  The
fact we show it in the documentation (without a caveart) means that
the script at least doesn't break the server behavior that is running
normally including promotion.

> > So what we can do that is:
> > 
> >  - Remove the "test ! -f" from the sample command (for *nixen).
> 
> ... or just remove the example entirely.  It really doesn't do anything
> good for us, in my view.

Yeah, I feel the same. But that also means the usage instructions for
the replacement parameters disappear from our documentation. The least
problematic example in that regard is just "cp .." without the "test".

> > The replacement would be something like:
> > 
> > "There is a case where WAL file and timeline history files is archived
> > more than once.  The archive command should generally be designed to
> > refuse to replace any pre-existing archive file with a file with
> > different content but to return zero if the file to be archived is
> > identical with the preexisting file."
> > 
> > But I'm not sure how it looks like.. (even ignoring the broken
> > phrasing..)
> 
> There is so much more that we should be including here, like "you should

Mmm. Yeah, I can understand your sentiment maybe completely.

> make sure your archive command will reliably sync the WAL file to disk
> before returning success to PG, since PG will feel free to immediately
> remove the WAL file once archive command has returned successfully", and
> "the archive command should check that there exists a .history file for
> any timeline after timeline 1 in the repo for the WAL file that's being
> archived" and "the archive command should allow the exist, binary
> identical, WAL file to be archived multiple times without error, but
> should error if a new WAL file is archived which would overwrite a
> binary distinct WAL file in the repo", and "the archive command should
> check the WAL header to make sure that the WAL file matches the cluster
> in the corresponding backup repo", and "whatever is expiring the WAL
> files after they've been archived should make sure to not expire out any
> WAL that is needed for any of the backups that remain", and "oh, by the
> way, depending on the exit code of the command, PG may consider the
> failure to be something which can be retried, or not", and other things
> that I can't think of off the top of my head right now.
> I have to say that it gets to a point where it feels like we're trying
> to document everything about writing a C extension to PG using the
> hooks which we make available.  We've generally agreed that folks should
> be looking at the source code if they're writing a serious C extension
> and it's certainly the case that, in writing a serious archive command
> and backup tool, getting into the PG source code has been routinely
> necessary.

Nevertheless, I agree with that; still, don't we need a minimum
workable setup as the first step? Something like below.

===
The following is an example of a minimal archive_command.

Example: cp %p /blah/%f

However, it is far from perfect. What follows is a discussion of what
is needed for archive_command to be more reliable.

<the long list of the requirements>
===

Anyway it doesn't seem to be the time to do that, but now that we
know that there's a case where the current example doesn't prevent PG
from working correctly, we cannot use the "test ! -f" example and
cannot suggest "do not overwrite existing archived files" without a
caveat. At least don't we need to *fix* those parts for now?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Duplicate history file?

From: Kyotaro Horiguchi
Recently my brain is always twisted..

At Fri, 11 Jun 2021 11:25:51 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> Anyway it doesn't seem to be the time to do that, but now that we
- know that there's a case where the current example doesn't prevent PG
+ know that there's a case where the current example prevents PG
> from working correctly, we cannot use the "test ! -f" example and
> cannot suggest "do not overwrite existing archived files" without a
> caveat. At least don't we need to *fix* those parts for now?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Duplicate history file?

From: Julien Rouhaud
On Fri, Jun 11, 2021 at 11:25:51AM +0900, Kyotaro Horiguchi wrote:
> 
> Nevertheless I agree to it, still don't we need a minimum workable
> setup as the first step? Something like below.
> 
> ===
> The following is an example of the minimal archive_command.
> 
> Example: cp %p /blah/%f
> 
> However, it is far from perfect. The following is the discussion about
> what is needed for archive_command to be more reliable.
> 
> <the long list of the requirements>
> ====

"far from perfect" is a strong understatement for "appears to work but will
randomly and silently breaks everything without a simple way to detect it".

We should document a minimum workable setup, but cp isn't an example of that,
and I don't think that there will be a simple example unless we provide a
dedicated utility.

It could however be something along those lines:

Example: archive_program %p /path/to/%d

archive_program being a script ensuring that all those requirements are met:
<the long list of the requirements>
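
For illustration only, here is an untested sketch of what the core of
such a hypothetical script could look like (it assumes GNU coreutils
for "sync <file>", and still omits several requirements, such as the
timeline history checks discussed later):

  #!/bin/sh
  # Hypothetical archive_program sketch: invoked as
  #   archive_program <path-of-wal-file> <destination-path>
  src="$1"; dst="$2"
  if [ -f "$dst" ]; then
      # Re-archiving an identical file succeeds; different content fails.
      cmp -s "$src" "$dst" && exit 0
      echo "archive_program: $dst exists with different content" >&2
      exit 1
  fi
  # Copy to a temporary name, flush it to disk, then rename it into
  # place so that readers never see a half-written file.
  cp "$src" "$dst.tmp" &&
      sync "$dst.tmp" &&
      mv "$dst.tmp" "$dst" &&
      sync "$(dirname "$dst")"   # flush the rename itself
  exit $?

It could then be invoked as, e.g., archive_command = 'archive_program
%p /mnt/server/archivedir/%f'.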



Re: Duplicate history file?

From: Kyotaro Horiguchi
At Fri, 11 Jun 2021 10:48:32 +0800, Julien Rouhaud <rjuju123@gmail.com> wrote in 
> On Fri, Jun 11, 2021 at 11:25:51AM +0900, Kyotaro Horiguchi wrote:
> > 
> > Nevertheless I agree to it, still don't we need a minimum workable
> > setup as the first step? Something like below.
> > 
> > ===
> > The following is an example of the minimal archive_command.
> > 
> > Example: cp %p /blah/%f
> > 
> > However, it is far from perfect. The following is the discussion about
> > what is needed for archive_command to be more reliable.
> > 
> > <the long list of the requirements>
> > ====
> 
> "far from perfect" is a strong understatement for "appears to work but will
> randomly and silently breaks everything without a simple way to detect it".

I think that's overstating it. It sounds like a story from a
mission-critical world.  How perfect archive_command should be depends
on the requirements of each system.  A simple cp is actually
sufficient for a certain range of usages, maybe.

> We should document a minimum workable setup, but cp isn't an example of that,
> and I don't think that there will be a simple example unless we provide a
> dedicated utility.

It sounds somewhat strange, like "Well, you need a special track to
drive your car on; however, we don't have one. It's your responsibility
to construct a track that protects it from accidents perfectly.".

"Yeah, I'm not going to push it so hard, and I don't care if it gets
some small scratches; couldn't I drive it on a public road?"

(Sorry for the bad analogy).

I think cp can be an example as long as we explain the limitations. (On
the other hand, "test ! -f" cannot, since it actually prevents the
server from working correctly.)

> It could however be something along those lines:
> 
> Example: archive_program %p /path/to/%d
> 
> archive_program being a script ensuring that all those requirements are met:
> <the long list of the requirements>

Isn't that almost saying that anything less than pgBackRest doesn't
qualify as an archive_program?


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Duplicate history file?

From: Michael Paquier
On Fri, Jun 11, 2021 at 03:32:28PM +0900, Kyotaro Horiguchi wrote:
> At Fri, 11 Jun 2021 10:48:32 +0800, Julien Rouhaud <rjuju123@gmail.com> wrote in
>> "far from perfect" is a strong understatement for "appears to work but will
>> randomly and silently breaks everything without a simple way to detect it".

Yeah.  Users like unplugging their hosts, because that's *fast* and
easy to do.

> I think it's overstating. It sounds like a story of a mission critical
> world.  How perfect archive_command should be depends on the
> requirements of every system.  Simple cp is actually sufficient in
> certain log? range of usages, maybe.
>
>> We should document a minimum workable setup, but cp isn't an example of that,
>> and I don't think that there will be a simple example unless we provide a
>> dedicated utility.
>
> I think cp can be an example as far as we explain the limitations. (On
> the other hand "test !-f" cannot since it actually prevents server
> from working correctly.)

Disagreed.  I think that we should not try to change this area until
we can document a reliable solution, and a simple "cp" is not that.
Hmm.  A simple command that could be used as a reference is, for
example, "dd", which can flush the file by itself; or we could just
revisit the discussions about having a pg_copy command; or we could
document a small utility in Perl that does the job.
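
For illustration, an untested sketch of a dd-based command along those
lines (using the documentation's placeholder directory) could be:

  archive_command = 'dd if=%p of=/mnt/server/archivedir/%f bs=1M conv=fsync'

where conv=fsync makes GNU dd flush the output file data to disk before
exiting.  Note that this sketch only covers the flushing part; it
silently overwrites an existing file, so the overwrite handling
discussed above would still be needed.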
--
Michael


Re: Duplicate history file?

From: Julien Rouhaud
On Fri, Jun 11, 2021 at 03:32:28PM +0900, Kyotaro Horiguchi wrote:
> At Fri, 11 Jun 2021 10:48:32 +0800, Julien Rouhaud <rjuju123@gmail.com> wrote in 
> > 
> > "far from perfect" is a strong understatement for "appears to work but will
> > randomly and silently breaks everything without a simple way to detect it".
> 
> I think it's overstating. It sounds like a story of a mission critical
> world.  How perfect archive_command should be depends on the
> requirements of every system.  Simple cp is actualy sufficient in
> certain log? range of usages, maybe.

I disagree, cp is probably the worst command that can be used for this
purpose.  On top of the problems already mentioned, you also have the
fact that the copy isn't atomic.  That means any concurrent
restore_command (or anything else that consumes the archived files)
will happily process a half-copied WAL file, and in case of any error
during the copy you end up with a file for which you don't know whether
it contains valid data or not.  I don't see any case where you would
actually want to use it, unless maybe you want to benchmark how long it
takes before you lose some data.
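
For illustration, the usual way around the visibility part of that
problem is to copy to a temporary name on the same filesystem and
rename the file into place only once the copy has completed, e.g. (an
untested sketch):

  archive_command = 'cp %p /mnt/server/archivedir/%f.tmp && mv /mnt/server/archivedir/%f.tmp /mnt/server/archivedir/%f'

Since rename is atomic within a filesystem, a concurrent reader never
sees a partial file.  Note that this addresses only atomicity, not
durability: nothing here flushes the data to disk.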

> > We should document a minimum workable setup, but cp isn't an example of that,
> > and I don't think that there will be a simple example unless we provide a
> > dedicated utility.
> 
> It looks somewhat strange like "Well, you need a special track to
> drive your car, however, we don't have one. It's your responsibility
> to construct a track that protects it from accidents perfectly.".
> 
> "Yeah, I'm not going to push it so hard and don't care it gets some
> small scratches, couldn't I drive it on a public road?"
> 
> (Sorry for the bad analogy).

I think that a better analogy would be "I don't need working brakes on
my car since I only drive on the highway and there aren't any red
lights there".

> Isn't it almost saying that anything less than pgBackRest isn't
> qualified as archive_program?

I don't know; I'm assuming that barman also provides one, as do wal-e
and wal-g (assuming that the remote storage providers do their part of
the job correctly).  Maybe there are other tools too.  But as long as
we don't document what exactly the requirements are, it's not really a
surprise that most people don't implement them.



Re: Duplicate history file?

From: Kyotaro Horiguchi
At Fri, 11 Jun 2021 15:18:03 +0800, Julien Rouhaud <rjuju123@gmail.com> wrote in 
> On Fri, Jun 11, 2021 at 03:32:28PM +0900, Kyotaro Horiguchi wrote:
> I disagree, cp is probably the worst command that can be used for this purpose.
> On top on the previous problems already mentioned, you also have the fact that
> the copy isn't atomic.  It means that any concurrent restore_command (or
> anything that would consume the archived files) will happily process a half
> copied WAL file, and in case of any error during the copy you end up with a
> file for which you don't know if it contains valid data or not.  I don't see
> any case where you would actually want to use that, unless maybe if you want to
> benchmark how long it takes before you lose some data.

Actually there's large room for losing data with cp. Yes, we would
need additional redundancy in storage, periodic integrity inspection
of the storage and archives, and maybe copies at other sites on the
other side of the Earth. But those are too much for some kinds of
users.  They have the right and the responsibility to decide how
durable/reliable their archive needs to be.  (Putting aside some
hardware/geological requirements :p) If we mandate certain
characteristics of the archive_command, we should take them into core.
I remember seeing some discussions on archive_command along this line,
but I think they ended at the point of something like "we cannot
design a one-size-fits-all interface conforming to the requirements"
(sorry, I don't remember the details..)

> I don't know, I'm assuming that barman also provides one, such as wal-e and
> wal-g (assuming that the distant providers do their part of the job correctly).

Well, barman used rsync/ssh in its documentation in the past and now
looks like it provides barman-wal-archive, so it seems that you're
right on that point.  So, do we recommend them in our documentation?
(I'm not sure they actually conform to the requirements, though..)

> Maybe there are other tools too.  But as long as we don't document what exactly
> are the requirements, it's not really a surprise that most people don't
> implement them.

I strongly agree with describing the requirements.

My point is that if all of them are really mandatory, it is mandatory
for us to officially provide, or at least recommend, a minimal
implementation that covers all of them.  If we recommended some
external tools, that would mean we ensure that those tools fulfill the
requirements.

If we write an example with a pseudo tool name, requiring certain
characteristics of the tool, but don't point to the minimal tools,
I think that is equivalent to inhibiting certain users from using
archive_command even if they really don't need such a level of
durability.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Duplicate history file?

From: Kyotaro Horiguchi
At Fri, 11 Jun 2021 16:08:33 +0900, Michael Paquier <michael@paquier.xyz> wrote in 
> On Fri, Jun 11, 2021 at 03:32:28PM +0900, Kyotaro Horiguchi wrote:
> > I think cp can be an example as far as we explain the limitations. (On
> > the other hand "test !-f" cannot since it actually prevents server
> > from working correctly.)
> 
> Disagreed.  I think that we should not try to change this area until
> we can document a reliable solution, and a simple "cp" is not that.

Isn't removing cp from the documentation a change in this area? I
basically agree with not changing anything, but the current example
"test ! -f <fn> && cp .." and the relevant description have been shown
to be problematic in a certain situation.

- Do we leave it alone, ignoring the possible problem?

- Just add a description of "the problem"?

- Just remove "test ! -f" and the relevant description?

- Remove "test ! -f" and rewrite the relevant description?

(- or keep "test ! -f" and rewrite the relevant description?)

- Write the full (known) requirements and use a pseudo tool-name in
  the example?

  - provide a minimal implementation of the command?

  - recommend some external tools (that we can guarantee conform to
    the requirements)?

  - not recommend any tools?

> Hmm.  A simple command that could be used as reference is for example
> "dd" that flushes the file by itself, or we could just revisit the
> discussions about having a pg_copy command, or we could document a
> small utility in perl that does the job.

I think we should do that if pg_copy conforms to the mandatory
requirements, but maybe that's for the future.  Showing a minimal
implementation in Perl looks good.

However, my main point here is "what should we do for now?", not an
ideal solution.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Duplicate history file?

From: Julien Rouhaud
On Tue, Jun 15, 2021 at 10:20:37AM +0900, Kyotaro Horiguchi wrote:
> 
> Actually there's large room for losing data with cp. Yes, we would
> need additional redundancy of storage and periodical integrity
> inspection of the storage and archives on maybe need copies at the
> other sites on the other side of the Earth. But they are too-much for
> some kind of users.  They have the right and responsibility to decide
> how durable/reliable their archive needs to be.  (Putting aside some
> hardware/geological requirements :p)

Note that most of those considerations are orthogonal to what a proper
archive_command should be responsible for.

Yes, users are responsible for deciding whether they want valid and
durable backups or not, but we should assume a sensible default
behavior, which is a valid and durable archive_command.  We don't
document a default of fsync = off with a later recommendation
explaining why you shouldn't do that, and I think it should be the same
for archive_command.  The problem with the current documentation is
that many users will just blindly copy/paste whatever is in the
documentation without reading any further.

As an example, a few hours ago some French user on a French bulletin
board said that he fixed his "postmaster.pid already exists" error with
pg_resetxlog -f, referring to some article explaining how to start
postgres in case of "PANIC: could not locate a valid checkpoint
record".  Arguably that article didn't bother to document the
implications of executing pg_resetxlog, but given that the user's
original problem had literally nothing to do with what was documented,
I really doubt that it would have changed anything.

> If we mandate some
> characteristics on the archive_command, we should take them into core.

I agree.

> I remember I saw some discussions on archive command on this line but
> I think it had ended at the point something like that "we cannot
> design one-fits-all interface comforming the requirements" or
> something (sorry, I don't remember in its detail..)

I also agree, but this problem is solved by making archive_command
customisable.  Providing something that can reliably work in some general and
limited cases would be a huge improvement.

> Well. rman used rsync/ssh in its documentation in the past and now
> looks like providing barman-wal-archive so it seems that you're right
> in that point.  So, do we recommend them in our documentation? (I'm
> not sure they are actually comform the requirement, though..)

We could maybe bless some third party backup solutions, but this will probably
lead to a lot more discussions, so it's better to discuss that in a different
thread.  Note that I don't have a deep knowledge of any of those tools so I
don't have an opinion.

> If we write an example with a pseudo tool name, requiring some
> characteristics on the tool, then not telling about the minimal tools,
> I think that it is equivalent to that we are inhibiting certain users
> from using archive_command even if they really don't want such level
> of durability.

I have already seen customers complaining about losing backups because
their archive_command didn't ensure that the copy was durable.  Some
users may not care about losing their backups in such a case, but I
really think that the majority of users expect a backup to be valid,
durable and everything else without even thinking that it may not be
the case.  It should be the default behavior, not optional.



Re: Duplicate history file?

From: Stephen Frost
Greetings,

* Kyotaro Horiguchi (horikyota.ntt@gmail.com) wrote:
> At Fri, 11 Jun 2021 16:08:33 +0900, Michael Paquier <michael@paquier.xyz> wrote in
> > On Fri, Jun 11, 2021 at 03:32:28PM +0900, Kyotaro Horiguchi wrote:
> > > I think cp can be an example as far as we explain the limitations. (On
> > > the other hand "test !-f" cannot since it actually prevents server
> > > from working correctly.)
> >
> > Disagreed.  I think that we should not try to change this area until
> > we can document a reliable solution, and a simple "cp" is not that.
>
> Isn't removing cp from the documentation a change in this area? I
> basically agree to not to change anything but the current example
> "test ! -f <fn> && cp .." and relevant description has been known to
> be problematic in a certain situation.

[...]

> - Write the full (known) requirements and use a pseudo tool-name in
>  the example?

I'm generally in favor of just using a pseudo tool-name and then perhaps
providing a link to a new place on .Org where people can ask to have
their PG backup solution listed, or something along those lines.

>  - provide a minimal implement of the command?

Having been down this road for a rather long time, I can't accept this
as a serious suggestion.  No, not even with Perl.  Been there, done
that, not going back.

>  - recommend some external tools (that we can guarantee that they
>    comform the requriements)?

The requirements are things which are learned over years and change
over time.  Trying to document them and keep up with them would be a
pretty serious project all on its own.  There are external projects
that spend serious time and energy doing their best to provide the
tooling needed here, and we should be promoting those, not trying to
pretend that this is a simple thing which anyone could write a short
Perl script to accomplish.

>  - not recommend any tools?

This is the approach that has been tried and it's, objectively, failed
miserably.  Our users are ending up with invalid and unusable backups,
corrupted WAL segments, inability to use PITR, and various other issues
because we've been trying to pretend that this isn't a hard problem.  We
really need to stop that and accept that it's hard and promote the tools
which have been explicitly written to address that hard problem.

> > Hmm.  A simple command that could be used as reference is for example
> > "dd" that flushes the file by itself, or we could just revisit the
> > discussions about having a pg_copy command, or we could document a
> > small utility in perl that does the job.
>
> I think we should do that if pg_copy comforms the mandatory
> requirements but maybe it's in the future. Showing the minimal
> implement in perl looks good.

Already tried doing it in perl.  No, it's not simple and it's also
entirely vaporware today and implies that we're going to develop this
tool, improve it in the future as we realize it needs to be improved,
and maintain it as part of core forever.  If we want to actually adopt
and pull in a backup tool to be part of core then we should talk about
things which actually exist, such as the various existing projects that
have been written to specifically work to address all the requirements
which are understood today, not say "well, we can just write a simple
perl script to do it" because it's not actually that simple.

Providing yet another half solution would be doubling-down on the failed
approach to document a "simple" solution and would be a disservice to
our users.

Thanks,

Stephen


Re: Duplicate history file?

From: Stephen Frost
Greetings,

On Tue, Jun 15, 2021 at 21:11 Julien Rouhaud <rjuju123@gmail.com> wrote:
> On Tue, Jun 15, 2021 at 02:28:04PM -0400, Stephen Frost wrote:
> >
> > * Julien Rouhaud (rjuju123@gmail.com) wrote:
> > > On Tue, Jun 15, 2021 at 11:33:10AM -0400, Stephen Frost wrote:
> > >
> > > The fact that this is such a complex problem is the very reason why we should
> > > spend a lot of energy documenting the various requirements.  Otherwise, how
> > > could anyone implement a valid program for that and how could anyone validate
> > > that a solution claiming to do its job actually does its job?
> >
> > Reading the code.
>
> Oh, if it's as simple as that then surely documenting the various requirements
> won't be an issue.

As I suggested previously- this is similar to the hooks that we provide. We don’t extensively document them because if you’re writing an extension which uses a hook, you’re going to be (or should be..) reading the code too.

Consider that, really, an archive command should refuse to allow archiving of WAL on a timeline which doesn’t have a corresponding history file in the archive for that timeline (excluding timeline 1). Also, a backup tool should compare the result of pg_start_backup to what’s in the control file, using a fresh read, after pg_start_backup returns, to make sure that the storage is sane and not, say, caching pages independently (such as might happen with a separate NFS mount..).  Oh, and if a replica is involved, a check should be done to see if the replica has changed timelines, and an appropriate message thrown if that happens, complaining that the backup was aborted due to the promotion of the replica…

To be clear- these aren’t checks that pgbackrest has today and I’m not trying to make it out as if pgbackrest is the only solution and the only tool that “does everything and is correct” because we aren’t there yet and I’m not sure we ever will be “all correct” or “done”.

These, however, are ones we have planned to add because of things we’ve seen and thought of, most of them in just the past few months.

Thanks,

Stephen

Re: Duplicate history file?

From: Kyotaro Horiguchi
Thanks for the opinions.

At Tue, 15 Jun 2021 11:33:10 -0400, Stephen Frost <sfrost@snowman.net> wrote in 
> Greetings,
> 
> * Kyotaro Horiguchi (horikyota.ntt@gmail.com) wrote:
> > At Fri, 11 Jun 2021 16:08:33 +0900, Michael Paquier <michael@paquier.xyz> wrote in 
> > > On Fri, Jun 11, 2021 at 03:32:28PM +0900, Kyotaro Horiguchi wrote:
> > > > I think cp can be an example as far as we explain the limitations. (On
> > > > the other hand "test !-f" cannot since it actually prevents server
> > > > from working correctly.)
> > > 
> > > Disagreed.  I think that we should not try to change this area until
> > > we can document a reliable solution, and a simple "cp" is not that.
> > 
> > Isn't removing cp from the documentation a change in this area? I
> > basically agree to not to change anything but the current example
> > "test ! -f <fn> && cp .." and relevant description has been known to
> > be problematic in a certain situation.
> 
> [...]
> 
> > - Write the full (known) requirements and use a pseudo tool-name in
> >  the example?
> 
> I'm generally in favor of just using a pseudo tool-name and then perhaps
> providing a link to a new place on .Org where people can ask to have
> their PG backup solution listed, or something along those lines.

Looks fine.

> >  - provide a minimal implement of the command?
> 
> Having been down this road for a rather long time, I can't accept this
> as a serious suggestion.  No, not even with Perl.  Been there, done
> that, not going back.
>
> >  - recommend some external tools (that we can guarantee that they
> >    comform the requriements)?
> 
> The requirements are things which are learned over years and changes
> over time.  Trying to document them and keep up with them would be a
> pretty serious project all on its own.  There are external projects who
> spend serious time and energy doing their best to provide the tooling
> needed here and we should be promoting those, not trying to pretend like
> this is a simple thing which anyone could write a short perl script to
> accomplish.

I agree that no simple solution could be really perfect.  The reason I
think a simple cp can be a candidate for the example might be based on
the assumption that anyone who is going to build a database system
ought to know their requirements, including the durability/reliability
of archives/backups and the limitations of the adopted
methods/technologies.  However, as Julien mentioned, if there's
actually a problem where relatively.. ahem, ill-advised users (sorry in
advance if that's rude) use 'cp' only because it is shown in the
example, without a thought, and inadvertently lose archives, it might
be better that we don't suggest a concrete command for
archive_command.

> >  - not recommend any tools?
> 
> This is the approach that has been tried and it's, objectively, failed
> miserably.  Our users are ending up with invalid and unusable backups,
> corrupted WAL segments, inability to use PITR, and various other issues
> because we've been trying to pretend that this isn't a hard problem.  We
> really need to stop that and accept that it's hard and promote the tools
> which have been explicitly written to address that hard problem.

I can sympathize with that, but is there any difference from system
backups?  One person just copies $HOME to another directory on the
same drive and calls it a day. Another uses dd to make an image
backup. Others need durability, guaranteed integrity, or even
encryption, so they acquire or purchase a tool that conforms to their
requirements.  Or someone creates their own backup solution that meets
their requirements.

On the other hand, which OS distributors offer a long list of
requirements or a recipe for perfect backups? (Yeah, I'm saying this
based on nothing, just from prejudice.)

If the system is serious, those who don't know enough about backups
ought to consult professionals before building an inadequate backup
system and losing their data.

> > > Hmm.  A simple command that could be used as reference is for example
> > > "dd" that flushes the file by itself, or we could just revisit the
> > > discussions about having a pg_copy command, or we could document a
> > > small utility in perl that does the job.
> > 
> > I think we should do that if pg_copy comforms the mandatory
> > requirements but maybe it's in the future. Showing the minimal
> > implement in perl looks good.
> 
> Already tried doing it in perl.  No, it's not simple and it's also
> entirely vaporware today and implies that we're going to develop this
> tool, improve it in the future as we realize it needs to be improved,
> and maintain it as part of core forever.  If we want to actually adopt
> and pull in a backup tool to be part of core then we should talk about
> things which actually exist, such as the various existing projects that
> have been written to specifically work to address all the requirements
> which are understood today, not say "well, we can just write a simple
> perl script to do it" because it's not actually that simple.
> 
> Providing yet another half solution would be doubling-down on the failed
> approach to document a "simple" solution and would be a disservice to
> our users.

Ok, if we follow the direction that we are responsible for ensuring
that every user has reliable backups, I can't come up with a proper
description of that.

We could list several "requirements" like "do sync after copy", "take
a checksum of all files and then check it periodically", or other
things, but the more important thing to list here, I think, is "how we
run the archive_command".

Doesn't the following work for now?

(No example)

- "%f is replaced by ... %p is .., %r is ... in archive_command" (see
  the expansion example after this list)

- We call the archive_command for every WAL segment that is finished.

- We may call the archive_command for the same file more than once.

- We may call the archive_command for different files with the same
  name. In this case the server is working incorrectly and needs a
  check. Don't overwrite the file with the new content.

- We don't offer any durability or integrity guarantees for the
  archived files. That is all up to you.  You can use some existing
  solutions for archiving. See the following links.
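
For instance, with archive_command = 'cp %p /mnt/server/archivedir/%f'
(the documentation's placeholder directory), completing the segment
00000001000000A900000065 makes the server run:

  cp pg_wal/00000001000000A900000065 /mnt/server/archivedir/00000001000000A900000065

since %p expands to the path of the file to archive, relative to the
data directory, and %f to its file name alone.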

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Duplicate history file?

From: Kyotaro Horiguchi
At Wed, 16 Jun 2021 12:04:03 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> Ok, if we follow the direction that we are responsible for ensuring
> that every user has reliable backups, I don't come up with proper
> description about that.
> 
> We could list several "requirement" like "do sync after copy", "take a
> checksum for all files then check it periodically" or other things but
> what is more important things to list here, I think, is "how we run
> the archive_command".
> 
> Doesn't the following work for now?
> 
> (No example)
> 
> - "%f is replace by ... %p is .., %r is ... in archive_command"
> 
> - We call the archive_command for every wal segment which is finished.
> 
> - We may call the archive_command for the same file more than once.
> 
> - We may call the archive_command for different files with the same
>   name. In this case server is working incorrectly and need a
>   check. Don't overwrite with the new content.
> 
> - We don't offer any durability or integrity on the archived
>   files. All of them is up to you.  You can use some existing
>   solutions for archiving. See the following links.

Of course, there should also be some description of error handling to
go along with that.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Duplicate history file?

From: Kyotaro Horiguchi
At Wed, 16 Jun 2021 11:20:55 +0800, Julien Rouhaud <rjuju123@gmail.com> wrote in 
> On Tue, Jun 15, 2021 at 11:00:57PM -0400, Stephen Frost wrote:
> > 
> > As I suggested previously- this is similar to the hooks that we provide. We
> > don’t extensively document them because if you’re writing an extension
> > which uses a hook, you’re going to be (or should be..) reading the code too.
> 
> I disagree, hooks allows developers to implement some new or additional
> behavior which by definition can't be documented.  And it's also relying on the
> C api which by definition allows you to do anything with the server.  There are
> also of course some requirements but they're quite obvious (like a planner_hook
> should return a valid plan and such).
> 
> On the other hand the archive_command is there to do only one clear thing:
> safely backup a WAL file.  And I don't think that what makes that backup "safe"
> is open to discussion.  Sure, you can chose to ignore some of it if you think
> you can afford to do it, but it doesn't change the fact that it's still a
> requirement which should be documented.

I agree with Julien; however, I also want to discuss what to do for
14 now.  If we decide not to touch the documentation for that version,
that discussion would end.  What do you think about that?  I think
it's impossible to write the full documentation of the requirements
*for 14*.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: Duplicate history file?

From: Stephen Frost
Greetings,

On Tue, Jun 15, 2021 at 23:21 Julien Rouhaud <rjuju123@gmail.com> wrote:
> On Tue, Jun 15, 2021 at 11:00:57PM -0400, Stephen Frost wrote:
> >
> > As I suggested previously- this is similar to the hooks that we provide. We
> > don’t extensively document them because if you’re writing an extension
> > which uses a hook, you’re going to be (or should be..) reading the code too.
>
> I disagree, hooks allows developers to implement some new or additional
> behavior which by definition can't be documented.  And it's also relying on the
> C api which by definition allows you to do anything with the server.  There are
> also of course some requirements but they're quite obvious (like a planner_hook
> should return a valid plan and such).

The archive command is technically invoked using the shell, but the interpretation of the exit code, for example, is only discussed in the C code, and it’s far from the only consideration that someone developing an archive command needs to understand.

> On the other hand the archive_command is there to do only one clear thing:
> safely backup a WAL file.  And I don't think that what makes that backup "safe"
> is open to discussion.  Sure, you can chose to ignore some of it if you think
> you can afford to do it, but it doesn't change the fact that it's still a
> requirement which should be documented.

The notion that an archive command can be distanced from backups is really not reasonable in my opinion.

> > Consider that, really, an archive command should refuse to allow archiving
> > of WAL on a timeline which doesn’t have a corresponding history file in the
> > archive for that timeline (excluding timeline 1).
>
> Yes, that's a clear requirement that should be documented.

Is it a clear requirement that pgbackrest and every other organization that has developed an archive command has missed? Are you able to point to a tool which has such a check today?

This is not a trivial problem, any more than PG’s use of fsync is trivial: we clearly should have understood how Linux and fsync work decades ago, made sure to always crash on any fsync failure, and not believed that a later fsync would return a failure if an earlier one did and the problem didn’t resolve itself properly.

> > Also, a backup tool
> > should compare the result of pg_start_backup to what’s in the control file,
> > using a fresh read, after start backup returns to make sure that the
> > storage is sane and not, say, cache’ing pages independently (such as might
> > happen with a separate NFS mount..).  Oh, and if a replica is involved, a
> > check should be done to see if the replica has changed timelines and an
> > appropriate message thrown if that happens complaining that the backup was
> > aborted due to the promotion of the replica…
>
> I agree, but unless I'm missing something it's unrelated to what an
> archive_command should be in charge of?

I’m certainly not moved by this argument as it seems to be willfully missing the point.  Further, if we are going to claim that we must document archive command to such level then surely we need to also document all the requirements for pg_start_backup and pg_stop_backup too, so this strikes me as entirely relevant.

Thanks,

Stephen

Re: Duplicate history file?

From: Kyotaro Horiguchi
At Wed, 16 Jun 2021 12:36:19 +0800, Julien Rouhaud <rjuju123@gmail.com> wrote in 
> On Wed, Jun 16, 2021 at 01:10:16PM +0900, Kyotaro Horiguchi wrote:
> > 
> > I agree to Julien, however, I want to discuss (also) on what to do for
> > 14 now.  If we decide not to touch the document for the version. that
> > discussion would end.  What do you think about that?  I think it's
> > impossible to write the full-document for the requirements *for 14*.
> 
> My personal take on that is that this is a bug in the documentation and the
> list of requirements should be backported.  Hopefully this can be done before
> v14 is released, but if not I don't think that it should be a blocker to make
> progress.

I understand: we discuss how to fix the documentation, and we don't
change it for version 14 if no conclusion can be reached by the
deadline.

Thanks!

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Duplicate history file?

From: Julien Rouhaud
On Wed, Jun 16, 2021 at 9:19 PM Stephen Frost <sfrost@snowman.net> wrote:
>
> This is exactly it.  I don't agree that we can, or should, treat every
> sensible thing that we realize about what the archive command or the
> backup tool should be doing as some bug in our documentation that has to
> be backpatched.
> If you're serious about continuing on this path, it strikes me that the
> next step would be to go review all of the above mentioned tools,
> identify all of the things that they do and the checks that they have,
> and then craft a documentation patch to add all of those- for both
> archive command and pg_start/stop_backup.

1) I'm not saying that every single check that every single tool
currently does is a requirement for a safe command and/or should be
documented
2) I don't think that there are thousands and thousands of
requirements, as you seem to imply
3) I still don't understand why you think that having partial
knowledge of what makes an archive_command safe scattered across the
source code of many third-party tools is a good thing

But what better alternative are you suggesting?  Say that no one
knows what an archive_command should do, and let people put a link to
their backup solution in the hope that they will eventually converge
on a safe solution that no one will be able to validate?



Re: Duplicate history file?

From: Stephen Frost
Greetings,

* Julien Rouhaud (rjuju123@gmail.com) wrote:
> On Wed, Jun 16, 2021 at 9:19 PM Stephen Frost <sfrost@snowman.net> wrote:
> > This is exactly it.  I don't agree that we can, or should, treat every
> > sensible thing that we realize about what the archive command or the
> > backup tool should be doing as some bug in our documentation that has to
> > be backpatched.
> > If you're serious about continuing on this path, it strikes me that the
> > next step would be to go review all of the above mentioned tools,
> > identify all of the things that they do and the checks that they have,
> > and then craft a documentation patch to add all of those- for both
> > archive command and pg_start/stop_backup.
>
> 1) I'm not saying that every single check that every single tools
> currently does is a requirement for a safe command and/or should be
> documented

That's true- you're agreeing that there are checks beyond those that
are currently implemented which should also be done.  That's exactly
what I was responding to.

> 2) I don't think that there are thousands and thousands of
> requirements, as you seem to imply

You've not reviewed any of the tools which have been written, so I'm
not sure what you're basing your belief on.  I've reviewed the various
tools and have been rather involved in the development of one of them.
I do think there are lots of requirements, and it's not some static
list which could just be written down once and then never touched or
thought about again.

Consider pg_dump- do we document everything that a logical export tool
should do?  That someone who wants to implement pg_dump should make sure
that the tool runs around and takes out a share lock on all of the
tables to be exported?  No, of course we don't, because we provide a
tool to do that and if people actually want to understand how it works,
we point them to the source code.  Had we started out with a backup tool
in core, the same would be true for that.  Instead, we didn't, and such
tools were developed outside of core (and frankly have largely had to
play catch-up to try and figure out all the things that are needed to do
it well and likely always will be since they aren't part of core).

> 3) I still don't understand why you think that having a partial
> knowledge of what makes an archive_command safe scattered in the
> source code of many third party tools is a good thing

Having partial knowledge of what makes an archive_command safe in the
official documentation is somehow better..?  What would that lead to-
other people seriously developing a backup solution for PG?  No, I
seriously doubt that, as those who are seriously developing such
solutions couldn't trust only what we've got documented anyway but
would have to go looking through the source code and would need to
develop a deep understanding of how WAL works, what happens when PG is
started up to perform PITR but with archiving disabled and how that
impacts what ends up being archived (hint: the server will switch
timelines but won't actually archive a history file because archiving is
disabled- a restart which then enables archiving will then start pushing
WAL on a timeline where there's no history file; do that twice from an
older backup and now you've got the same WAL files trying to be pushed
into the repo which are actually on materially different timelines even
though the same timeline has been chosen multiple times...), how
timelines work, and all the rest.

We already have partial documentation about what should go into
developing an archive_command, and what it's led to is people ignoring
that and instead copying the example that's explicitly called out as
insufficient.  That's the actual problem that needs to be addressed
here.

Let's rip out the example and instead promote tools which have been
written to specifically address this and which are actively maintained.
If someone actually comes asking about how to develop their own backup
solution for PG, we should suggest that they review the PG code related
to WAL, timelines, how promotion works, etc, and probably point them at
the OSS projects which already work to tackle this issue, because to
develop a proper tool you need to actually understand all of that.

> But what better alternative are you suggesting?  Say that no ones
> knows what an archive_command should do and let people put a link to
> their backup solution in the hope that they will eventually converge
> to a safe solution that no one will be able to validate?

There are people who do know, today, what an archive command should do
and we should be promoting the tools that they've developed which do, in
fact, implement those checks already, at least the ones we've thought of
so far.

Instead, the suggestion being made here is to write a detailed design
document for how to develop a backup tool (and, no, I don't agree that
we can "just" focus on archive command) for PG and then to maintain it
and update it and backpatch every change to it that we think of.

Thanks,

Stephen
